The Epsilon Greedy Algorithm - a Performance Review

(Volume 6, Issue 9, September 2020) OPEN ACCESS

Author(s):

Riti Agarwal

Keywords:

exploration, exploitation, regret, reward function, local maxima.

Abstract:

Multi-Armed Bandit (MAB) algorithms are a class of reinforcement learning algorithms. A multi-armed bandit implementation has an agent (learner) that chooses between k different uncertain actions and receives a reward based on the chosen action. This paper focuses mainly on the Epsilon Greedy Algorithm in comparison to Thompson Sampling and UCB-1 (Upper Confidence Bound). It discusses the benefits of using bandit algorithms over A/B testing and evaluates the effectiveness of these three main solutions. It shows experimentally that the Epsilon Greedy Algorithm is best suited to cases where the experimentation period is longer than that of A/B testing and the goal is to exploit the best-performing variant. It also identifies when the algorithm does not provide statistically correct results: when the sample size on each path of the experiment is very small.
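The agent loop described in the abstract can be illustrated with a minimal sketch of epsilon-greedy. This is not the paper's experimental code: Bernoulli (0/1) rewards, the function name `epsilon_greedy`, and the parameters `true_means`, `epsilon`, `steps`, and `seed` are all assumptions made for illustration.

```python
import random


def epsilon_greedy(true_means, epsilon=0.1, steps=10_000, seed=0):
    """Simulate epsilon-greedy on a k-armed Bernoulli bandit (illustrative assumption).

    true_means: hypothetical success probability of each arm, unknown to the agent.
    Returns per-arm pull counts, estimated values, and expected-best regret.
    """
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k      # number of times each arm was pulled
    values = [0.0] * k    # running mean reward observed for each arm
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(k)
        else:
            # Exploit: pick the arm with the highest estimated value
            # (ties resolve to the lowest index).
            arm = max(range(k), key=lambda a: values[a])

        # Draw a 0/1 reward from the chosen arm.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0

        # Incremental update of the running mean for that arm.
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]
        total_reward += reward

    # Regret relative to always playing the arm with the highest true mean.
    regret = steps * max(true_means) - total_reward
    return counts, values, regret


if __name__ == "__main__":
    counts, values, regret = epsilon_greedy([0.05, 0.10, 0.20])
    print("pulls per arm:", counts)
    print("estimated values:", [round(v, 3) for v in values])
    print("regret:", round(regret, 1))
```

With a fixed `epsilon`, a constant fraction of pulls is always spent exploring, so the per-arm estimates stay noisy when each arm receives only a few samples. That is consistent with the small-sample caveat stated in the abstract.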


International Journal of New Technology and Research

Impact Factor 3.953
