[[epsilon_greedy_policy]]

📓 garden/KGBicheno/Artificial Intelligence/Introduction to AI/Week 3 - Introduction/Definitions/Epsilon_Greedy_Policy.md by @KGBicheno

epsilon greedy policy

Go back to the [[AI Glossary]]

#rl

In reinforcement learning, a policy that chooses a random action with probability epsilon and otherwise chooses the greedy action, that is, the action with the highest estimated value. For example, if epsilon is 0.9, the policy acts randomly 90% of the time and greedily 10% of the time.
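A minimal sketch of the action choice described above, assuming the agent keeps a list of estimated action values. The names `epsilon_greedy_action`, `q_values`, and `rng` are hypothetical and chosen for illustration only; uniform sampling over all actions is an assumption about what "random" means here.

```python
import random


def epsilon_greedy_action(q_values, epsilon, rng=random):
    """Return an action index given estimated action values (hypothetical helper)."""
    if rng.random() < epsilon:
        # Explore: pick uniformly at random among all actions (assumed).
        return rng.randrange(len(q_values))
    # Exploit: pick the action with the highest estimated value.
    # Ties resolve to the lowest index.
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

With `epsilon=0.0` this always returns the greedy action, and with `epsilon=1.0` it always samples at random. Note that a random draw can still land on the greedy action.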

Over successive episodes, the algorithm typically reduces epsilon's value to shift from following a random policy to following a greedy policy. By shifting the policy, the agent first explores the environment at random and then greedily exploits what that exploration revealed.
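The reduction of epsilon over episodes can be sketched as a simple schedule. The definition above does not specify one, so the exponential decay below, the function name `decayed_epsilon`, and the default values for `start`, `end`, and `decay` are all assumptions for illustration.

```python
def decayed_epsilon(episode, start=1.0, end=0.05, decay=0.99):
    """Exponentially decay epsilon per episode, never dropping below `end`.

    All parameter names and defaults are hypothetical.
    """
    return max(end, start * decay ** episode)
```

Keeping a small floor such as `end` is a common design choice so that the agent never stops exploring entirely; setting `end=0.0` would let the policy become fully greedy.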
