Egreedy

class Egreedy(epsilon: Union[pandemonium.utilities.schedules.ConstantSchedule, pandemonium.utilities.schedules.LinearSchedule], *args, **kwargs)

Bases: pandemonium.policies.discrete.Discrete

\(\epsilon\)-greedy policy for discrete action spaces.

Picks the greedy action with respect to Q with probability 1 - \(\epsilon\), and explores by picking a non-greedy action with probability \(\epsilon\).
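
As a worked form of the statement above, one pmf consistent with it is sketched here. How the exploration mass \(\epsilon\) is divided among the non-greedy actions is an assumption (an even split), not something this page specifies; \(\mathcal{A}\) denotes the discrete action set and \(a^{*}\) the greedy action, both introduced here for illustration only.

\[
\pi(a \mid s) =
\begin{cases}
1 - \epsilon & \text{if } a = a^{*} = \arg\max_{a'} Q(s, a') \\
\dfrac{\epsilon}{|\mathcal{A}| - 1} & \text{otherwise}
\end{cases}
\]

Under this assumption the probabilities sum to \((1 - \epsilon) + (|\mathcal{A}| - 1) \cdot \epsilon / (|\mathcal{A}| - 1) = 1\).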

Attributes Summary

epsilon

Methods Summary

act(self, *args, **kwargs)

Samples an action from a distribution over actions

dist(self, features, q_fn)

Creates a categorical distribution with \(\epsilon\)-greedy pmf

Attributes Documentation

epsilon

Methods Documentation

act(self, *args, **kwargs)

Samples an action from a distribution over actions

dist(self, features, q_fn) → torch.distributions.categorical.Categorical

Creates a categorical distribution with \(\epsilon\)-greedy pmf

Assumes that Q-values are of shape (batch, actions, states)
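
To make the idea of an \(\epsilon\)-greedy pmf concrete, here is a minimal standard-library sketch. It is not the library's implementation. The helper names `egreedy_probs` and `sample_action` are hypothetical, the Q-values are assumed to be a plain 2-D (batch, actions) list rather than a tensor, and the exploration mass is assumed to be split evenly over the non-greedy actions.

```python
import random
from typing import List


def egreedy_probs(q_values: List[List[float]], epsilon: float) -> List[List[float]]:
    """Build one epsilon-greedy pmf per row of a (batch, actions) Q table.

    Hypothetical helper: the greedy action gets 1 - epsilon and the
    remaining epsilon is split evenly over the other actions (assumption).
    """
    pmfs = []
    for row in q_values:
        n = len(row)
        if n < 2:
            # With a single action there is nothing to explore.
            pmfs.append([1.0] * n)
            continue
        # Index of the largest Q-value; ties resolve to the first index.
        greedy = max(range(n), key=row.__getitem__)
        pmf = [epsilon / (n - 1)] * n
        pmf[greedy] = 1.0 - epsilon
        pmfs.append(pmf)
    return pmfs


def sample_action(pmf: List[float], rng: random.Random) -> int:
    """Draw one action index from a pmf (stands in for sampling a Categorical)."""
    return rng.choices(range(len(pmf)), weights=pmf, k=1)[0]


if __name__ == "__main__":
    q = [[1.0, 3.0, 2.0], [0.5, 0.1, 0.2]]
    probs = egreedy_probs(q, epsilon=0.1)
    rng = random.Random(0)
    actions = [sample_action(p, rng) for p in probs]
    print(probs, actions)
```

In a tensor setting, each row of probabilities would typically be handed to `torch.distributions.Categorical(probs=...)`, which matches the return type documented for `dist`.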