PER

class PER(size: int, batch_size: int, alpha: float = 0.6, beta: ray.rllib.utils.schedules.schedule.Schedule = ConstantSchedule(0.4), epsilon: float = 1e-06)

Bases: pandemonium.experience.buffers.ER

Prioritized Experience Replay buffer.

References

  • Prioritized Experience Replay (Schaul et al., 2015)

  • Ray RLlib
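As a rough illustration of how alpha, beta and epsilon are used in the proportional variant described by Schaul et al., the sketch below computes sampling probabilities and importance-sampling weights from a list of priorities. The helper names (sampling_probabilities, importance_weights, priority_from_td_error) are hypothetical and are not part of this class. Whether PER applies the parameters in exactly this way is an assumption. The sketch also treats beta as a plain float, whereas the constructor takes a Schedule.

```python
def sampling_probabilities(priorities, alpha=0.6):
    """P(i) = p_i^alpha / sum_k p_k^alpha (alpha = 0 gives uniform sampling)."""
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]


def importance_weights(probs, beta=0.4):
    """w_i = (N * P(i))^(-beta), normalised by the largest weight."""
    n = len(probs)
    raw = [(n * p) ** (-beta) for p in probs]
    max_w = max(raw)
    return [w / max_w for w in raw]


def priority_from_td_error(td_error, epsilon=1e-6):
    """Hypothetical helper: epsilon keeps zero-error transitions sampleable."""
    return abs(td_error) + epsilon


# Toy TD errors for four stored transitions (made-up numbers).
priorities = [priority_from_td_error(e) for e in (0.5, -2.0, 0.0, 1.0)]
probs = sampling_probabilities(priorities)
weights = importance_weights(probs)
```

In this sketch the transition with the largest absolute TD error is sampled most often, and the rarest transition receives the largest (normalised) weight.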

Methods Summary

add(self, transition, weight)

sample(self, batch_size, contiguous)

Randomly draws a batch of transitions.

update_priorities(self, idxes, priorities)

Update priorities of sampled transitions.

Methods Documentation

add(self, transition: pandemonium.experience.experience.Transition, weight: float = None)

sample(self, batch_size: int = None, contiguous: bool = True) → List[pandemonium.experience.experience.Transition]

Randomly draws a batch of transitions.

Parameters
  • batch_size – Number of transitions to sample from the buffer.

  • contiguous – Whether transitions should be contiguous or not. This is particularly useful when using \(n\)-step methods.
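The difference between contiguous and non-contiguous sampling can be sketched on indices alone. The function sample_indices below is a hypothetical helper, not the buffer's implementation. It assumes that "contiguous" means a run of consecutive buffer positions, and that non-contiguous draws are made without replacement. It also ignores priorities entirely.

```python
import random


def sample_indices(buffer_len, batch_size, contiguous=True, rng=random):
    """Pick batch_size positions out of buffer_len stored transitions."""
    if batch_size > buffer_len:
        raise ValueError("batch_size exceeds the number of stored transitions")
    if contiguous:
        # One random start, then consecutive positions, so the batch
        # forms an unbroken slice of the buffer.
        start = rng.randrange(buffer_len - batch_size + 1)
        return list(range(start, start + batch_size))
    # Independent positions, drawn without replacement (an assumption).
    return rng.sample(range(buffer_len), batch_size)
```

A contiguous batch keeps neighbouring transitions together, which is what an n-step return needs in order to sum rewards over consecutive steps.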

update_priorities(self, idxes, priorities)

Update priorities of sampled transitions.

Sets the priority of the transition at index idxes[i] to priorities[i].

Parameters
  • idxes (List[int]) – Indices of the sampled transitions.

  • priorities (List[float]) – Updated priorities, where priorities[i] is the new priority of the transition at idxes[i].
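The element-wise update described above can be sketched with a toy priority store. ToyPriorities is a hypothetical stand-in, not the actual class. The validation checks and the use of |TD error| + epsilon as the new priority are assumptions made for illustration.

```python
class ToyPriorities:
    """Hypothetical stand-in for the buffer's priority storage."""

    def __init__(self, size, initial=1.0):
        self.values = [initial] * size

    def update_priorities(self, idxes, priorities):
        """Set the priority at idxes[i] to priorities[i]."""
        if len(idxes) != len(priorities):
            raise ValueError("idxes and priorities must have the same length")
        for idx, priority in zip(idxes, priorities):
            if not 0 <= idx < len(self.values):
                raise IndexError(f"index {idx} is outside the buffer")
            if priority <= 0:
                raise ValueError("priorities must be positive")
            self.values[idx] = priority


store = ToyPriorities(size=5)
td_errors = [0.3, -1.2]   # made-up errors for the transitions at 1 and 3
epsilon = 1e-6            # keeps every new priority strictly positive
store.update_priorities([1, 3], [abs(e) + epsilon for e in td_errors])
```

Only the positions named in idxes change. Every other priority keeps its previous value.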