DoubleQLearning

class DoubleQLearning(**kwargs)

Bases: pandemonium.implementations.q_learning.OnlineQLearning

Implements an online version of Double Q-learning.

Methods Summary

delta(self, t)

Specifies the update rule for the approximate value function (avf).

Methods Documentation

delta(self, t: pandemonium.experience.experience.Transition) → Tuple[Union[torch.Tensor, NoneType], dict]

Specifies the update rule for the approximate value function (avf).

Depending on whether the algorithm is online or offline, the demon learns from either a single Transition or a Trajectory of experiences. As an online algorithm, DoubleQLearning receives a single Transition.
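
The following is a minimal sketch of what a Double Q-learning update on a single transition can look like. It is not pandemonium's implementation. The `Transition` tuple and its fields, the function `double_q_delta`, the two table arguments, and the squared-error loss are all assumptions made for illustration, and plain Python floats stand in for `torch.Tensor`. The sketch only mirrors the shape of the documented return value: a loss (or `None`) paired with a dict of extra information. The defining idea of Double Q-learning is that one estimator selects the greedy next action while a second estimator supplies its value.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class Transition(NamedTuple):
    """Hypothetical stand-in for pandemonium's Transition; fields are assumed."""
    s0: int      # state index before the action
    a: int       # action taken in s0
    r: float     # reward received
    s1: int      # state index after the action
    done: bool   # True if s1 is terminal


def double_q_delta(
    t: Transition,
    q_select: List[List[float]],
    q_eval: List[List[float]],
    gamma: float = 0.9,
) -> Tuple[Optional[float], Dict[str, float]]:
    """Return (loss, info) for one transition, Double Q-learning style.

    q_select chooses the greedy next action; q_eval provides its value.
    """
    if t.done:
        # No bootstrapping from a terminal state.
        target = t.r
    else:
        next_values = q_select[t.s1]
        # Action selection uses the first estimator...
        a_star = max(range(len(next_values)), key=next_values.__getitem__)
        # ...while action evaluation uses the second one.
        target = t.r + gamma * q_eval[t.s1][a_star]
    td_error = target - q_select[t.s0][t.a]
    loss = 0.5 * td_error ** 2
    return loss, {"td_error": td_error, "target": target}


# Usage with two small, made-up value tables (2 states x 2 actions).
q_a = [[0.0, 1.0], [2.0, 5.0]]
q_b = [[0.5, 0.5], [3.0, 1.0]]
step = Transition(s0=0, a=1, r=1.0, s1=1, done=False)
loss, info = double_q_delta(step, q_a, q_b, gamma=0.5)
```

In this example `q_a` selects action 1 in the next state, but the target uses `q_b`'s value for that action (1.0) rather than `q_a`'s own maximum (5.0). Decoupling selection from evaluation in this way is what Double Q-learning relies on to reduce the overestimation that a single max operator tends to produce.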