OfflineTDControl

class OfflineTDControl(**kwargs)

Bases: pandemonium.demons.control.TDControl, pandemonium.demons.offline_td.OfflineTD

Offline TD for control tasks.

Methods Summary

delta(self, trajectory)

Specifies the update rule for the approximate value function (avf).

Methods Documentation

delta(self, trajectory: pandemonium.experience.experience.Trajectory) → Tuple[Union[torch.Tensor, NoneType], dict]

Specifies the update rule for the approximate value function (avf).

Depending on whether the algorithm is online or offline, the demon learns from either a single Transition or a Trajectory of experiences, respectively.
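To illustrate what an offline update over a whole trajectory might look like, here is a minimal tabular sketch. It does not use the pandemonium API: the `Transition` tuple, the `QTable` alias, and the `td_error` and `offline_delta` helpers are all hypothetical names, and the Q-learning-style target is an assumption chosen only for illustration. The return shape (an optional loss plus an info dict) mirrors the signature of `delta` above.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class Transition(NamedTuple):
    """Hypothetical single step of experience (not the pandemonium class)."""
    s: int       # state index
    a: int       # action index
    r: float     # reward received
    s1: int      # next state index
    done: bool   # True if the episode ended at this step


# Hypothetical tabular value function: state -> list of action values.
QTable = Dict[int, List[float]]


def td_error(t: Transition, q: QTable, gamma: float) -> float:
    """One-step TD error with a greedy (Q-learning-style) bootstrap target."""
    bootstrap = 0.0 if t.done else gamma * max(q[t.s1])
    return t.r + bootstrap - q[t.s][t.a]


def offline_delta(
    trajectory: List[Transition], q: QTable, gamma: float = 0.9
) -> Tuple[Optional[float], dict]:
    """Mean squared TD error over a whole trajectory, plus an info dict.

    Returns (None, {}) for an empty trajectory; this is an assumption
    about how a missing loss could be signalled.
    """
    if not trajectory:
        return None, {}
    errors = [td_error(t, q, gamma) for t in trajectory]
    loss = sum(e * e for e in errors) / len(errors)
    return loss, {"td_errors": errors}


q = {0: [0.0, 1.0], 1: [2.0, 0.0]}
trajectory = [
    Transition(s=0, a=1, r=1.0, s1=1, done=False),
    Transition(s=1, a=0, r=0.0, s1=0, done=True),
]
loss, info = offline_delta(trajectory, q, gamma=0.5)
print(loss, info["td_errors"])
```

An online variant would call `td_error` on each transition as it arrives and update immediately, whereas the offline version above waits until the full trajectory is available and aggregates the errors into one loss.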