OfflineTD
class OfflineTD(criterion=<function smooth_l1_loss>, **kwargs)
Bases: pandemonium.demons.demon.Demon
Base class for forward-view temporal-difference (TD) methods.
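This class serves as the base for most of the deep reinforcement learning (DRL) algorithms, because forward-view updates operate on whole trajectories and therefore fit naturally with batching.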
Methods Summary
delta(self, trajectory)
    Updates the value of a state using information in the trajectory.
target(self, trajectory, v)
    Computes discounted returns for each step in the trajectory.
Methods Documentation
delta(self, trajectory: pandemonium.experience.experience.Trajectory) → Tuple[Union[torch.Tensor, NoneType], dict]
    Updates the value of a state using information in the trajectory.
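To illustrate what a default criterion such as smooth_l1_loss measures, here is a minimal sketch in plain Python. It is not taken from pandemonium: the helper name smooth_l1, the beta parameter, and the sample numbers are assumptions for illustration only, and how delta actually combines value predictions with targets is not specified on this page.

```python
from typing import List


def smooth_l1(predictions: List[float], targets: List[float], beta: float = 1.0) -> float:
    """Mean smooth L1 loss (hypothetical helper, plain-Python sketch).

    Quadratic for small errors (|error| < beta), linear for large ones,
    which makes the update less sensitive to outlier targets.
    """
    losses = []
    for p, t in zip(predictions, targets):
        diff = abs(p - t)
        if diff < beta:
            losses.append(0.5 * diff * diff / beta)
        else:
            losses.append(diff - 0.5 * beta)
    return sum(losses) / len(losses)


# Hypothetical value predictions and targets for a 3-step trajectory.
values = [0.5, 2.0, 1.0]
targets = [1.0, 0.0, 1.0]
loss = smooth_l1(values, targets)
print(loss)
```

In this sketch the small error (0.5) is penalised quadratically and the large error (2.0) linearly, which is the usual motivation for choosing this kind of criterion over a plain squared error.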
target(self, trajectory: pandemonium.experience.experience.Trajectory, v: torch.Tensor)
    Computes discounted returns for each step in the trajectory.
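As a rough illustration of what computing discounted returns for each step involves, the sketch below walks a trajectory backwards in plain Python. It is not pandemonium's implementation: the function name discounted_returns, the gamma, bootstrap, and dones parameters, and the idea that a value estimate is used to bootstrap the final step are all assumptions made for this example.

```python
from typing import List, Optional


def discounted_returns(
    rewards: List[float],
    gamma: float,
    bootstrap: float = 0.0,
    dones: Optional[List[bool]] = None,
) -> List[float]:
    """Return G_t = r_t + gamma * G_{t+1} for every step t (hypothetical helper).

    `bootstrap` stands in for a value estimate of the state that follows
    the last step; `dones[t]` marks that the episode ended after step t,
    so nothing beyond that step is accumulated.
    """
    if dones is None:
        dones = [False] * len(rewards)
    returns = [0.0] * len(rewards)
    g = bootstrap
    # Iterate from the last step to the first so each return reuses the next one.
    for t in reversed(range(len(rewards))):
        if dones[t]:
            g = 0.0
        g = rewards[t] + gamma * g
        returns[t] = g
    return returns


returns = discounted_returns([1.0, 0.0, 2.0], gamma=0.5, bootstrap=4.0)
print(returns)  # [2.0, 2.0, 4.0]
```

Working backwards keeps the computation linear in the trajectory length, since each return is obtained from the one after it with a single multiply-add.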