ICM

class ICM(feature: callable, behavior_policy: pandemonium.policies.policy.Policy, beta: float)

Bases: pandemonium.demons.demon.ParametricDemon

Intrinsic Curiosity Module

References

“Curiosity-driven Exploration by Self-supervised Prediction”

Pathak et al. 2017 https://arxiv.org/pdf/1705.05363.pdf
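For orientation, the quantities defined in the cited paper can be written as below. This is a summary of the paper's formulation, not of this class's internals: reading the constructor's `feature` argument as the encoder φ and `beta` as the paper's loss weight β is an assumption based on the names alone. η scales the intrinsic reward, λ weights the policy objective, and L_I is the inverse-model loss.

```latex
% Forward model: predict the next feature vector from the current one and the action
\hat{\phi}(s_{t+1}) = f\big(\phi(s_t), a_t; \theta_F\big)

% Forward loss and the intrinsic (curiosity) reward derived from it
L_F = \tfrac{1}{2} \left\lVert \hat{\phi}(s_{t+1}) - \phi(s_{t+1}) \right\rVert_2^2
\qquad
r^i_t = \tfrac{\eta}{2} \left\lVert \hat{\phi}(s_{t+1}) - \phi(s_{t+1}) \right\rVert_2^2

% Joint objective: beta trades the inverse-model loss L_I against the forward loss L_F
\min_{\theta_P, \theta_I, \theta_F}
\Big[ -\lambda \, \mathbb{E}_{\pi(s_t; \theta_P)}\Big[ \sum_t r_t \Big] + (1 - \beta) L_I + \beta L_F \Big]
```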

Methods Summary

delta(self, experience)

Specifies the update rule for the approximate value function (avf).

Methods Documentation

delta(self, experience: Union[ForwardRef('Transition'), ForwardRef('Trajectory')]) → Tuple[Union[torch.Tensor, NoneType], dict]

Specifies the update rule for the approximate value function (avf).

Depending on whether the algorithm is online or offline, the demon learns from either a single Transition or a Trajectory of experiences.
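The return contract of `delta` (an optional loss plus a dictionary of extra information) can be illustrated with a minimal sketch. Everything here is hypothetical: `forward_error` and `icm_delta` are illustrative names, plain floats stand in for `torch.Tensor`, and the β-weighted combination follows the cited paper rather than the library's actual code. Returning `None` for empty input is likewise an assumption about why the loss is optional.

```python
from typing import Dict, List, Optional, Tuple


def forward_error(predicted: List[float], target: List[float]) -> float:
    """Half the squared L2 distance between predicted and actual next features."""
    return 0.5 * sum((p - t) ** 2 for p, t in zip(predicted, target))


def icm_delta(
    forward_errors: List[float],
    inverse_losses: List[float],
    beta: float,
) -> Tuple[Optional[float], Dict[str, float]]:
    """Hypothetical stand-in for `delta`: returns a (loss, info) pair.

    One entry per transition: a list of length 1 mimics learning from a
    single Transition, a longer list mimics a Trajectory.
    """
    if not forward_errors:
        # Assumption: nothing to learn from, so no loss is produced.
        return None, {}
    n = len(forward_errors)
    forward_loss = sum(forward_errors) / n
    inverse_loss = sum(inverse_losses) / n
    # Weighting from Pathak et al. 2017: (1 - beta) * L_I + beta * L_F.
    loss = (1.0 - beta) * inverse_loss + beta * forward_loss
    return loss, {"forward_loss": forward_loss, "inverse_loss": inverse_loss}


# Single transition (online case).
loss, info = icm_delta([forward_error([1.0, 2.0], [0.0, 0.0])], [4.0], beta=0.25)
```

The caller receives the same `(loss, info)` shape in both the single-transition and trajectory cases, so the training loop does not need to branch on the kind of experience.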