HierarchicalPolicy
class HierarchicalPolicy(option_space: pandemonium.utilities.spaces.OptionSpace)
Bases: pandemonium.policies.discrete.Discrete
A decision rule for discrete option spaces.
To produce an action \(a\), this policy first picks an option \(ω\) from the space of available options \(Ω\). To pick an option, it queries the initiation set \(I\) of each available option and picks the one with the highest score. It then uses the internal policy \(π\) of the chosen option to produce the action that the agent will take.
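The two-step selection described above can be illustrated with a minimal, self-contained sketch. It does not use the pandemonium API: the `Option` dataclass, the `select_option` and `act` helpers, and the toy scalar state are all hypothetical names introduced here for illustration. It also assumes that the initiation set returns a numeric score and that an option is re-selected on every step, ignoring termination.

```python
from dataclasses import dataclass
from typing import Callable, Dict

State = float   # toy scalar state, an assumption for this sketch
Action = int


@dataclass(frozen=True)
class Option:
    """Hypothetical stand-in for an option ω = (I, π); not the pandemonium class."""
    name: str
    initiation: Callable[[State], float]  # I(s): score for starting ω in state s
    policy: Callable[[State], Action]     # π(s): internal policy of ω


def select_option(options: Dict[str, Option], state: State) -> Option:
    """Return the option with the highest initiation score in `state`.

    On ties, `max` keeps the first option encountered.
    """
    return max(options.values(), key=lambda o: o.initiation(state))


def act(options: Dict[str, Option], state: State) -> Action:
    """Pick an option via its initiation score, then delegate to its policy."""
    option = select_option(options, state)
    return option.policy(state)


# Two toy options: one prefers negative states, the other non-negative ones.
options = {
    "left": Option("left", lambda s: 1.0 if s < 0 else 0.1, lambda s: 0),
    "right": Option("right", lambda s: 1.0 if s >= 0 else 0.1, lambda s: 1),
}

action_for_negative = act(options, -2.0)
action_for_positive = act(options, 3.0)
```

In this sketch the internal policies are deterministic functions. The documented `dist` method instead returns a distribution over actions, from which `act` samples.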
Todo
Currently starts with a random option from the space. It may be better to wait for the initial state and then pick the option with the highest interest.
Todo
Explore the interplay between the initiation of one option and the termination of another.
Todo
It might be better to move the OptionSpace into this file, since it is discrete at the moment.
Methods Summary
act(self, state, vf): Samples an action from a distribution over actions.
dist(self, *args, **kwargs): Produces a distribution over actions.
Methods Documentation
act(self, state, vf)
Samples an action from a distribution over actions.
dist(self, *args, **kwargs) → torch.distributions.distribution.Distribution
Produces a distribution over actions.