Agent

class Agent(feature_extractor, behavior_policy: pandemonium.policies.policy.Policy, horde: pandemonium.horde.Horde)

Bases: object

Methods Summary

interact(self, env, …)

Perform a number of steps in the environment.

learn(self, env, …)

Learn by interacting with the environment.

Methods Documentation

interact(self, env: Union[gym_minigrid.minigrid.MiniGridEnv, pandemonium.envs.dm_lab.dm_env.DeepmindLabEnv], s0: torch.Tensor, steps: int) → Tuple[List[ForwardRef('Transition')], torch.Tensor, dict]

Perform a number of steps in the environment.

Stops early if the episode is over.

Parameters
  • env – environment in which to take actions

  • s0 – starting state of the environment

  • steps – number of steps to interact for

Returns

  • A list of transitions, a torch.Tensor, and a dictionary of logs (matching the return type in the signature)

Todo: add parallel experience collection
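The behaviour described for interact (take up to a fixed number of steps, stop early when the episode ends, return the transitions together with a tensor and a dictionary of logs) can be illustrated with a minimal, dependency-free sketch. This is not the library's implementation: the Transition tuple, CountdownEnv, the standalone interact function, and the log keys are hypothetical stand-ins, plain integers replace torch.Tensor states, and the second return value is assumed here to be the last observed state.

```python
from typing import Callable, List, NamedTuple, Tuple


class Transition(NamedTuple):
    # Hypothetical stand-in for the library's Transition record.
    s0: int
    a: int
    r: float
    s1: int
    done: bool


class CountdownEnv:
    """Toy environment (an assumption): the episode ends after `length` steps."""

    def __init__(self, length: int):
        self.length = length
        self.t = 0

    def reset(self) -> int:
        self.t = 0
        return self.t

    def step(self, action: int) -> Tuple[int, float, bool, dict]:
        # Old Gym-style 4-tuple: (observation, reward, done, info).
        self.t += 1
        done = self.t >= self.length
        return self.t, 1.0, done, {}


def interact(env, policy: Callable[[int], int], s0: int, steps: int):
    """Collect up to `steps` transitions, stopping early when the episode ends."""
    transitions: List[Transition] = []
    logs = {"done": False, "steps": 0}  # hypothetical log keys
    s = s0
    for _ in range(steps):
        a = policy(s)
        s1, r, done, _info = env.step(a)
        transitions.append(Transition(s, a, r, s1, done))
        s = s1
        logs["steps"] += 1
        if done:
            # Episode is over: stop before exhausting the step budget.
            logs["done"] = True
            break
    return transitions, s, logs


env = CountdownEnv(length=3)
trajectory, last_state, logs = interact(
    env, policy=lambda s: 0, s0=env.reset(), steps=10
)
```

Although ten steps are requested, the toy episode terminates after three, so only three transitions are collected and the logs record the early stop.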

learn(self, env: Union[gym_minigrid.minigrid.MiniGridEnv, pandemonium.envs.dm_lab.dm_env.DeepmindLabEnv], episodes: int, horizon: int) → Iterator[dict]

Learn by interacting with the environment.
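The Iterator[dict] return type suggests that learn is a generator yielding logs as training progresses. The sketch below shows one plausible reading under stated assumptions: each episode is split into rollouts of at most horizon steps, and one log dictionary is yielded per episode. The names CountdownEnv and rollout, the log keys, and the interpretation of horizon are all hypothetical and are not taken from the library.

```python
from typing import Iterator, List, Tuple


class CountdownEnv:
    """Toy environment (an assumption): the episode ends after `length` steps."""

    def __init__(self, length: int):
        self.length = length
        self.t = 0

    def reset(self) -> int:
        self.t = 0
        return self.t

    def step(self, action: int) -> Tuple[int, float, bool, dict]:
        self.t += 1
        return self.t, 1.0, self.t >= self.length, {}


def rollout(env, s0: int, steps: int) -> Tuple[List[float], int, dict]:
    """Hypothetical stand-in for Agent.interact: returns rewards, last state, logs."""
    rewards: List[float] = []
    s, done = s0, False
    for _ in range(steps):
        s, r, done, _info = env.step(0)  # fixed action, for illustration only
        rewards.append(r)
        if done:
            break
    return rewards, s, {"done": done}


def learn(env, episodes: int, horizon: int) -> Iterator[dict]:
    """Yield one log dictionary per episode (assumed granularity)."""
    if horizon < 1:
        raise ValueError("horizon must be a positive number of steps")
    for episode in range(episodes):
        s = env.reset()
        done = False
        total = 0.0
        while not done:
            rewards, s, logs = rollout(env, s, horizon)
            total += sum(rewards)
            done = logs["done"]
            # A learning update on the collected batch would go here.
        yield {"episode": episode, "return": total}


training_logs = list(learn(CountdownEnv(length=5), episodes=2, horizon=2))
```

Because the function is a generator, a caller can consume logs lazily, for example printing or plotting each dictionary as soon as an episode finishes.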