Agent

class Agent(feature_extractor, behavior_policy: pandemonium.policies.policy.Policy, horde: pandemonium.horde.Horde)

Bases: object

Methods Summary

`interact`(self, env, …)	Perform a number of steps in the environment.
`learn`(self, env, …)	Learn by interaction with the environment.

Methods Documentation

interact(self, env: Union[gym_minigrid.minigrid.MiniGridEnv, pandemonium.envs.dm_lab.dm_env.DeepmindLabEnv], s0: torch.Tensor, steps: int) → Tuple[List[ForwardRef('Transition')], torch.Tensor, dict]

Perform a number of steps in the environment.

Stops early if the episode is over.

Parameters

env – environment in which to take actions
s0 – starting state of the environment
steps – number of steps to interact for

Returns

A tuple containing the list of collected transitions, a tensor, and a dictionary of logs.

Todo: add parallel experience collection
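The behaviour described above (take up to `steps` actions, stop early when the episode ends, return the transitions together with logs) can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the library's implementation: `Transition`, `CountdownEnv`, the `policy` argument, and the contents of the log dictionary are hypothetical, and returning the last reached state as the second tuple element is an assumption based only on the return type.

```python
from typing import Any, Callable, Dict, List, NamedTuple, Tuple


class Transition(NamedTuple):
    """Hypothetical transition record; the real class may differ."""
    s0: Any
    a: Any
    r: float
    s1: Any
    done: bool


class CountdownEnv:
    """Hypothetical toy environment: the episode ends after `length` steps."""

    def __init__(self, length: int) -> None:
        self.length = length
        self.t = 0

    def reset(self) -> int:
        self.t = 0
        return self.t

    def step(self, action: int) -> Tuple[int, float, bool, dict]:
        self.t += 1
        done = self.t >= self.length
        return self.t, 1.0, done, {}


def interact(
    env: CountdownEnv,
    s0: Any,
    steps: int,
    policy: Callable[[Any], Any],
) -> Tuple[List[Transition], Any, Dict[str, Any]]:
    """Collect at most `steps` transitions, stopping early at episode end."""
    transitions: List[Transition] = []
    logs: Dict[str, Any] = {"done": False}
    s = s0
    for _ in range(steps):
        a = policy(s)
        s1, r, done, _info = env.step(a)
        transitions.append(Transition(s, a, r, s1, done))
        s = s1
        if done:
            # Episode is over: stop before the step budget is used up.
            logs["done"] = True
            break
    # Assumption: the second element is the last state that was reached.
    return transitions, s, logs
```

With a three-step episode and a budget of ten steps, the loop stops after three transitions, which is the early-stopping case mentioned in the description.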

learn(self, env: Union[gym_minigrid.minigrid.MiniGridEnv, pandemonium.envs.dm_lab.dm_env.DeepmindLabEnv], episodes: int, horizon: int) → Iterator[dict]

Learn by interaction with the environment.
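Since `learn` returns an `Iterator[dict]`, one plausible reading is a generator that runs `episodes` episodes, collects experience in rollouts of at most `horizon` steps, and yields a dictionary of logs as it goes. The sketch below illustrates that pattern only. Whether the real method yields once per episode or once per rollout is not stated in the source, and `CountdownEnv`, the fixed action `0`, and the log keys are all hypothetical.

```python
from typing import Any, Dict, Iterator, Tuple


class CountdownEnv:
    """Hypothetical toy environment: the episode ends after `length` steps."""

    def __init__(self, length: int) -> None:
        self.length = length
        self.t = 0

    def reset(self) -> int:
        self.t = 0
        return self.t

    def step(self, action: int) -> Tuple[int, float, bool, dict]:
        self.t += 1
        done = self.t >= self.length
        return self.t, 1.0, done, {}


def learn(env: CountdownEnv, episodes: int, horizon: int) -> Iterator[Dict[str, Any]]:
    """Yield one log dictionary per episode (an assumption, see lead-in)."""
    for episode in range(episodes):
        env.reset()
        done = False
        total_steps = 0
        updates = 0
        while not done:
            # Collect at most `horizon` steps, stopping early at episode end.
            for _ in range(horizon):
                _s, _r, done, _info = env.step(0)
                total_steps += 1
                if done:
                    break
            # Placeholder for a learning update on the collected rollout.
            updates += 1
        yield {"episode": episode, "steps": total_steps, "updates": updates}
```

Because the function is a generator, nothing runs until the caller iterates over it, for example with `for logs in learn(env, episodes=2, horizon=2): ...`.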