Agent
class Agent(feature_extractor, behavior_policy: pandemonium.policies.policy.Policy, horde: pandemonium.horde.Horde)
Bases: object
Methods Summary
interact(self, env, …): Perform a number of steps in the environment.
learn(self, env, …): Learn by interacting with the environment.
Methods Documentation
interact(self, env: Union[gym_minigrid.minigrid.MiniGridEnv, pandemonium.envs.dm_lab.dm_env.DeepmindLabEnv], s0: torch.Tensor, steps: int) → Tuple[List[ForwardRef('Transition')], torch.Tensor, dict]
Perform up to `steps` steps in the environment, stopping early if the episode ends.
Parameters
  env – environment in which to take actions
  s0 – starting state of the environment
  steps – number of steps to interact for
Returns
  A tuple of the collected transitions, a torch.Tensor, and a dictionary of logs
Todo: add parallel experience collection
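The contract described for interact (take at most `steps` actions, stop early when the episode ends, return the transitions together with a final value and a log dictionary) can be illustrated with a small standalone sketch. This is not the library's implementation: `ToyChainEnv`, the `Transition` tuple, the free function `interact`, and the `policy` argument are all hypothetical stand-ins, and states are plain integers rather than torch.Tensor. It also assumes the second returned value is the last observed state, which the signature does not confirm.

```python
from typing import List, NamedTuple, Tuple


class Transition(NamedTuple):
    """Hypothetical stand-in for the library's Transition record."""
    s0: int
    a: int
    r: float
    s1: int
    done: bool


class ToyChainEnv:
    """Hypothetical 1-D chain: the episode ends on reaching state `length`."""

    def __init__(self, length: int = 5):
        self.length = length
        self.state = 0

    def reset(self) -> int:
        self.state = 0
        return self.state

    def step(self, action: int) -> Tuple[int, float, bool, dict]:
        # Action 1 moves one cell to the right, action 0 stays in place.
        self.state = min(self.state + action, self.length)
        done = self.state == self.length
        return self.state, float(done), done, {}


def interact(env, s0: int, steps: int, policy) -> Tuple[List[Transition], int, dict]:
    """Collect up to `steps` transitions, stopping early when the episode ends."""
    transitions, s, done = [], s0, False
    for _ in range(steps):
        a = policy(s)
        s1, r, done, _ = env.step(a)
        transitions.append(Transition(s, a, r, s1, done))
        s = s1
        if done:
            break  # early stop: the episode is over
    logs = {"steps": len(transitions), "done": done}
    # Assumption: the second element is the last observed state.
    return transitions, s, logs


env = ToyChainEnv(length=5)
trajectory, last_state, logs = interact(env, env.reset(), steps=10, policy=lambda s: 1)
print(logs)
```

Here the always-move-right policy reaches the end of the chain in 5 steps, so the call returns before the 10-step budget is spent.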
learn(self, env: Union[gym_minigrid.minigrid.MiniGridEnv, pandemonium.envs.dm_lab.dm_env.DeepmindLabEnv], episodes: int, horizon: int) → Iterator[dict]
Learn by interacting with the environment.
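Because learn returns an Iterator[dict], the caller drives it by iterating over the result. The sketch below shows one way such a generator could be structured and consumed. It is only an illustration under assumptions: the free function `learn`, the `run_rollout` callable, and `make_fake_rollout` are hypothetical, and the reading of `horizon` as the maximum number of steps per rollout, with one log dictionary yielded per rollout, is a guess rather than documented behavior.

```python
from typing import Callable, Iterator


def learn(run_rollout: Callable[[int], dict], episodes: int, horizon: int) -> Iterator[dict]:
    """Hypothetical generator: yield one log dict per rollout of at most `horizon` steps."""
    for episode in range(episodes):
        done = False
        while not done:
            logs = run_rollout(horizon)
            done = logs["done"]
            yield {"episode": episode, **logs}


def make_fake_rollout(episode_length: int) -> Callable[[int], dict]:
    """Hypothetical rollout source: every episode lasts exactly `episode_length` steps."""
    remaining = episode_length

    def run_rollout(horizon: int) -> dict:
        nonlocal remaining
        taken = min(horizon, remaining)
        remaining -= taken
        done = remaining == 0
        if done:
            remaining = episode_length  # the next call starts a fresh episode
        return {"steps": taken, "done": done}

    return run_rollout


# Nothing runs until the iterator is consumed; list() drives it to completion.
history = list(learn(make_fake_rollout(episode_length=5), episodes=2, horizon=3))
for entry in history:
    print(entry)
```

With the real class, the equivalent consumption pattern would presumably be a loop of the form `for logs in agent.learn(env, episodes=..., horizon=...)`, which lets the caller record metrics or stop training between yields.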