Cumulant

class Cumulant

Bases: object

A signal of interest \(z\) accumulated over time

The classical example of a cumulant is a reward from MDP environment. Cumulant is not always maximized; most of the time we are interested in predicting the signal. Cumulants can also be vector valued, i.e. when we want to learn to predict features of the environment.

Some interesting cumulants are the ones that are tracking a metric in the agent itself. In this way we can express intrinsic motivation as a cumulant. For example, we might want to track confidence of the agent in its own prediction by using rolling average \(\TD\) error.

Methods Summary

__call__(self, \*args, \*\*kwargs)

Call self as a function.

Methods Documentation

__call__(self, \*args, \*\*kwargs)

Call self as a function.