EligibilityTrace

class EligibilityTrace(λ, trace_dim)

Bases: object

Base class for various eligibility traces in \(\TD\) learners.

An eligibility trace is a short-term memory mechanism, represented mathematically as a vector \(e_t \in \mathbb{R}^d\) that parallels the long-term weight vector \(w_t \in \mathbb{R}^d\).

The rough intuition is that when a component of \(w_t\) participates in producing an estimated value, then the corresponding component of \(e_t\) is bumped up and then begins to fade away. Learning will then occur in that component of \(w_t\) if a nonzero \(\TD\) error occurs before the trace falls back to zero.

The trace-decay parameter \(\lambda \in [0, 1]\) determines the rate at which the trace decays.
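
The bump-and-fade behaviour described above can be illustrated with a minimal sketch of an accumulating trace under linear function approximation. It does not use this class or reflect its actual implementation: the function names `accumulating_trace_step` and `td_lambda_update`, the discount factor `gamma`, and the step size `alpha` are all assumptions introduced for illustration only.

```python
import numpy as np


def accumulating_trace_step(e, x, lam, gamma):
    """Decay the trace by gamma * lam, then bump the components active in x.

    Hypothetical helper: e is the trace vector, x the feature vector of the
    current state (the gradient of a linear value estimate w @ x).
    """
    return gamma * lam * e + x


def td_lambda_update(w, e, x, x_next, reward, alpha, lam, gamma):
    """One TD(lambda)-style step with an accumulating trace (illustrative)."""
    e = accumulating_trace_step(e, x, lam, gamma)
    # TD error for a linear value estimate v(s) = w @ x(s).
    delta = reward + gamma * (w @ x_next) - (w @ x)
    # Every component with a nonzero trace receives part of the update.
    w = w + alpha * delta * e
    return w, e


# A component is bumped once, then fades geometrically when inactive.
e = np.zeros(3)
e = accumulating_trace_step(e, np.array([1.0, 0.0, 0.0]), lam=0.5, gamma=1.0)
e = accumulating_trace_step(e, np.zeros(3), lam=0.5, gamma=1.0)
print(e)
```

With \(\lambda = 0\) the trace keeps no memory of earlier features, so only the current features are updated; values closer to 1 let a later \(\TD\) error reach components that were active several steps earlier.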

Methods Summary

__call__(self, *args, **kwargs)

Call self as a function.

Methods Documentation

__call__(self, *args, **kwargs)

Call self as a function.