VPG

class VPG(entropy_coefficient: float = 0.01, *args, **kwargs)

Bases: pandemonium.policies.gradient.DiffPolicy

Vanilla Policy Gradient

Methods Summary

delta(self, features, actions, weights)

dist(self, features, *args, **kwargs)

Produces a distribution over actions

Methods Documentation

delta(self, features, actions, weights)

dist(self, features, *args, **kwargs) → torch.distributions.categorical.Categorical

Produces a distribution over actions
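
The page above gives only signatures, so the following is a minimal sketch of how a policy with this interface could behave. It is not pandemonium's implementation. The class name TinyVPG, the linear layer, the feature_dim and n_actions parameters, and the exact form of the loss are all assumptions made for illustration. In particular, it assumes that delta returns a scalar loss built from the weighted log-probabilities of the taken actions, minus an entropy bonus scaled by entropy_coefficient.

```python
import torch
from torch import nn
from torch.distributions import Categorical


class TinyVPG(nn.Module):
    """Hypothetical stand-in mirroring the VPG interface (illustration only)."""

    def __init__(self, feature_dim: int, n_actions: int,
                 entropy_coefficient: float = 0.01):
        super().__init__()
        self.entropy_coefficient = entropy_coefficient
        # Assumption: a single linear layer maps features to action logits.
        self.logits = nn.Linear(feature_dim, n_actions)

    def dist(self, features: torch.Tensor) -> Categorical:
        # Produces a distribution over actions, one per row of `features`.
        return Categorical(logits=self.logits(features))

    def delta(self, features: torch.Tensor, actions: torch.Tensor,
              weights: torch.Tensor) -> torch.Tensor:
        # Assumption: `weights` are per-sample returns or advantages.
        d = self.dist(features)
        policy_loss = -(d.log_prob(actions) * weights).mean()
        entropy = d.entropy().mean()
        # Subtracting the entropy term rewards more exploratory policies.
        return policy_loss - self.entropy_coefficient * entropy


torch.manual_seed(0)
policy = TinyVPG(feature_dim=4, n_actions=3)
features = torch.randn(5, 4)       # batch of 5 feature vectors
d = policy.dist(features)          # Categorical with batch shape (5,)
actions = d.sample()               # one sampled action index per row
weights = torch.ones(5)            # placeholder weights for the sketch
loss = policy.delta(features, actions, weights)
loss.backward()                    # gradients flow into policy.logits
```

In this sketch, a larger entropy_coefficient pushes the policy toward a more uniform action distribution, while a value of 0 leaves only the weighted log-probability term.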