Reward_mixins

Pre-built reward handlers.

class conformer_rl.environments.environment_components.reward_mixins.GibbsRewardMixin

Bases: object

Implements the Gibbs Score reward [1], except that the similarity between conformers is judged by whether they were produced by the same action input rather than by TFD (Torsional Fingerprint Deviation).

References

[1] TorsionNet paper

_reward() → float

Notes

Logged parameters:

  • energy (float): the energy of the current conformer

  • repeat (int): total number of repeated actions so far in the episode
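
The repeat-based rule described above can be illustrated with a minimal sketch. It assumes the Gibbs score of a conformer with energy E has the form exp(-(E - E0) / tau) / Z0 and that a repeated action earns zero reward. The class RepeatGibbsReward, the parameters e0, z0, and tau, and the zero-reward rule are hypothetical names and choices made for illustration; they are not taken from the library's implementation.

```python
import math


def gibbs_weight(energy, e0, z0, tau):
    """Assumed Gibbs score of one conformer: exp(-(E - E0) / tau) / Z0."""
    return math.exp(-(energy - e0) / tau) / z0


class RepeatGibbsReward:
    """Hypothetical helper: treats two conformers as identical
    when they were produced by the same action input."""

    def __init__(self, e0, z0, tau):
        self.e0, self.z0, self.tau = e0, z0, tau
        self.seen_actions = set()  # actions already taken this episode
        self.repeats = 0           # running count of repeated actions

    def reward(self, action, energy):
        key = tuple(action)  # make the action hashable
        if key in self.seen_actions:
            # Assumption: a repeated action contributes no new reward.
            self.repeats += 1
            return 0.0
        self.seen_actions.add(key)
        return gibbs_weight(energy, self.e0, self.z0, self.tau)
```

In this sketch the counter repeats plays the role of the logged parameter of the same name: it only grows when an action that was already seen is taken again.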

class conformer_rl.environments.environment_components.reward_mixins.GibbsEndPruningRewardMixin

Bases: object

Implements the Gibbs Score reward [1], except that overly similar conformers are pruned only at the end of an episode, so pruning is reflected only in the final reward of each episode.

_reward() → float

Notes

Logged parameters:

  • energy (float): the energy of the current conformer
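
One possible reading of end-of-episode pruning is sketched below. It assumes every step first receives the unpruned Gibbs score, and that the final reward is then adjusted so the episode total equals the score of the conformers that survive pruning. The functions prune and episode_rewards, the greedy lowest-energy-first pruning rule, and the default normalizers are all hypothetical; the library's actual rule may differ.

```python
import math


def gibbs_weight(energy, e0=0.0, z0=1.0, tau=1.0):
    """Assumed Gibbs score of one conformer: exp(-(E - E0) / tau) / Z0."""
    return math.exp(-(energy - e0) / tau) / z0


def prune(conformers, distance, threshold):
    """Greedy pruning: visit (geometry, energy) pairs from lowest to highest
    energy and keep a conformer only if it is farther than `threshold`
    from every conformer kept so far."""
    kept = []
    for geom, energy in sorted(conformers, key=lambda c: c[1]):
        if all(distance(geom, k_geom) > threshold for k_geom, _ in kept):
            kept.append((geom, energy))
    return kept


def episode_rewards(conformers, distance, threshold):
    """Per-step rewards for one episode, with pruning applied only at the end."""
    if not conformers:
        return []
    # Every step is first rewarded as if no pruning took place.
    rewards = [gibbs_weight(energy) for _, energy in conformers]
    # At the end of the episode, prune and correct only the final reward.
    kept = prune(conformers, distance, threshold)
    pruned_total = sum(gibbs_weight(energy) for _, energy in kept)
    rewards[-1] = pruned_total - sum(rewards[:-1])
    return rewards
```

Here a geometry can be any object the supplied distance function understands; with a single torsion angle per conformer, the absolute difference of two floats is enough to try the sketch out.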

class conformer_rl.environments.environment_components.reward_mixins.GibbsPruningRewardMixin

Bases: object

Implements the Gibbs Score reward [1].

_reward() → float

Notes

Logged parameters:

  • energy (float): the energy of the current conformer
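
A minimal sketch of a per-step pruning reward follows. It assumes a new conformer earns its Gibbs score only when it is farther than a threshold from every conformer kept so far, and earns zero otherwise. The class StepPruningReward, its distance and threshold arguments, and the zero-reward rule are hypothetical stand-ins for a TFD-based comparison; the library's actual rule may differ.

```python
import math


def gibbs_weight(energy, e0=0.0, z0=1.0, tau=1.0):
    """Assumed Gibbs score of one conformer: exp(-(E - E0) / tau) / Z0."""
    return math.exp(-(energy - e0) / tau) / z0


class StepPruningReward:
    """Hypothetical helper: prunes overly similar conformers at every step."""

    def __init__(self, distance, threshold):
        self.distance = distance    # callable standing in for a TFD-like metric
        self.threshold = threshold  # conformers at or below this distance count as duplicates
        self.kept = []              # geometries retained so far in the episode

    def reward(self, conformer, energy):
        for other in self.kept:
            if self.distance(conformer, other) <= self.threshold:
                # Assumption: a near-duplicate is pruned and earns nothing.
                return 0.0
        self.kept.append(conformer)
        return gibbs_weight(energy)
```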

class conformer_rl.environments.environment_components.reward_mixins.GibbsLogPruningRewardMixin

Bases: conformer_rl.environments.environment_components.reward_mixins.GibbsPruningRewardMixin

Implements the log of the Gibbs Score reward [1].

_reward() → float

Notes

Logged parameters:

  • energy (float): the energy of the current conformer
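
Taking the log of the Gibbs score can be sketched as follows, again assuming the score has the form exp(-(E - E0) / tau) / Z0. Under that assumption the log simplifies to -(E - E0) / tau - log(Z0), which avoids computing a possibly tiny exponential first. The function log_gibbs_weight and its parameters are hypothetical, and the sketch does not cover how the library treats pruned conformers.

```python
import math


def log_gibbs_weight(energy, e0, z0, tau):
    """Log of the assumed Gibbs score exp(-(E - E0) / tau) / Z0,
    computed in closed form without evaluating the exponential."""
    return -(energy - e0) / tau - math.log(z0)
```

A log-scaled reward of this kind changes linearly with energy, so high-energy conformers still produce a distinguishable signal rather than a value that rounds to zero.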