Reward_mixins
Pre-built reward handlers.
- class conformer_rl.environments.environment_components.reward_mixins.GibbsRewardMixin
Bases: object
Implements the Gibbs Score reward [1], except that the distance between conformers is judged by whether they were produced by the same action input rather than by TFD (Torsional Fingerprint Deviation).
References
- _reward() float
Notes
Logged parameters:
energy (float): the energy of the current conformer
repeat (int): total number of repeated actions so far in the episode
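The repeat-detection idea described for this mixin can be illustrated with a small standalone sketch. It is not the conformer_rl implementation: the class name `RepeatAwareGibbsReward`, the parameters `e0`, `z0`, and `tau`, and the Boltzmann-weight form of the reward are all assumptions made for illustration.

```python
import math


class RepeatAwareGibbsReward:
    """Hypothetical stand-in for illustration; not the conformer_rl class."""

    def __init__(self, e0: float, z0: float, tau: float):
        # e0, z0, tau are assumed normalizing constants (reference energy,
        # partition-function estimate, temperature-like scale).
        self.e0, self.z0, self.tau = e0, z0, tau
        self.seen_actions = set()
        self.repeat = 0  # mirrors the logged "repeat" counter

    def reward(self, action, energy: float) -> float:
        key = tuple(action)
        if key in self.seen_actions:
            # Same action input as before: treat as a duplicate conformer.
            self.repeat += 1
            return 0.0
        self.seen_actions.add(key)
        # Assumed Gibbs-style weight of the new conformer.
        return math.exp(-(energy - self.e0) / self.tau) / self.z0


scorer = RepeatAwareGibbsReward(e0=0.0, z0=1.0, tau=1.0)
first = scorer.reward([0, 1, 2], energy=0.0)
second = scorer.reward([0, 1, 2], energy=0.0)  # repeated action
```

Comparing action tuples is much cheaper than computing a structural distance, at the cost of missing distinct actions that lead to near-identical conformers.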
- class conformer_rl.environments.environment_components.reward_mixins.GibbsEndPruningRewardMixin
Bases: object
Implements the Gibbs Score reward [1], except that overly similar conformers are pruned only at the end of an episode, so pruning is reflected only in the final reward of each episode.
- _reward() float
Notes
Logged parameters:
energy (float): the energy of the current conformer
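One way to picture end-of-episode pruning is sketched below. This is an assumption about how such a scheme could work, not the library's code: the function `end_pruned_rewards`, its `is_duplicate` flags, the constants `e0`, `z0`, `tau`, and the choice to subtract the duplicates' credit from the last reward are all hypothetical.

```python
import math


def end_pruned_rewards(energies, is_duplicate, e0, z0, tau):
    """Per-step rewards where duplicates are only accounted for at the end.

    energies: energy of the conformer produced at each step.
    is_duplicate: flags, decided after the episode, marking conformers
        that a pruning pass would discard as overly similar.
    """
    if not energies:
        return []
    # Assumed Gibbs-style weight for every step, credited unconditionally.
    weights = [math.exp(-(e - e0) / tau) / z0 for e in energies]
    rewards = list(weights)
    # At episode end, take back the credit given to pruned conformers.
    penalty = sum(w for w, dup in zip(weights, is_duplicate) if dup)
    rewards[-1] -= penalty
    return rewards


rewards = end_pruned_rewards(
    energies=[0.0, 0.0, 0.0],
    is_duplicate=[False, True, False],
    e0=0.0, z0=1.0, tau=1.0,
)
```

In this sketch the episode total equals the summed weight of the surviving conformers, while intermediate steps never pay for a similarity check.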
- class conformer_rl.environments.environment_components.reward_mixins.GibbsPruningRewardMixin
Bases: object
Implements the Gibbs Score reward [1].
- _reward() float
Notes
Logged parameters:
energy (float): the energy of the current conformer
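For contrast with the action-based variant, the following is a minimal sketch of per-step pruning with a structural distance. It assumes a Boltzmann-weight reward and takes the distance as a plain callable; `pruning_gibbs_reward`, `max_abs_diff`, `threshold`, `e0`, `z0`, and `tau` are hypothetical names, and the toy distance merely stands in for a TFD computation.

```python
import math


def pruning_gibbs_reward(new_conf, new_energy, kept, distance,
                         threshold, e0, z0, tau):
    """Return a Gibbs-style reward, or 0.0 if new_conf is too close to a kept one."""
    if any(distance(new_conf, old) < threshold for old in kept):
        return 0.0  # pruned: overly similar to an existing conformer
    kept.append(new_conf)
    return math.exp(-(new_energy - e0) / tau) / z0


def max_abs_diff(a, b):
    # Toy distance on torsion-angle lists; a placeholder for TFD.
    return max(abs(x - y) for x, y in zip(a, b))


kept = []
r1 = pruning_gibbs_reward([0.0, 1.0], 0.0, kept, max_abs_diff,
                          threshold=0.1, e0=0.0, z0=1.0, tau=1.0)
r2 = pruning_gibbs_reward([0.0, 1.05], 0.0, kept, max_abs_diff,
                          threshold=0.1, e0=0.0, z0=1.0, tau=1.0)
```

Here the second conformer lies within the threshold of the first, so it earns nothing and is not added to `kept`.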
- class conformer_rl.environments.environment_components.reward_mixins.GibbsLogPruningRewardMixin
Bases: conformer_rl.environments.environment_components.reward_mixins.GibbsPruningRewardMixin
Implements the log of the Gibbs Score reward [1].
- _reward() float
Notes
Logged parameters:
energy (float): the energy of the current conformer