Conformer_env

class conformer_rl.environments.conformer_env.ConformerEnv(mol_config: conformer_rl.config.mol_config.MolConfig)

Base interface for building conformer generation environments.

Parameters

mol_config (MolConfig) – Configuration object specifying molecule and parameters to be used in the environment.

config

Configuration object specifying molecule and parameters to be used in the environment.

Type

MolConfig

total_reward

Keeps track of the total reward for the current episode.

Type

float

current_step

Keeps track of the number of elapsed steps in the current episode.

Type

int

step_info

Used for keeping track of data obtained at each step of an episode for logging.

Type

dict from str to list

episode_info

Used for keeping track of data useful at the end of an episode, such as total_reward, for logging.

Type

dict from str to Any

step(action: Any) → Tuple[object, float, bool, dict]

Simulates one iteration of the environment.

Applies the input action to the environment, then computes the current observation, reward, done flag, and info.

Parameters

action (Any, depending on implementation of _step()) – The action to be taken by the environment.

Returns

  • obs (Any, depending on implementation of _obs()) – An object reflecting the current configuration of the environment/molecule.

  • reward (float) – The reward calculated given the current configuration of the environment.

  • done (bool) – Whether or not the current episode has finished.

  • info (dict) – Information about the current step and episode of the environment, to be used for logging.

Notes

Logged parameters:

  • reward (float): the reward for the current step
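The flow described for step() can be pictured as a template method that delegates to the hooks _step(), _obs(), _reward(), _done(), and _info(). What follows is a minimal sketch under stated assumptions, not the library's implementation: the class name SketchEnv, the max_steps attribute, the constant reward, the inclusive step limit, the keys of the returned info dict, and the exact order of calls are all hypothetical and chosen only for illustration.

```python
from typing import Any, Dict, Tuple


class SketchEnv:
    """Hypothetical stand-in mirroring the documented step() contract."""

    def __init__(self, max_steps: int = 2) -> None:
        self.max_steps = max_steps  # assumed attribute name
        self.total_reward = 0.0
        self.current_step = 0
        self.step_info: Dict[str, Any] = {}
        self.episode_info: Dict[str, Any] = {}

    def step(self, action: Any) -> Tuple[object, float, bool, dict]:
        self._step(action)  # apply the action to the environment
        self.current_step += 1
        obs = self._obs()
        reward = self._reward()
        self.step_info["reward"] = reward  # logged parameter listed in the Notes
        self.total_reward += reward
        done = self._done()
        info = self._info()
        return obs, reward, done, info

    # Placeholder hooks; a real environment would act on a molecule here.
    def _step(self, action: Any) -> None:
        self.last_action = action

    def _obs(self) -> object:
        return self.last_action

    def _reward(self) -> float:
        return 1.0

    def _done(self) -> bool:
        # Whether the limit is inclusive is an assumption of this sketch.
        return self.current_step >= self.max_steps

    def _info(self) -> Dict[str, Dict[str, Any]]:
        self.episode_info["total_reward"] = self.total_reward
        # The key names are assumptions of this sketch.
        return {"episode_info": self.episode_info, "step_info": self.step_info}


env = SketchEnv(max_steps=2)
done = False
while not done:
    obs, reward, done, info = env.step("rotate")
print(env.current_step, info["episode_info"]["total_reward"])
```

Keeping step() fixed and overriding only the underscore-prefixed hooks lets a subclass change the action handling, observation, or reward independently of the bookkeeping.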

reset() → object

Resets the environment and returns the observation of the environment.

_step(action: Any) → None

Does not modify the molecule.

Notes

Logged parameters:

  • conf: the currently generated conformer, which is saved to the episodic mol object.

_obs() → rdkit.Chem.AllChem.rdchem.Mol

Returns the current molecule.

_reward() → float

Returns \(e^{-\mathrm{energy}}\), where \(\mathrm{energy}\) is the energy of the current conformer of the molecule.

Notes

Logged parameters:

  • energy (float): the energy of the current conformer
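As a worked illustration of the formula above, the sketch below evaluates \(e^{-\mathrm{energy}}\) for a few made-up energy values. The function name energy_to_reward and the sample energies are hypothetical; how the energy itself is obtained from the conformer is not covered here.

```python
import math


def energy_to_reward(energy: float) -> float:
    """Return e^(-energy); lower energies map to larger rewards."""
    return math.exp(-energy)


# Made-up energies, used only to show the shape of the mapping.
for energy in (0.0, 1.0, 5.0):
    print(f"energy={energy:4.1f} -> reward={energy_to_reward(energy):.4f}")
```

The reward is always positive and decreases monotonically as the energy grows, so lower-energy conformers are rewarded more.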

_done() → bool

Returns True if the number of elapsed steps in the current episode has exceeded the maximum number of steps per episode.

_info() → Mapping[str, Mapping[str, Any]]

Returns a dict wrapping episode_info and step_info.

Notes

Logged parameters:

  • total_reward (float): the updated total reward of the episode
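The nested mapping returned by _info() might look like the following sketch. The helper wrap_info, the outer keys "episode_info" and "step_info", and the sample values are assumptions made for illustration and are not confirmed by this page.

```python
from typing import Any, Dict, Mapping


def wrap_info(
    episode_info: Dict[str, Any],
    step_info: Dict[str, Any],
    total_reward: float,
) -> Mapping[str, Mapping[str, Any]]:
    """Record total_reward in episode_info, then wrap both dicts."""
    episode_info["total_reward"] = total_reward
    # The outer key names are assumptions of this sketch.
    return {"episode_info": episode_info, "step_info": step_info}


# Made-up values standing in for data collected during an episode.
info = wrap_info({}, {"energy": 3.2, "reward": 0.04}, total_reward=1.5)
for group, values in info.items():
    for key, value in values.items():
        print(f"{group}/{key} = {value}")
```

A logger can walk the two levels of this mapping without knowing in advance which keys an environment chooses to record.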