Curriculum Conformer_env

class conformer_rl.environments.curriculum_conformer_env.CurriculumConformerEnv(mol_configs: List[conformer_rl.config.mol_config.MolConfig])

Bases: conformer_rl.environments.conformer_env.ConformerEnv

Base interface for building conformer generation environments with support for curriculum learning.

Parameters: mol_configs (list of MolConfig) – List of configuration objects specifying the molecules, and their corresponding parameters, to be trained on as part of the curriculum. The list should be sorted in order of increasing task difficulty.

configs

Configuration objects specifying molecules and corresponding parameters to be used in the environment, in the order of the designated curriculum (ordered from least to most difficult).

Type: list of MolConfig

total_reward

Keeps track of the total reward for the current episode.

Type: float

current_step

Keeps track of the number of elapsed steps in the current episode.

Type: int

step_info

Used for keeping track of data obtained at each step of an episode for logging.

Type: dict from str to list

episode_info

Used for keeping track of data useful at the end of an episode, such as total_reward, for logging.

Type: dict from str to Any

curriculum_max_index

One plus the maximum index of the molecules/tasks in the input list of mol_configs that can currently be selected for training. In other words, only tasks whose index is less than curriculum_max_index are eligible. This attribute is increased as the agent gets better at the current tasks in the curriculum and is ready to move on to more difficult tasks.

Type: int
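Because curriculum_max_index is "one plus the maximum index", it acts as an exclusive upper bound on the selectable tasks. The following is a minimal sketch of that bound only, not the library's implementation: the helper names eligible_indices and sample_task are hypothetical, plain strings stand in for MolConfig objects, and uniform sampling is an assumption (the actual sampling scheme is not described here).

```python
import random
from typing import List, Sequence


def eligible_indices(num_tasks: int, curriculum_max_index: int) -> List[int]:
    """Return the task indices selectable at the current curriculum level.

    Hypothetical helper: curriculum_max_index is treated as an exclusive
    upper bound and is clamped to the range [1, num_tasks].
    """
    upper = max(1, min(curriculum_max_index, num_tasks))
    return list(range(upper))


def sample_task(configs: Sequence[str], curriculum_max_index: int,
                rng: random.Random) -> str:
    """Pick one eligible task.

    Assumption: tasks are drawn uniformly; the real environment may use a
    different distribution.
    """
    index = rng.choice(eligible_indices(len(configs), curriculum_max_index))
    return configs[index]


# Stand-ins for MolConfig objects, ordered from least to most difficult.
configs = ["easy", "medium", "hard", "hardest"]
rng = random.Random(0)

# With curriculum_max_index == 2, only indices 0 and 1 are eligible.
picked = sample_task(configs, curriculum_max_index=2, rng=rng)
```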

reset() → object: Resets the environment and returns the observation of the environment.

increase_level(): Updates the curriculum_max_index attribute after obtaining signal from the agent that a favorable reward threshold has been achieved.

decrease_level(): Updates the curriculum_max_index attribute after obtaining signal that the agent is performing poorly on the current curriclum range.