Curriculum Conformer_env

class conformer_rl.environments.curriculum_conformer_env.CurriculumConformerEnv(mol_configs: List[conformer_rl.config.mol_config.MolConfig])

Bases: conformer_rl.environments.conformer_env.ConformerEnv

Base interface for building conformer generation environments with support for curriculum learning.

Parameters

mol_configs (list of MolConfig) – List of configuration objects specifying the molecules, and their corresponding parameters, to be trained on as part of the curriculum. The list should be sorted in order of increasing task difficulty.
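To illustrate the ordering requirement, the sketch below sorts a list of stand-in configuration objects from easiest to hardest before they would be passed to the environment. `ToyMolConfig` and its `num_torsions` field are hypothetical placeholders, not part of conformer_rl, and using the torsion count as a difficulty measure is an assumption made only for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToyMolConfig:
    """Hypothetical stand-in for MolConfig; not the conformer_rl class."""
    name: str
    num_torsions: int  # assumed proxy for task difficulty


def sort_by_difficulty(configs: List[ToyMolConfig]) -> List[ToyMolConfig]:
    # Easiest task first, since the curriculum expects increasing difficulty.
    return sorted(configs, key=lambda c: c.num_torsions)


unsorted_configs = [
    ToyMolConfig("large", 8),
    ToyMolConfig("small", 2),
    ToyMolConfig("medium", 5),
]
curriculum = sort_by_difficulty(unsorted_configs)
```

In real use, the sorted list of actual MolConfig objects would be the `mol_configs` argument of the constructor.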

configs

Configuration objects specifying molecules and corresponding parameters to be used in the environment, in the order of the designated curriculum (ordered from least to most difficult).

Type

list of MolConfig

total_reward

Keeps track of the total reward for the current episode.

Type

float

current_step

Keeps track of the number of elapsed steps in the current episode.

Type

int

step_info

Used for keeping track of data obtained at each step of an episode for logging.

Type

dict from str to list

episode_info

Used for keeping track of data useful at the end of an episode, such as total_reward, for logging.

Type

dict from str to Any

curriculum_max_index

One plus the largest index in the input list of mol_configs from which a molecule/task can currently be selected for training; in other words, the exclusive upper bound on the selectable indices. This attribute is increased as the agent improves on the current tasks in the curriculum and is ready to move on to more difficult ones.

Type

int
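The role of curriculum_max_index as an exclusive upper bound can be sketched as follows. The helper `sample_task` is hypothetical, and uniform sampling is an assumption: this page does not specify how the environment chooses among the eligible tasks.

```python
import random
from typing import Sequence, TypeVar

T = TypeVar("T")


def sample_task(configs: Sequence[T], curriculum_max_index: int,
                rng: random.Random) -> T:
    # Only indices 0 .. curriculum_max_index - 1 are eligible.
    if not 1 <= curriculum_max_index <= len(configs):
        raise ValueError("curriculum_max_index must be in [1, len(configs)]")
    # Assumption: uniform choice among the eligible tasks.
    index = rng.randrange(curriculum_max_index)
    return configs[index]


tasks = ["easy", "medium", "hard", "hardest"]
rng = random.Random(0)
# With curriculum_max_index == 2, only the first two tasks can be drawn.
picked = {sample_task(tasks, 2, rng) for _ in range(100)}
```

Raising curriculum_max_index to 3 would make "hard" eligible as well, while "hardest" would stay locked.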

reset() → object

Resets the environment and returns the observation of the environment.

increase_level()

Updates the curriculum_max_index attribute after receiving a signal from the agent that a favorable reward threshold has been reached.

decrease_level()

Updates the curriculum_max_index attribute after receiving a signal that the agent is performing poorly on the current curriculum range.
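The interplay between increase_level() and decrease_level() can be sketched with a small stand-alone tracker. Everything here is an assumption for illustration: `ToyCurriculum` and `update_level` are hypothetical names, the starting value of 1, the step size of one, and the reward thresholds are all invented, and the update rule used by conformer_rl may differ.

```python
class ToyCurriculum:
    """Hypothetical tracker mirroring curriculum_max_index; not the library class."""

    def __init__(self, num_tasks: int) -> None:
        if num_tasks < 1:
            raise ValueError("need at least one task")
        self.num_tasks = num_tasks
        # Assumption: start with only the easiest task unlocked.
        self.curriculum_max_index = 1

    def increase_level(self) -> None:
        # Assumed rule: unlock one more task, never exceeding the list length.
        self.curriculum_max_index = min(self.curriculum_max_index + 1,
                                        self.num_tasks)

    def decrease_level(self) -> None:
        # Assumed rule: lock the hardest unlocked task, keeping at least one.
        self.curriculum_max_index = max(self.curriculum_max_index - 1, 1)


def update_level(curriculum: ToyCurriculum, mean_reward: float,
                 raise_at: float, lower_at: float) -> None:
    # Agent-side signal: compare recent performance to two assumed thresholds.
    if mean_reward >= raise_at:
        curriculum.increase_level()
    elif mean_reward <= lower_at:
        curriculum.decrease_level()


curriculum = ToyCurriculum(num_tasks=3)
history = []
for mean_reward in [0.9, 0.9, 0.9, 0.1, 0.5]:
    update_level(curriculum, mean_reward, raise_at=0.8, lower_at=0.2)
    history.append(curriculum.curriculum_max_index)
```

Clamping in both directions keeps the bound valid: it never exceeds the number of tasks and never drops below one, so at least the easiest task always remains selectable.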