Curriculum-Supported Agents
- class conformer_rl.agents.curriculum_agents.ExternalCurriculumAgentMixin(config)
Bases: object
General mixin class to enable curriculum learning.
Adds functionality to an existing agent for externally interacting with an environment supporting curriculum learning.
- Parameters
config (Config) – Configuration object for the agent. See notes for a list of config parameters used by this agent.
Notes
In addition to the config parameters required for the base agent class, use of this mixin requires the following additional parameters in the Config object:
curriculum_agent_buffer_len
curriculum_agent_reward_thresh
curriculum_agent_success_rate
curriculum_agent_fail_rate
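As an illustration of the four parameters listed above, the following minimal sketch sets them on a stand-in object and checks that none is missing. The `SimpleNamespace` stands in for the real Config class, the helper `missing_curriculum_params` is hypothetical, and the numeric values are arbitrary placeholders rather than recommended settings.

```python
from types import SimpleNamespace

# Parameter names taken from the notes above.
REQUIRED_CURRICULUM_PARAMS = (
    "curriculum_agent_buffer_len",
    "curriculum_agent_reward_thresh",
    "curriculum_agent_success_rate",
    "curriculum_agent_fail_rate",
)


def missing_curriculum_params(config):
    """Hypothetical helper: return the required names not set on `config`."""
    return [name for name in REQUIRED_CURRICULUM_PARAMS if not hasattr(config, name)]


# Stand-in for a Config object (assumption: the real Config is a plain
# attribute container; the values below are illustrative only).
config = SimpleNamespace()
config.curriculum_agent_buffer_len = 20      # episodes per evaluation window
config.curriculum_agent_reward_thresh = 0.7  # reward that counts as a success
config.curriculum_agent_success_rate = 0.8   # above this ratio: harder level
config.curriculum_agent_fail_rate = 0.2      # below this ratio: easier level

print(missing_curriculum_params(config))
```

Keeping the fail rate below the success rate leaves a band of ratios in which the level is left unchanged.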
- step() None
Performs one iteration of acquiring samples on the environment and then trains on the acquired samples.
- update_curriculum() None
Evaluates the current performance of the agent and signals the environment to increase the level (difficulty) or decrease it depending on the agent’s performance.
The agent is evaluated only once the number of episodes elapsed since the last evaluation has exceeded the curriculum_agent_buffer_len parameter assigned in the Config object. During the evaluation, the agent calculates the ratio of episodes (out of the last curriculum_agent_buffer_len episodes) whose reward exceeds the curriculum_agent_reward_thresh parameter defined in the Config. If this ratio exceeds the curriculum_agent_success_rate parameter, the environment is signaled to increase the difficulty of the curriculum by calling the increase_level method of the environment. If the ratio is less than the curriculum_agent_fail_rate parameter, the environment is signaled to decrease the difficulty.
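The evaluation rule described for update_curriculum() can be sketched in a few lines. This is a simplified stand-alone illustration, not the library's implementation: `ToyCurriculumEnv` and `CurriculumTracker` are hypothetical names, the `decrease_level` method is assumed by analogy with `increase_level`, and the exact comparison used to decide when an evaluation is due is also an assumption.

```python
from collections import deque


class ToyCurriculumEnv:
    """Hypothetical stand-in for an environment that supports curriculum levels."""

    def __init__(self):
        self.level = 0

    def increase_level(self):
        self.level += 1

    def decrease_level(self):  # assumed counterpart of increase_level
        self.level = max(0, self.level - 1)


class CurriculumTracker:
    """Sketch of the bookkeeping behind update_curriculum()."""

    def __init__(self, env, buffer_len, reward_thresh, success_rate, fail_rate):
        self.env = env
        self.buffer_len = buffer_len
        self.reward_thresh = reward_thresh
        self.success_rate = success_rate
        self.fail_rate = fail_rate
        self.rewards = deque(maxlen=buffer_len)  # keeps only the last buffer_len rewards
        self.episodes_since_eval = 0

    def record_episode(self, reward):
        self.rewards.append(reward)
        self.episodes_since_eval += 1

    def update_curriculum(self):
        # Skip until enough new episodes have finished since the last evaluation.
        if self.episodes_since_eval < self.buffer_len:
            return None
        self.episodes_since_eval = 0
        successes = sum(reward > self.reward_thresh for reward in self.rewards)
        ratio = successes / len(self.rewards)
        if ratio > self.success_rate:
            self.env.increase_level()
        elif ratio < self.fail_rate:
            self.env.decrease_level()
        return ratio


env = ToyCurriculumEnv()
tracker = CurriculumTracker(env, buffer_len=4, reward_thresh=0.5,
                            success_rate=0.7, fail_rate=0.3)
for reward in [0.9, 0.8, 0.6, 0.2]:
    tracker.record_episode(reward)
ratio = tracker.update_curriculum()  # 3 of 4 episodes exceed the threshold
print(ratio, env.level)
```

With three of four rewards above the threshold, the ratio exceeds the success rate and the toy environment moves up one level; a ratio between the two rates would leave the level unchanged.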
- class conformer_rl.agents.curriculum_agents.PPOExternalCurriculumAgent(config)
Bases: conformer_rl.agents.curriculum_agents.ExternalCurriculumAgentMixin, conformer_rl.agents.PPO.PPO_agent.PPOAgent
Implementation of PPOAgent compatible with environments that use curriculum learning. See update_curriculum() for more details.
- evaluate() None
Evaluates the agent on the evaluation environment.
The information dict returned by the environment’s conformer_rl.environments.conformer_env.ConformerEnv.step() method is logged by the eval_logger and saved.
- load(filename: str) None
Loads saved weights into the neural network.
- Parameters
filename (str) – The path where the neural network weights are saved.
- run_steps() None
Trains the agent.
Trains the agent until the maximum number of steps (specified by config) is reached. Also periodically saves neural network parameters and performs evaluations on the agent, if specified in the config.
- save(filename: str) None
Saves the neural network weights to a file.
- Parameters
filename (str) – The path where the neural network weights are to be saved.
- step() None
Performs one iteration of acquiring samples on the environment and then trains on the acquired samples.
- update_curriculum() None
Evaluates the current performance of the agent and signals the environment to increase the level (difficulty) or decrease it depending on the agent’s performance.
The agent is evaluated only once the number of episodes elapsed since the last evaluation has exceeded the curriculum_agent_buffer_len parameter assigned in the Config object. During the evaluation, the agent calculates the ratio of episodes (out of the last curriculum_agent_buffer_len episodes) whose reward exceeds the curriculum_agent_reward_thresh parameter defined in the Config. If this ratio exceeds the curriculum_agent_success_rate parameter, the environment is signaled to increase the difficulty of the curriculum by calling the increase_level method of the environment. If the ratio is less than the curriculum_agent_fail_rate parameter, the environment is signaled to decrease the difficulty.
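PPOExternalCurriculumAgent lists the mixin before the base agent in its bases, so Python's method resolution order looks up methods on the mixin first. The sketch below shows one plausible way such a mixin could wrap the base agent's step(); the toy classes are hypothetical and the chaining through `super()` is an assumption, not a description of the actual conformer_rl code.

```python
class ToyBaseAgent:
    """Hypothetical stand-in for a base agent such as PPOAgent."""

    def __init__(self, config):
        self.config = config
        self.calls = []

    def step(self):
        # Collect samples and train (omitted in this toy version).
        self.calls.append("base.step")


class ToyCurriculumMixin:
    """Hypothetical mixin that adds a curriculum check after each step."""

    def step(self):
        super().step()            # resolves to the next class in the MRO
        self.update_curriculum()

    def update_curriculum(self):
        self.calls.append("mixin.update_curriculum")


# Mixin first, base agent second, mirroring the order of the bases above.
class ToyCurriculumAgent(ToyCurriculumMixin, ToyBaseAgent):
    pass


agent = ToyCurriculumAgent(config={})
agent.step()
print(agent.calls)
print([cls.__name__ for cls in ToyCurriculumAgent.__mro__])
```

Reversing the base order would make the base agent's step() win the lookup, and the mixin's version would never run.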
- class conformer_rl.agents.curriculum_agents.PPORecurrentExternalCurriculumAgent(config)
Bases: conformer_rl.agents.curriculum_agents.ExternalCurriculumAgentMixin, conformer_rl.agents.PPO.PPO_recurrent_agent.PPORecurrentAgent
Implementation of PPORecurrentAgent compatible with environments that use curriculum learning. See update_curriculum() for more details.
- evaluate() None
Evaluates the agent on the evaluation environment.
The information dict returned by the environment’s conformer_rl.environments.conformer_env.ConformerEnv.step() method is logged by the eval_logger and saved.
- load(filename: str) None
Loads saved weights into the neural network.
- Parameters
filename (str) – The path where the neural network weights are saved.
- run_steps() None
Trains the agent.
Trains the agent until the maximum number of steps (specified by config) is reached. Also periodically saves neural network parameters and performs evaluations on the agent, if specified in the config.
- save(filename: str) None
Saves the neural network weights to a file.
- Parameters
filename (str) – The path where the neural network weights are to be saved.
- step() None
Performs one iteration of acquiring samples on the environment and then trains on the acquired samples.
- update_curriculum() None
Evaluates the current performance of the agent and signals the environment to increase the level (difficulty) or decrease it depending on the agent’s performance.
The agent is evaluated only once the number of episodes elapsed since the last evaluation has exceeded the curriculum_agent_buffer_len parameter assigned in the Config object. During the evaluation, the agent calculates the ratio of episodes (out of the last curriculum_agent_buffer_len episodes) whose reward exceeds the curriculum_agent_reward_thresh parameter defined in the Config. If this ratio exceeds the curriculum_agent_success_rate parameter, the environment is signaled to increase the difficulty of the curriculum by calling the increase_level method of the environment. If the ratio is less than the curriculum_agent_fail_rate parameter, the environment is signaled to decrease the difficulty.
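The behavior described for run_steps() (train until a step limit, with optional periodic saving and evaluation) can be pictured as a simple loop. The sketch below is a toy under stated assumptions: the config attribute names `max_steps`, `save_interval`, and `eval_interval` are hypothetical, the file name pattern is invented for illustration, and each toy step() advances the counter by one rather than by a full rollout.

```python
from types import SimpleNamespace


class ToyAgent:
    """Hypothetical agent that records when it would save and evaluate."""

    def __init__(self, config):
        self.config = config
        self.total_steps = 0
        self.events = []

    def step(self):
        self.total_steps += 1  # a real agent would sample and train here

    def save(self, filename):
        self.events.append(("save", filename))

    def evaluate(self):
        self.events.append(("evaluate", self.total_steps))

    def run_steps(self):
        config = self.config
        while self.total_steps < config.max_steps:
            self.step()
            # An interval of 0 disables the corresponding periodic action.
            if config.save_interval and self.total_steps % config.save_interval == 0:
                self.save(f"model-{self.total_steps}.pt")
            if config.eval_interval and self.total_steps % config.eval_interval == 0:
                self.evaluate()


# Assumed config attribute names; a SimpleNamespace stands in for Config.
config = SimpleNamespace(max_steps=6, save_interval=3, eval_interval=2)
agent = ToyAgent(config)
agent.run_steps()
for event in agent.events:
    print(event)
```

In this toy run the agent evaluates at steps 2, 4, and 6 and saves at steps 3 and 6.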
- class conformer_rl.agents.curriculum_agents.A2CExternalCurriculumAgent(config)
Bases: conformer_rl.agents.curriculum_agents.ExternalCurriculumAgentMixin, conformer_rl.agents.A2C.A2C_agent.A2CAgent
Implementation of A2CAgent compatible with environments that use curriculum learning. See update_curriculum() for more details.
- evaluate() None
Evaluates the agent on the evaluation environment.
The information dict returned by the environment’s conformer_rl.environments.conformer_env.ConformerEnv.step() method is logged by the eval_logger and saved.
- load(filename: str) None
Loads saved weights into the neural network.
- Parameters
filename (str) – The path where the neural network weights are saved.
- run_steps() None
Trains the agent.
Trains the agent until the maximum number of steps (specified by config) is reached. Also periodically saves neural network parameters and performs evaluations on the agent, if specified in the config.
- save(filename: str) None
Saves the neural network weights to a file.
- Parameters
filename (str) – The path where the neural network weights are to be saved.
- step() None
Performs one iteration of acquiring samples on the environment and then trains on the acquired samples.
- update_curriculum() None
Evaluates the current performance of the agent and signals the environment to increase the level (difficulty) or decrease it depending on the agent’s performance.
The agent is evaluated only once the number of episodes elapsed since the last evaluation has exceeded the curriculum_agent_buffer_len parameter assigned in the Config object. During the evaluation, the agent calculates the ratio of episodes (out of the last curriculum_agent_buffer_len episodes) whose reward exceeds the curriculum_agent_reward_thresh parameter defined in the Config. If this ratio exceeds the curriculum_agent_success_rate parameter, the environment is signaled to increase the difficulty of the curriculum by calling the increase_level method of the environment. If the ratio is less than the curriculum_agent_fail_rate parameter, the environment is signaled to decrease the difficulty.
- class conformer_rl.agents.curriculum_agents.A2CRecurrentExternalCurriculumAgent(config)
Bases: conformer_rl.agents.curriculum_agents.ExternalCurriculumAgentMixin, conformer_rl.agents.A2C.A2C_recurrent_agent.A2CRecurrentAgent
Implementation of A2CRecurrentAgent compatible with environments that use curriculum learning. See update_curriculum() for more details.
- evaluate() None
Evaluates the agent on the evaluation environment.
The information dict returned by the environment’s conformer_rl.environments.conformer_env.ConformerEnv.step() method is logged by the eval_logger and saved.
- load(filename: str) None
Loads saved weights into the neural network.
- Parameters
filename (str) – The path where the neural network weights are saved.
- run_steps() None
Trains the agent.
Trains the agent until the maximum number of steps (specified by config) is reached. Also periodically saves neural network parameters and performs evaluations on the agent, if specified in the config.
- save(filename: str) None
Saves the neural network weights to a file.
- Parameters
filename (str) – The path where the neural network weights are to be saved.
- step() None
Performs one iteration of acquiring samples on the environment and then trains on the acquired samples.
- update_curriculum() None
Evaluates the current performance of the agent and signals the environment to increase the level (difficulty) or decrease it depending on the agent’s performance.
The agent is evaluated only once the number of episodes elapsed since the last evaluation has exceeded the curriculum_agent_buffer_len parameter assigned in the Config object. During the evaluation, the agent calculates the ratio of episodes (out of the last curriculum_agent_buffer_len episodes) whose reward exceeds the curriculum_agent_reward_thresh parameter defined in the Config. If this ratio exceeds the curriculum_agent_success_rate parameter, the environment is signaled to increase the difficulty of the curriculum by calling the increase_level method of the environment. If the ratio is less than the curriculum_agent_fail_rate parameter, the environment is signaled to decrease the difficulty.