Analysis
Functions for analyzing logged environment data and visualizing it in a Jupyter/IPython notebook. The visualization functions here provide a basic set of functionality to help users understand the format of the logged environment data. Users are encouraged to create their own plots and visualizations for their specific needs.
- conformer_rl.analysis.analysis._load_from_pickle(filename: str) → Any
Loads an object from a .pickle file.
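As a minimal sketch of what loading an object from a .pickle file involves, the snippet below uses only the standard library. The name load_from_pickle and the sample dict are hypothetical stand-ins for illustration, not the module's actual implementation.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Any


def load_from_pickle(filename: str) -> Any:
    """Hypothetical stand-in: unpickle a single object from a file."""
    with open(filename, "rb") as f:  # pickle files must be opened in binary mode
        return pickle.load(f)


# Round trip through a throwaway file so the example runs without logged data.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "episode.pickle"
    with open(path, "wb") as f:
        # Made-up values standing in for one logged episode.
        pickle.dump({"total_rewards": 1.5, "rewards": [0.5, 1.0]}, f)
    episode = load_from_pickle(str(path))

print(episode["total_rewards"])
```

Unpickling can execute arbitrary code, so only load .pickle files from a source you trust.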
- conformer_rl.analysis.analysis.load_data_from_pickle(paths: List[str], indices: Optional[List[str]] = None) → dict
Loads saved pickled environment data from multiple runs into a combined data dict.
- Parameters
paths (list of str) – List of paths to .pickle files corresponding to the environment data from the runs of interest.
indices (list of str, optional) – Specifies custom indices/labels to be displayed in generated Seaborn graphs for each run. Should be the same length as paths. If not specified, the labels default to test0, test1, test2, ....
- Returns
A dict in which each str key is a key from the original pickled dict objects, and each list holds that key's data from every environment data set in paths, in the same order as paths.
- Return type
dict mapping from str to list
Notes
The .pickle files specified by paths should be dumped directly by EnvLogger, and each should correspond to a single evaluation episode. See conformer_rl.logging.env_logger.EnvLogger.save_episode() for more details on the dumped format.
An example of how the function operates: suppose that our paths are:
['data1.pickle', 'data2.pickle', 'data3.pickle']
And each pickle object contains corresponding data:
data1 = {
    'total_rewards': data1_total_rewards,
    'mol': data1_molecule,
    'rewards': [data1_step1_rewards, data1_step2_rewards, data1_step3_rewards, data1_step4_rewards]
}
data2 = {
    'total_rewards': data2_total_rewards,
    'mol': data2_molecule,
    'rewards': [data2_step1_rewards, data2_step2_rewards, data2_step3_rewards, data2_step4_rewards]
}
data3 = {
    'total_rewards': data3_total_rewards,
    'mol': data3_molecule,
    'rewards': [data3_step1_rewards, data3_step2_rewards, data3_step3_rewards, data3_step4_rewards]
}
Suppose that data1 corresponds to some eval data obtained from training with the PPO agent, data2 was obtained from the PPORecurrent agent, and data3 was obtained from training with the A2C agent. Then we can input custom indices to help us understand each dataset better:
indices = ['PPO', 'PPO_recurrent', 'A2C']
Given these data and indices, load_data_from_pickle() would return the following dict:
{
    'indices': ['PPO', 'PPO_recurrent', 'A2C'],
    'total_rewards': [data1_total_rewards, data2_total_rewards, data3_total_rewards],
    'mol': [data1_molecule, data2_molecule, data3_molecule],
    'rewards': [
        [data1_step1_rewards, data1_step2_rewards, data1_step3_rewards, data1_step4_rewards],
        [data2_step1_rewards, data2_step2_rewards, data2_step3_rewards, data2_step4_rewards],
        [data3_step1_rewards, data3_step2_rewards, data3_step3_rewards, data3_step4_rewards]
    ],
}
This format consolidates all the data into a single dict and is compatible with the other visualization functions in this module. Furthermore, it is also easy to convert a dict of this format into a Pandas dataframe or other tabular formats if needed.
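The merging behavior described in the notes can be sketched in plain Python. The helpers combine_episodes and to_rows and the numeric values are hypothetical, chosen only for illustration; the sketch assumes every episode dict has the same keys and is not the library's implementation.

```python
from typing import Any, Dict, List, Optional


def combine_episodes(
    episodes: List[Dict[str, Any]],
    indices: Optional[List[str]] = None,
) -> Dict[str, list]:
    """Hypothetical helper: merge per-episode dicts into one dict of lists."""
    if indices is None:
        # Default labels follow the documented pattern test0, test1, ...
        indices = [f"test{i}" for i in range(len(episodes))]
    if len(indices) != len(episodes):
        raise ValueError("indices must have one label per episode")

    combined: Dict[str, list] = {"indices": list(indices)}
    for episode in episodes:
        for key, value in episode.items():
            # Assumes all episodes share the same keys, so lists stay aligned.
            combined.setdefault(key, []).append(value)
    return combined


def to_rows(data: Dict[str, list]) -> List[Dict[str, Any]]:
    """Turn the dict of lists into one record (row) per episode."""
    keys = list(data)
    return [dict(zip(keys, values)) for values in zip(*(data[k] for k in keys))]


# Made-up stand-ins for three unpickled episode dicts.
episodes = [
    {"total_rewards": 3.0, "rewards": [1.0, 2.0]},
    {"total_rewards": 5.0, "rewards": [2.0, 3.0]},
    {"total_rewards": 4.0, "rewards": [1.5, 2.5]},
]
data = combine_episodes(episodes, indices=["PPO", "PPO_recurrent", "A2C"])
rows = to_rows(data)
print(data["total_rewards"])
print(rows[0])
```

A list of per-episode records like rows is one of the inputs that pandas.DataFrame accepts, which is one way to get a tabular view.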
- conformer_rl.analysis.analysis.list_keys(data: dict) → List[str]
Return a list of all keys in a dict.
- Parameters
data (dict) – The dictionary to retrieve keys from.
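For orientation, a hypothetical equivalent of this function is shown below. The name list_keys_sketch and the sample dict are assumptions for illustration, not the library's code.

```python
from typing import List


def list_keys_sketch(data: dict) -> List[str]:
    """Hypothetical equivalent of list_keys: the dict's keys as a list."""
    return list(data.keys())


# Made-up data dict in the combined dict-of-lists format.
data = {"indices": ["test0", "test1"], "total_rewards": [3.0, 5.0]}
keys = list_keys_sketch(data)
print(keys)  # ['indices', 'total_rewards']
```

Listing the keys first is a quick way to see which values are available to pass as key to the plotting functions.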
- conformer_rl.analysis.analysis.bar_plot_episodic(key: str, data: dict) → matplotlib.axes._axes.Axes
Plots a bar plot comparing a scalar value across all episodes loaded in data.
- Parameters
key (str) – The key for the values to be compared across all data sets/episodes.
data (dict) – Data dictionary generated by load_data_from_pickle().
- conformer_rl.analysis.analysis.histogram_select_episodes(key: str, data: dict, episodes: Optional[List[int]] = None, binwidth: float = 10, figsize: Tuple[float, float] = (8.0, 6.0)) → matplotlib.axes._axes.Axes
Plots a single histogram in which the data for each episode in episodes is overlaid.
- Parameters
key (str) – The key for the values to be compared across all data sets/episodes.
data (dict) – Data dictionary generated by load_data_from_pickle().
episodes (list of int, optional) – Specifies the indices in data for the episodes to be shown. If not specified, all episodes are shown.
binwidth (float) – The width of each bin in the histogram.
figsize (2-tuple of float) – Specifies the size of the plot.
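To illustrate what a fixed binwidth means, here is a minimal sketch of fixed-width binning in plain Python. It assumes bins start at the smallest value; the plotting library may place bin edges differently. The function name and sample values are hypothetical.

```python
import math
from typing import Dict, List


def count_fixed_width_bins(values: List[float], binwidth: float) -> Dict[float, int]:
    """Count values per bin; each bin is [start, start + binwidth), aligned to min(values)."""
    if binwidth <= 0:
        raise ValueError("binwidth must be positive")
    low = min(values)
    counts: Dict[float, int] = {}
    for v in values:
        # Left edge of the bin that contains v.
        start = low + math.floor((v - low) / binwidth) * binwidth
        counts[start] = counts.get(start, 0) + 1
    return dict(sorted(counts.items()))


# Made-up per-step values for one episode.
sample = [2.0, 5.0, 11.0, 14.0, 27.0]
bins = count_fixed_width_bins(sample, binwidth=10)
print(bins)  # {2.0: 3, 12.0: 1, 22.0: 1}
```

A smaller binwidth produces more, narrower bars, which can make overlaid episodes easier or harder to compare depending on the spread of the data.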
- conformer_rl.analysis.analysis.histogram_episodic(key: str, data: dict, binwidth: int = 10, figsize: Tuple[float, float] = (8.0, 6.0)) → matplotlib.axes._axes.Axes
Plots a histogram on separate axes for each of the episode data sets in data.
- Parameters
key (str) – The key for the values to be compared across all data sets/episodes.
data (dict) – Data dictionary generated by load_data_from_pickle().
binwidth (float) – The width of each bin in the histogram.
figsize (2-tuple of float) – Specifies the size of the plot.
- conformer_rl.analysis.analysis.heatmap_episodic(key: str, data: dict, figsize: Tuple[float, float] = (8.0, 6.0)) → matplotlib.axes._axes.Axes
Plots heatmap(s) for matrix data corresponding to key across all episodes loaded in data.
- Parameters
key (str) – The key for the values to be compared across all data sets/episodes.
data (dict) – Data dictionary generated by load_data_from_pickle().
figsize (2-tuple of float) – Specifies the size of the plot.
- conformer_rl.analysis.analysis.calculate_tfd(data: str) → None
Updates data with the TFD (Torsion Fingerprint Deviation) matrix (with key ‘tfd_matrix’) and sum of the TFD matrix (with key ‘tfd_total’) for the molecule conformers across each episode loaded in data.
- Parameters
data (dict) – Data dictionary generated by load_data_from_pickle().
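The relationship between a pairwise TFD matrix and its sum can be sketched without RDKit. The snippet assumes the pairwise deviations arrive as a flattened lower triangle (the layout RDKit's GetTFDMatrix is understood to return) and sums the full symmetric matrix. Whether the library sums the full matrix or a single triangle is not stated here, and the helper name and values are hypothetical.

```python
from typing import List


def square_from_lower_triangle(tri: List[float], n: int) -> List[List[float]]:
    """Expand a flattened lower triangle (no diagonal) into a symmetric n x n matrix."""
    if len(tri) != n * (n - 1) // 2:
        raise ValueError("expected n*(n-1)/2 pairwise values")
    matrix = [[0.0] * n for _ in range(n)]
    k = 0
    for i in range(1, n):
        for j in range(i):
            # Deviation between conformers i and j is symmetric.
            matrix[i][j] = matrix[j][i] = tri[k]
            k += 1
    return matrix


# Made-up pairwise deviations for 3 conformers, ordered (1,0), (2,0), (2,1).
pairwise = [0.2, 0.5, 0.1]
tfd_matrix = square_from_lower_triangle(pairwise, n=3)
# Assumption: the total is the sum over the full symmetric matrix.
tfd_total = sum(sum(row) for row in tfd_matrix)
print(tfd_matrix)
print(tfd_total)
```

A larger total indicates that the conformers in an episode differ more from one another in their torsion angles, which is why the sum is a convenient scalar for comparing episodes.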
- conformer_rl.analysis.analysis.drawConformer(mol: rdkit.Chem.Mol, confId: int = -1, size: Tuple[int, int] = (300, 300), style: str = 'stick') → py3Dmol.view
Displays an interactive 3-dimensional representation of the specified conformer.
- Parameters
mol (RDKit Mol object) – The molecule containing the conformer to be displayed.
confId (int) – The ID of the conformer to be displayed.
size (Tuple[int, int]) – The size of the display (width, height).
style (str) – The drawing style for displaying the molecule. One of sphere, stick, line, cross, cartoon, or surface.
- conformer_rl.analysis.analysis.drawConformer_episodic(data: dict, confIds: List[int], size: Tuple[int, int] = (300, 300), style: str = 'stick') → py3Dmol.view
Displays a specified conformer for each episode loaded in data.
- Parameters
data (dict from string to list) – Contains the loaded episode information. ‘mol’ must be a key in data and the corresponding list must contain RDKit Mol objects.
confIds (list of int) – The indices of the conformers to be displayed, one for each episode loaded in data.
size (Tuple[int, int]) – The size of the display for each individual molecule (width, height).
style (str) – The drawing style for displaying the molecule. One of sphere, stick, line, cross, cartoon, or surface.