How to use the Logger
Here we explain in detail how to use the MushroomRL Logger class. This class can be used as a standardized console logger, and it can also save NumPy arrays or a MushroomRL agent to disk, using the appropriate logging folder.
Constructing the Logger
To initialize the logger we can simply choose a log directory and an experiment name:
from mushroom_rl.core import Logger
# Create a logger object, creating a log folder
logger = Logger('tutorial', results_dir='/tmp/logs',
                log_console=True)
This will create the experiment folder named ‘tutorial’ inside the base folder ‘/tmp/logs’.
The logger creates all the necessary directories if they do not exist.
If results_dir is not specified, the logger will create a ‘./logs’ base directory.
By setting log_console to True, the logger will store the console output in a ‘.log’ text file inside the experiment folder, named after the experiment.
If the file already exists, the logger will append the new logged lines.
If you do not want the logger to create any directory, e.g., because you only need the console output, you can set the results_dir parameter to None:
# Create a logger object, without creating the log folder
logger_no_folder = Logger('tutorial_no_folder', results_dir=None)
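The folder creation and file-appending behaviour described above can be illustrated with the standard library alone. The following is a minimal sketch, not MushroomRL's actual implementation: the helper make_file_logger, the temporary base folder, and the message format are assumptions made for this example.

```python
import logging
import tempfile
from pathlib import Path


def make_file_logger(name, log_path):
    """Return a stdlib logger (and its handler) that appends to log_path."""
    file_logger = logging.getLogger(name)
    file_logger.setLevel(logging.DEBUG)
    file_logger.propagate = False
    # mode='a' keeps the existing content, so repeated runs extend the file
    handler = logging.FileHandler(log_path, mode='a')
    handler.setFormatter(logging.Formatter('%(levelname)s: %(message)s'))
    file_logger.addHandler(handler)
    return file_logger, handler


# Hypothetical layout: <base folder>/<experiment>/<experiment>.log
base_dir = Path(tempfile.mkdtemp())
exp_dir = base_dir / 'tutorial'
exp_dir.mkdir(parents=True, exist_ok=True)  # create missing folders
log_file = exp_dir / 'tutorial.log'

# Simulate two separate runs writing to the same log file
for run in range(2):
    run_logger, run_handler = make_file_logger(f'sketch_run_{run}', log_file)
    run_logger.info('message from run %d', run)
    run_logger.removeHandler(run_handler)
    run_handler.close()

log_lines = log_file.read_text().splitlines()
print(log_lines)
```

Because the handler opens the file in append mode, the second run adds its line after the first one instead of replacing it.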
Logging messages on the console
The most basic functionality of the Logger is to print text messages to the standard output. Our logger uses the standard Python logger, and it offers a similar set of methods:
# Write a line of hashtags, to be used as a separator
logger.strong_line()
# Print a debug message
logger.debug('This is a debug message')
# Print an info message
logger.info('This is an info message')
# Print a warning
logger.warning('This is a warning message')
# Print an error
logger.error('This is an error message')
# Print a critical error message
logger.critical('This is a critical error')
# Print a line of dashes, to be used as a (weak) separator
logger.weak_line()
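Since the methods above mirror the standard Python logging levels, the following standard-library sketch shows how those levels are ordered and filtered. It does not use MushroomRL; the INFO threshold and the logger name are assumptions chosen for the example, and the Logger class may be configured differently.

```python
import io
import logging

# Collect the output in memory so it can be inspected afterwards
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter('%(levelname)s %(message)s'))

std_logger = logging.getLogger('levels_sketch')
std_logger.propagate = False
std_logger.addHandler(handler)
std_logger.setLevel(logging.INFO)  # assumed threshold for this example

std_logger.debug('filtered out: DEBUG is below INFO')
std_logger.info('This is an info message')
std_logger.warning('This is a warning message')
std_logger.error('This is an error message')
std_logger.critical('This is a critical error')

# First token of every emitted line is the level name
emitted_levels = [line.split()[0] for line in stream.getvalue().splitlines()]
print(emitted_levels)
```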
We can also log exceptions to the terminal. Using this method instead of a raw print lets you handle the exception output correctly without breaking any tqdm progress bar (see below), and the exception text will also be saved in the console log file (if console logging is active).
# Exception logging
try:
    raise RuntimeError('A runtime exception occurred')
except RuntimeError as e:
    logger.error("Exception caught, here's the stack trace:")
    logger.exception(e)
logger.weak_line()
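To see what exception logging produces, the sketch below uses the exception method of the standard Python logger, which logs the message at ERROR level and appends the current traceback. It is only an illustration with the standard library; the logger name and the in-memory stream are assumptions, and the exact output format of the MushroomRL Logger may differ.

```python
import io
import logging

# In-memory stream standing in for the console or the log file
stream = io.StringIO()
exc_logger = logging.getLogger('exception_sketch')
exc_logger.propagate = False
exc_logger.setLevel(logging.DEBUG)
exc_logger.addHandler(logging.StreamHandler(stream))

try:
    raise RuntimeError('A runtime exception occurred')
except RuntimeError as e:
    # Must be called from an except block so the traceback is available
    exc_logger.exception(e)

captured = stream.getvalue()
print(captured)
```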
Logging a Reinforcement Learning experiment
Our Logger includes some functionalities to log RL experiment data easily. To demonstrate this, we will set up a simple RL experiment, using Q-Learning in the simple chain environment.
# Logging learning process
from mushroom_rl.core import Core
from mushroom_rl.environments.generators import generate_simple_chain
from mushroom_rl.policy import EpsGreedy
from mushroom_rl.algorithms.value import QLearning
from mushroom_rl.rl_utils.parameters import Parameter
from tqdm import trange
from time import sleep
import numpy as np
# Setup simple learning environment
mdp = generate_simple_chain(state_n=5, goal_states=[2], prob=.8, rew=1, gamma=.9)
epsilon = Parameter(value=.15)
pi = EpsGreedy(epsilon=epsilon)
agent = QLearning(mdp.info, pi, learning_rate=Parameter(value=.2))
core = Core(agent, mdp)
epochs = 10
We skip the details of this RL experiment, as they are not relevant to the current tutorial. You can have a deeper look at RL experiments with MushroomRL in other tutorials.
Note that we use a tqdm progress bar: our logger is integrated with this package and can print log messages while the bar is running, without disrupting the progress bar or the terminal.
We first print the agent's performance before learning, using the epoch_info method:
logger.info('Experiment started')
logger.strong_line()
dataset = core.evaluate(n_steps=100)
J = np.mean(dataset.discounted_return)
R = np.mean(dataset.undiscounted_return) # Undiscounted returns
logger.epoch_info(0, J=J, R=R, any_label='any value')
Notice that this method can print any label passed as a keyword argument, so it is not restricted to J, R, or other predefined metrics.
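The way a method can accept arbitrary labels is easy to reproduce with Python keyword arguments. The function below is a hypothetical stand-in written for this example: its name and its output format are assumptions and do not reproduce the exact line printed by epoch_info.

```python
def format_epoch_info(epoch, **metrics):
    """Build one console line from an epoch index and arbitrary metrics."""
    parts = [f'Epoch {epoch}']
    # Keyword arguments keep their call order, so labels appear as passed
    for label, value in metrics.items():
        parts.append(f'{label}: {value}')
    return ' | '.join(parts)


line = format_epoch_info(0, J=0.5, R=2.0, any_label='any value')
print(line)
```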
We now consider the learning loop:
for i in trange(epochs):
    # Here some learning
    core.learn(n_steps=100, n_steps_per_fit=1)
    sleep(0.5)
    dataset = core.evaluate(n_steps=100)
    sleep(0.5)
    J = np.mean(dataset.discounted_return)  # Discounted returns
    R = np.mean(dataset.undiscounted_return)  # Undiscounted returns
    # Here logging epoch results to the console
    logger.epoch_info(i+1, J=J, R=R)
    # Logging the data in J.npy and R.npy
    logger.log_numpy(J=J, R=R)
    # Logging the best agent according to the best J
    logger.log_best_agent(agent, J)
Here we use the epoch_info method to log the data in the console output, and the log_numpy and log_best_agent methods to store the learning progress on disk.
The log_numpy method accepts an arbitrary value (a primitive or a NumPy array) and appends it to a single NumPy array (or matrix) stored on disk. Again, arbitrary keywords can be used, and each keyword determines the file the data is saved to.
If the seed parameter of the Logger constructor is specified, the filename will include a postfix with the seed. This is useful when multiple runs of the same experiment are executed.
The log_best_agent method saves the current agent into the ‘agent-best.msh’ file. However, the current agent will be stored on disk only if it improves on the previously logged one.
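The "save only on improvement" rule can be sketched in a few lines. This is an illustration under stated assumptions, not the library's code: the BestCheckpoint class is hypothetical, pickle stands in for MushroomRL's own agent serialization, and a strict "greater than" comparison is assumed.

```python
import pickle
import tempfile
from pathlib import Path


class BestCheckpoint:
    """Hypothetical helper: keep on disk only the best-scoring object."""

    def __init__(self, path):
        self.path = Path(path)
        self.best_score = float('-inf')

    def update(self, obj, score):
        """Save obj if score beats the best so far; return True if saved."""
        if score <= self.best_score:  # assumed: ties do not overwrite
            return False
        self.best_score = score
        with self.path.open('wb') as f:
            pickle.dump(obj, f)
        return True


checkpoint = BestCheckpoint(Path(tempfile.mkdtemp()) / 'best.pkl')
scores = [0.2, 0.5, 0.4, 0.7]
# A dictionary stands in for the agent at each epoch
saved = [checkpoint.update({'epoch': i}, J) for i, J in enumerate(scores)]

with checkpoint.path.open('rb') as f:
    best = pickle.load(f)
print(saved, best)
```

In this run the third candidate is skipped because its score is lower than the best one seen so far, so the file on disk always holds the best object.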
We conclude the learning experiment by logging the final agent and the last dataset:
logger.log_agent(agent)
# Log the last dataset
logger.log_dataset(dataset)
logger.info('Experiment terminated')
Advanced Logger topics
The logger can also be used to continue learning from a previous run without overwriting the stored result values. This can be done by setting the append flag in the logger’s constructor.
del logger # Delete previous logger
new_logger = Logger('tutorial', results_dir='/tmp/logs',
                    log_console=True, append=True)
# Append infinity at the end of J.npy
new_logger.log_numpy(J=np.inf)
new_logger.info('Tutorial ends here')
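The difference between appending to stored results and overwriting them can be shown with a small standard-library sketch. The ValueLog class and its JSON file are assumptions made for this example; MushroomRL stores the values as NumPy arrays and its internal logic may differ.

```python
import json
import tempfile
from pathlib import Path


class ValueLog:
    """Hypothetical store that persists a growing list of values."""

    def __init__(self, path, append=False):
        self.path = Path(path)
        if append and self.path.exists():
            # Resume from the values written by a previous run
            self.values = json.loads(self.path.read_text())
        else:
            # Start from scratch: the next write replaces any old file
            self.values = []

    def log(self, value):
        self.values.append(value)
        self.path.write_text(json.dumps(self.values))


store_path = Path(tempfile.mkdtemp()) / 'J.json'
first_run = ValueLog(store_path)
first_run.log(0.1)
first_run.log(0.3)

resumed = ValueLog(store_path, append=True)
resumed.log(0.9)
after_append = json.loads(store_path.read_text())

fresh = ValueLog(store_path, append=False)
fresh.log(1.0)
after_overwrite = json.loads(store_path.read_text())
print(after_append, after_overwrite)
```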
Finally, the logger can also be used to enable additional text output from some algorithms.
This can be done by calling the agent’s set_logger method:
agent.set_logger(new_logger)
Currently, only the PPO and TRPO algorithms provide additional output, describing some learning metrics after every fit.