How to use the Logger

Here we explain in detail how to use the MushroomRL Logger class. This class can be used as a standardized console logger, and it can also save NumPy arrays or a MushroomRL agent to disk in the appropriate logging folder.

Constructing the Logger

To initialize the logger we can simply choose a log directory and an experiment name:

from mushroom_rl.core import Logger

# Create a logger object, creating a log folder
logger = Logger('tutorial', results_dir='/tmp/logs',
                log_console=True)

This creates the experiment folder named ‘tutorial’ inside the base folder ‘/tmp/logs’. The logger creates all the necessary directories if they do not exist. If results_dir is not specified, the logger creates a ‘./logs’ base directory. Setting log_console to True makes the logger store the console output in a ‘.log’ text file with the same name, inside the experiment folder. If the file already exists, the logger appends the new lines to it.

If you do not want the logger to create any directory, e.g., to use it only for console output, you can set the results_dir parameter to None:

# Create a logger object, without creating the log folder
logger_no_folder = Logger('tutorial_no_folder', results_dir=None)
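
The folder and console-log behaviour described above can be illustrated with the standard library alone. The following is a minimal sketch, not MushroomRL's implementation: the helper make_file_logger, the experiment name 'sketch', and the message format are all assumptions made for illustration. It creates the experiment folder if needed and opens the ‘.log’ file in append mode, so a second run adds lines instead of overwriting the first.

```python
import logging
import tempfile
from pathlib import Path


def make_file_logger(log_name, results_dir):
    """Hypothetical helper, not part of MushroomRL."""
    # Create <results_dir>/<log_name>, including missing parent folders
    exp_dir = Path(results_dir) / log_name
    exp_dir.mkdir(parents=True, exist_ok=True)

    log_file = exp_dir / (log_name + '.log')
    # mode='a' appends to an existing file instead of truncating it
    handler = logging.FileHandler(log_file, mode='a')
    handler.setFormatter(logging.Formatter('%(levelname)s: %(message)s'))

    py_logger = logging.getLogger(log_name)
    py_logger.setLevel(logging.DEBUG)
    py_logger.propagate = False
    py_logger.addHandler(handler)
    return py_logger, handler, log_file


# Temporary base directory standing in for '/tmp/logs'
base_dir = tempfile.mkdtemp()

# Simulate two separate runs of the same experiment
for run in range(2):
    py_logger, handler, log_file = make_file_logger('sketch', base_dir)
    py_logger.info('run %d', run)
    py_logger.removeHandler(handler)
    handler.close()

# Both runs are present in the same file
lines = log_file.read_text().splitlines()
print(lines)
```

Opening the file in append mode is what keeps the history of earlier runs; a truncating mode ('w') would keep only the last one.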

Logging messages on the console

The most basic functionality of the Logger is to output text messages on the standard output. Our logger is built on the standard Python logger and offers a similar set of methods:

# Write a line of hashtags, to be used as a separator
logger.strong_line()

# Print a debug message
logger.debug('This is a debug message')

# Print an info message
logger.info('This is an info message')

# Print a warning
logger.warning('This is a warning message')

# Print an error
logger.error('This is an error message')

# Print a critical error message
logger.critical('This is a critical error')

# Print a line of dashes, to be used as a (weak) separator
logger.weak_line()

We can also log exceptions to the terminal. Using this method instead of a raw print handles the exception output correctly without breaking any tqdm progress bar (see below), and the exception text is saved in the console log file (if console logging is active).

# Exception logging
try:
    raise RuntimeError('A runtime exception occurred')
except RuntimeError as e:
    logger.error('Exception caught, here\'s the stack trace:')
    logger.exception(e)

logger.weak_line()
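
To see what exception logging adds over a plain message, the sketch below uses only Python's standard logging module. It assumes that the Logger's exception method behaves like logging.Logger.exception, which this tutorial does not state explicitly. The in-memory stream and the logger name 'exception_sketch' are illustrative choices.

```python
import io
import logging

# In-memory stream standing in for the console or the '.log' file
stream = io.StringIO()
handler = logging.StreamHandler(stream)

py_logger = logging.getLogger('exception_sketch')
py_logger.setLevel(logging.DEBUG)
py_logger.propagate = False
py_logger.addHandler(handler)

try:
    raise RuntimeError('A runtime exception occurred')
except RuntimeError as e:
    # Called inside the except block, so the active traceback is appended
    py_logger.exception(e)

# The captured text holds the message followed by the stack trace
output = stream.getvalue()
print(output)
```

Because the traceback goes through the logging handlers rather than directly to stderr, every handler attached to the logger (console, file) receives the same text.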

Logging a Reinforcement Learning experiment

Our Logger includes some functionalities to log RL experiment data easily. To demonstrate this, we will set up a simple RL experiment, using Q-Learning in the simple chain environment.

# Logging learning process
from mushroom_rl.core import Core
from mushroom_rl.environments.generators import generate_simple_chain
from mushroom_rl.policy import EpsGreedy
from mushroom_rl.algorithms.value import QLearning
from mushroom_rl.rl_utils.parameters import Parameter
from tqdm import trange
from time import sleep
import numpy as np


# Setup simple learning environment
mdp = generate_simple_chain(state_n=5, goal_states=[2], prob=.8, rew=1, gamma=.9)
epsilon = Parameter(value=.15)
pi = EpsGreedy(epsilon=epsilon)
agent = QLearning(mdp.info, pi, learning_rate=Parameter(value=.2))
core = Core(agent, mdp)
epochs = 10

We skip the details of this RL experiment, as they are not relevant to the current tutorial. You can have a deeper look at RL experiments with MushroomRL in other tutorials.

Note that we use a tqdm progress bar: our logger is integrated with this package and can print log messages while the bar is running, without disrupting the bar or the terminal.

We first print the performance before learning, using the epoch_info method:

logger.info('Experiment started')
logger.strong_line()

dataset = core.evaluate(n_steps=100)
J = np.mean(dataset.discounted_return)  # Discounted returns
R = np.mean(dataset.undiscounted_return)  # Undiscounted returns

logger.epoch_info(0, J=J, R=R, any_label='any value')

Notice that this method can print any possible label passed as a function parameter, so it’s not restricted to J, R, or other predefined metrics.
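
The idea of accepting arbitrary labels can be sketched with Python keyword arguments. The function format_epoch_info below is hypothetical and the layout of the line it produces is an assumption; the actual epoch_info output may be formatted differently.

```python
def format_epoch_info(epoch, **kwargs):
    """Hypothetical formatter; the real output layout may differ."""
    message = 'Epoch {} |'.format(epoch)
    # Every keyword becomes a 'label: value' pair, in call order
    for name, value in kwargs.items():
        message += ' {}: {}'.format(name, value)
    return message


line = format_epoch_info(0, J=0.5, R=1.0, any_label='any value')
print(line)  # Epoch 0 | J: 0.5 R: 1.0 any_label: any value
```

Collecting the metrics through **kwargs means that no metric name has to be known in advance by the formatting code.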

We now consider the learning loop:

for i in trange(epochs):
    # Here some learning
    core.learn(n_steps=100, n_steps_per_fit=1)
    sleep(0.5)
    dataset = core.evaluate(n_steps=100)
    sleep(0.5)
    J = np.mean(dataset.discounted_return)  # Discounted returns
    R = np.mean(dataset.undiscounted_return)  # Undiscounted returns

    # Here logging epoch results to the console
    logger.epoch_info(i+1, J=J, R=R)

    # Logging the data in J.npy and R.npy
    logger.log_numpy(J=J, R=R)

    # Logging the best agent according to the best J
    logger.log_best_agent(agent, J)

Here we make use of both the epoch_info method to log the data in the console output and the methods log_numpy and log_best_agent to log the learning progress.

The log_numpy method can take an arbitrary value (a primitive or a NumPy array) and log it into a single NumPy array (or matrix). Again, arbitrary keywords can be used to save data under different filenames. If the seed parameter of the Logger constructor is specified, the filename includes a suffix with the seed. This is useful when multiple runs of the same experiment are executed.

The log_best_agent method saves the current agent into the ‘agent-best.msh’ file. However, the current agent is stored on disk only if it improves on the previously logged one.
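
The "save only on improvement" rule can be sketched as follows. This is not the library's code: the class BestAgentSaver is hypothetical, pickle stands in for the ‘.msh’ serialization, a dictionary stands in for the agent, and treating ties as "no improvement" is an assumption.

```python
import pickle
import tempfile
from pathlib import Path


class BestAgentSaver:
    """Hypothetical stand-in for the 'save only if better' logic."""

    def __init__(self, path):
        self._path = Path(path)
        self._best_J = float('-inf')

    def log_best(self, agent, J):
        # Skip the write unless the new score improves on the best so far
        if J <= self._best_J:
            return False
        self._best_J = J
        with open(self._path, 'wb') as f:
            pickle.dump(agent, f)
        return True


path = Path(tempfile.mkdtemp()) / 'agent-best.pkl'
saver = BestAgentSaver(path)

# A dict stands in for the agent; scores arrive epoch by epoch
scores = [0.2, 0.5, 0.4]
results = [saver.log_best({'epoch': i}, J) for i, J in enumerate(scores)]

with open(path, 'rb') as f:
    stored = pickle.load(f)

print(results)  # [True, True, False]
print(stored)   # {'epoch': 1}
```

The file on disk always holds the agent from the best epoch seen so far, here epoch 1, even though a later epoch was logged afterwards.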

We conclude the learning experiment by logging the final agent and the last dataset:

# Log the final agent
logger.log_agent(agent)

# Log the last dataset
logger.log_dataset(dataset)

logger.info('Experiment terminated')

Advanced Logger topics

The logger can also be used to continue learning from a previous run without overwriting the stored results. This is done by setting the append flag in the logger’s constructor.

del logger  # Delete previous logger
new_logger = Logger('tutorial', results_dir='/tmp/logs',
                    log_console=True, append=True)

# Append infinity to the end of J.npy
new_logger.log_numpy(J=np.inf)
new_logger.info('Tutorial ends here')

Finally, the logger can activate additional text output from some algorithms. This is done by calling the agent’s set_logger method:

agent.set_logger(new_logger)

Currently, only the PPO and TRPO algorithms provide additional output, reporting some learning metrics after every fit.