Approximators

MushroomRL exposes the high-level class Regressor that can manage any type of function regressor. This class is a wrapper for any kind of function approximator, e.g. a scikit-learn approximator or a pytorch neural network.

Regressor

class mushroom_rl.approximators.regressor.Regressor(approximator, input_shape, output_shape=(1, ), n_actions=None, n_models=1, **params)[source]

Bases: mushroom_rl.core.serialization.Serializable

This class implements the function to manage a function approximator. This class selects the appropriate kind of regressor to implement according to the parameters provided by the user; this makes this class the only one to use for each kind of task that has to be performed. The inference of the implementation to choose is done checking the provided values of parameters n_actions. If n_actions is provided, it means that the user wants to implement an approximator of the Q-function: if the value of n_actions is equal to the output_shape then a QRegressor is created, else (output_shape should be (1,)) an ActionRegressor is created. Otherwise a GenericRegressor is created. An Ensemble model can be used for all the previous implementations listed before simply providing a n_models parameter greater than 1.

__init__(approximator, input_shape, output_shape=(1, ), n_actions=None, n_models=1, **params)[source]

Constructor.

Parameters:
  • approximator (class) – the approximator class to use to create the model;
  • input_shape (tuple) – the shape of the input of the model;
  • output_shape (tuple, (1,)) – the shape of the output of the model;
  • n_actions (int, None) – number of actions considered to create a QRegressor or an ActionRegressor;
  • n_models (int, 1) – number of models to create;
  • **params – other parameters to create each model.
__call__(*z, **predict_params)[source]

Call self as a function.

fit(*z, **fit_params)[source]

Fit the model.

Parameters:
  • *z – list of input of the model;
  • **fit_params – parameters to use to fit the model.
predict(*z, **predict_params)[source]

Predict the output of the model given an input.

Parameters:
  • *z – list of input of the model;
  • **predict_params – parameters to use to predict with the model.
Returns:

The model prediction.

model

The model object.

Type:Returns
reset()[source]

Reset the model parameters.

input_shape

The shape of the input of the model.

Type:Returns
output_shape

The shape of the output of the model.

Type:Returns
weights_size

The shape of the weights of the model.

Type:Returns
get_weights()[source]
Returns:The weights of the model.
set_weights(w)[source]
Parameters:w (list) – list of weights to be set in the model.
diff(*z)[source]
Parameters:*z – the input of the model.
Returns:The derivative of the model.
_add_save_attr(**attr_dict)

Add attributes that should be saved for an agent.

Parameters:**attr_dict – dictionary of attributes mapped to the method that should be used to save and load them. If a “!” character is added at the end of the method, the field will be saved only if full_save is set to True.
_post_load()

This method can be overwritten to implement logic that is executed after the loading of the agent.

copy()
Returns:A deepcopy of the agent.
classmethod load(path)

Load and deserialize the agent from the given location on disk.

Parameters:path (Path, string) – Relative or absolute path to the agents save location.
Returns:The loaded agent.
save(path, full_save=False)

Serialize and save the object to the given path on disk.

Parameters:
  • path (Path, str) – Relative or absolute path to the object save location;
  • full_save (bool) – Flag to specify the amount of data to save for MushroomRL data structures.
save_zip(zip_file, full_save, folder='')

Serialize and save the agent to the given path on disk.

Parameters:
  • zip_file (ZipFile) – ZipFile where te object needs to be saved;
  • full_save (bool) – flag to specify the amount of data to save for MushroomRL data structures;
  • folder (string, '') – subfolder to be used by the save method.

Approximator

Linear

class mushroom_rl.approximators.parametric.linear.LinearApproximator(weights=None, input_shape=None, output_shape=(1, ), **kwargs)[source]

Bases: mushroom_rl.core.serialization.Serializable

This class implements a linear approximator.

__init__(weights=None, input_shape=None, output_shape=(1, ), **kwargs)[source]

Constructor.

Parameters:
  • weights (np.ndarray) – array of weights to initialize the weights of the approximator;
  • input_shape (np.ndarray, None) – the shape of the input of the model;
  • output_shape (np.ndarray, (1,)) – the shape of the output of the model;
  • **kwargs – other params of the approximator.
fit(x, y, **fit_params)[source]

Fit the model.

Parameters:
  • x (np.ndarray) – input;
  • y (np.ndarray) – target;
  • **fit_params – other parameters used by the fit method of the regressor.
predict(x, **predict_params)[source]

Predict.

Parameters:
  • x (np.ndarray) – input;
  • **predict_params – other parameters used by the predict method the regressor.
Returns:

The predictions of the model.

weights_size

The size of the array of weights.

Type:Returns
get_weights()[source]

Getter.

Returns:The set of weights of the approximator.
set_weights(w)[source]

Setter.

Parameters:w (np.ndarray) – the set of weights to set.
_add_save_attr(**attr_dict)

Add attributes that should be saved for an agent.

Parameters:**attr_dict – dictionary of attributes mapped to the method that should be used to save and load them. If a “!” character is added at the end of the method, the field will be saved only if full_save is set to True.
_post_load()

This method can be overwritten to implement logic that is executed after the loading of the agent.

copy()
Returns:A deepcopy of the agent.
diff(state, action=None)[source]

Compute the derivative of the output w.r.t. state, and action if provided.

Parameters:
  • state (np.ndarray) – the state;
  • action (np.ndarray, None) – the action.
Returns:

The derivative of the output w.r.t. state, and action if provided.

classmethod load(path)

Load and deserialize the agent from the given location on disk.

Parameters:path (Path, string) – Relative or absolute path to the agents save location.
Returns:The loaded agent.
save(path, full_save=False)

Serialize and save the object to the given path on disk.

Parameters:
  • path (Path, str) – Relative or absolute path to the object save location;
  • full_save (bool) – Flag to specify the amount of data to save for MushroomRL data structures.
save_zip(zip_file, full_save, folder='')

Serialize and save the agent to the given path on disk.

Parameters:
  • zip_file (ZipFile) – ZipFile where te object needs to be saved;
  • full_save (bool) – flag to specify the amount of data to save for MushroomRL data structures;
  • folder (string, '') – subfolder to be used by the save method.

Torch Approximator

class mushroom_rl.approximators.parametric.torch_approximator.TorchApproximator(input_shape, output_shape, network, optimizer=None, loss=None, batch_size=0, n_fit_targets=1, use_cuda=False, reinitialize=False, dropout=False, quiet=True, **params)[source]

Bases: mushroom_rl.core.serialization.Serializable

Class to interface a pytorch model to the mushroom Regressor interface. This class implements all is needed to use a generic pytorch model and train it using a specified optimizer and objective function. This class supports also minibatches.

__init__(input_shape, output_shape, network, optimizer=None, loss=None, batch_size=0, n_fit_targets=1, use_cuda=False, reinitialize=False, dropout=False, quiet=True, **params)[source]

Constructor.

Parameters:
  • input_shape (tuple) – shape of the input of the network;
  • output_shape (tuple) – shape of the output of the network;
  • network (torch.nn.Module) – the network class to use;
  • optimizer (dict) – the optimizer used for every fit step;
  • loss (torch.nn.functional) – the loss function to optimize in the fit method;
  • batch_size (int, 0) – the size of each minibatch. If 0, the whole dataset is fed to the optimizer at each epoch;
  • n_fit_targets (int, 1) – the number of fit targets used by the fit method of the network;
  • use_cuda (bool, False) – if True, runs the network on the GPU;
  • reinitialize (bool, False) – if True, the approximator is re
  • at every fit call. To perform the initialization, the (initialized) –
  • method must be defined properly for the selected (weights_init) –
  • network. (model) –
  • dropout (bool, False) – if True, dropout is applied only during train;
  • quiet (bool, True) – if False, shows two progress bars, one for epochs and one for the minibatches;
  • **params – dictionary of parameters needed to construct the network.
predict(*args, output_tensor=False, **kwargs)[source]

Predict.

Parameters:
  • *args – input;
  • output_tensor (bool, False) – whether to return the output as tensor or not;
  • **kwargs – other parameters used by the predict method the regressor.
Returns:

The predictions of the model.

fit(*args, n_epochs=None, weights=None, epsilon=None, patience=1, validation_split=1.0, **kwargs)[source]

Fit the model.

Parameters:
  • *args – input, where the last n_fit_targets elements are considered as the target, while the others are considered as input;
  • n_epochs (int, None) – the number of training epochs;
  • weights (np.ndarray, None) – the weights of each sample in the computation of the loss;
  • epsilon (float, None) – the coefficient used for early stopping;
  • patience (float, 1.) – the number of epochs to wait until stop the learning if not improving;
  • validation_split (float, 1.) – the percentage of the dataset to use as training set;
  • **kwargs – other parameters used by the fit method of the regressor.
set_weights(weights)[source]

Setter.

Parameters:w (np.ndarray) – the set of weights to set.
get_weights()[source]

Getter.

Returns:The set of weights of the approximator.
weights_size

The size of the array of weights.

Type:Returns
diff(*args, **kwargs)[source]

Compute the derivative of the output w.r.t. state, and action if provided.

Parameters:
  • state (np.ndarray) – the state;
  • action (np.ndarray, None) – the action.
Returns:

The derivative of the output w.r.t. state, and action if provided.

_add_save_attr(**attr_dict)

Add attributes that should be saved for an agent.

Parameters:**attr_dict – dictionary of attributes mapped to the method that should be used to save and load them. If a “!” character is added at the end of the method, the field will be saved only if full_save is set to True.
_post_load()

This method can be overwritten to implement logic that is executed after the loading of the agent.

copy()
Returns:A deepcopy of the agent.
classmethod load(path)

Load and deserialize the agent from the given location on disk.

Parameters:path (Path, string) – Relative or absolute path to the agents save location.
Returns:The loaded agent.
save(path, full_save=False)

Serialize and save the object to the given path on disk.

Parameters:
  • path (Path, str) – Relative or absolute path to the object save location;
  • full_save (bool) – Flag to specify the amount of data to save for MushroomRL data structures.
save_zip(zip_file, full_save, folder='')

Serialize and save the agent to the given path on disk.

Parameters:
  • zip_file (ZipFile) – ZipFile where te object needs to be saved;
  • full_save (bool) – flag to specify the amount of data to save for MushroomRL data structures;
  • folder (string, '') – subfolder to be used by the save method.