Approximators
MushroomRL exposes the high-level class Regressor, which can manage any type of
function regressor. This class is a wrapper for any kind of function
approximator, e.g. a scikit-learn approximator or a PyTorch neural network.
Regressor

class mushroom_rl.approximators.regressor.Regressor(approximator, input_shape, output_shape=(1,), n_actions=None, n_models=1, **params)
Bases: object
This class manages a function approximator. It selects the appropriate kind of regressor according to the parameters provided by the user, so it is the only class needed for every kind of task. The implementation is chosen by checking the value of the n_actions parameter. If n_actions is provided, the user wants an approximator of the Q-function: if n_actions is equal to output_shape, a QRegressor is created; otherwise (output_shape should be (1,)) an ActionRegressor is created. If n_actions is not provided, a GenericRegressor is created. An Ensemble model can be used with any of these implementations by providing an n_models parameter greater than 1.
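The selection rule described above can be sketched in plain Python. This is only an illustration of the rule as documented, not the library's code: the helper select_regressor_kind is hypothetical, and it assumes that "n_actions equal to output_shape" means a one-dimensional output whose size is n_actions.

```python
def select_regressor_kind(output_shape=(1,), n_actions=None, n_models=1):
    """Return the name(s) of the regressor the documented rule would pick.

    Hypothetical helper for illustration; it is not part of MushroomRL.
    """
    if n_actions is not None:
        # Assumption: the comparison is against a 1-D output of size n_actions.
        if len(output_shape) == 1 and output_shape[0] == n_actions:
            kind = "QRegressor"
        else:
            kind = "ActionRegressor"
    else:
        kind = "GenericRegressor"
    # More than one model means the chosen kind is used inside an ensemble.
    return ("Ensemble", kind) if n_models > 1 else (kind,)


print(select_regressor_kind(output_shape=(4,), n_actions=4))
print(select_regressor_kind(output_shape=(1,), n_actions=4))
print(select_regressor_kind(output_shape=(2,)))
print(select_regressor_kind(output_shape=(4,), n_actions=4, n_models=5))
```

The four calls cover, in order, the Q-function case with one output per action, the case with the action given as input, the generic case, and an ensemble of the first case.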
__init__(approximator, input_shape, output_shape=(1,), n_actions=None, n_models=1, **params)
Constructor.
Parameters:  approximator (object) – the approximator class to use to create the model;
 input_shape (tuple) – the shape of the input of the model;
 output_shape (tuple, (1,)) – the shape of the output of the model;
 n_actions (int, None) – number of actions considered to create a QRegressor or an ActionRegressor;
 n_models (int, 1) – number of models to create;
 **params (dict) – other parameters to create each model.

fit(*z, **fit_params)
Fit the model.
Parameters:  *z (list) – list of inputs of the model;
 **fit_params (dict) – parameters to use to fit the model.

predict(*z, **predict_params)
Predict the output of the model given an input.
Parameters:  *z (list) – list of inputs of the model;
 **predict_params (dict) – parameters to use to predict with the model.
Returns: The model prediction.

model
Returns: The model object.

input_shape
Returns: The shape of the input of the model.

output_shape
Returns: The shape of the output of the model.

weights_size
Returns: The shape of the weights of the model.

Approximator
Linear

class mushroom_rl.approximators.parametric.linear.LinearApproximator(weights=None, input_shape=None, output_shape=(1,), **kwargs)
Bases: object
This class implements a linear approximator.

__init__(weights=None, input_shape=None, output_shape=(1,), **kwargs)
Constructor.
Parameters:  weights (np.ndarray) – array of weights to initialize the weights of the approximator;
 input_shape (np.ndarray, None) – the shape of the input of the model;
 output_shape (np.ndarray, (1,)) – the shape of the output of the model;
 **kwargs (dict) – other params of the approximator.

fit(x, y, **fit_params)
Fit the model.
Parameters:  x (np.ndarray) – input;
 y (np.ndarray) – target;
 **fit_params (dict) – other parameters used by the fit method of the regressor.

predict(x, **predict_params)
Predict.
Parameters:  x (np.ndarray) – input;
 **predict_params (dict) – other parameters used by the predict method of the regressor.
Returns: The predictions of the model.

weights_size
Returns: The size of the array of weights.
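To make the roles of weights, input_shape, output_shape and weights_size concrete, here is a minimal pure-Python sketch of a linear model. TinyLinear is a hypothetical stand-in, not the LinearApproximator implementation; it assumes the weights are stored flat with one row of input_shape[0] values per output, and it leaves out fit entirely.

```python
class TinyLinear:
    """Hypothetical stand-in for a linear approximator (not MushroomRL code)."""

    def __init__(self, weights=None, input_shape=None, output_shape=(1,)):
        self.n_in, self.n_out = input_shape[0], output_shape[0]
        # Assumption: flat weight list, one row of n_in values per output.
        if weights is None:
            weights = [0.0] * (self.n_in * self.n_out)
        if len(weights) != self.n_in * self.n_out:
            raise ValueError("weights must have n_in * n_out entries")
        self.w = list(weights)

    @property
    def weights_size(self):
        # Total number of scalar weights.
        return len(self.w)

    def predict(self, x):
        # x is a batch: a list of input vectors, each of length n_in.
        rows = [self.w[i * self.n_in:(i + 1) * self.n_in]
                for i in range(self.n_out)]
        return [[sum(wj * xj for wj, xj in zip(row, sample)) for row in rows]
                for sample in x]


approx = TinyLinear(weights=[1.0, 2.0, 0.5, -1.0],
                    input_shape=(2,), output_shape=(2,))
print(approx.weights_size)                         # 4
print(approx.predict([[1.0, 1.0], [2.0, 0.0]]))    # [[3.0, -0.5], [2.0, 1.0]]
```

Each prediction is the dot product of one weight row with the input vector, so two inputs and two outputs give four weights.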

Torch Approximator

class mushroom_rl.approximators.parametric.torch_approximator.TorchApproximator(input_shape, output_shape, network, optimizer=None, loss=None, batch_size=0, n_fit_targets=1, use_cuda=False, reinitialize=False, dropout=False, quiet=True, **params)
Bases: object
Class to interface a PyTorch model to the MushroomRL Regressor interface. This class implements all that is needed to use a generic PyTorch model and train it using a specified optimizer and objective function. This class also supports minibatches.

__init__(input_shape, output_shape, network, optimizer=None, loss=None, batch_size=0, n_fit_targets=1, use_cuda=False, reinitialize=False, dropout=False, quiet=True, **params)
Constructor.
Parameters:  input_shape (tuple) – shape of the input of the network;
 output_shape (tuple) – shape of the output of the network;
 network (torch.nn.Module) – the network class to use;
 optimizer (dict) – the optimizer used for every fit step;
 loss (torch.nn.functional) – the loss function to optimize in the fit method;
 batch_size (int, 0) – the size of each minibatch. If 0, the whole dataset is fed to the optimizer at each epoch;
 n_fit_targets (int, 1) – the number of fit targets used by the fit method of the network;
 use_cuda (bool, False) – if True, runs the network on the GPU;
 reinitialize (bool, False) – if True, the approximator is re-initialized at every fit call. To perform the initialization, the weights_init method must be defined properly for the selected model network;
 dropout (bool, False) – if True, dropout is applied only during training;
 quiet (bool, True) – if False, shows two progress bars, one for epochs and one for the minibatches;
 **params (dict) – dictionary of parameters needed to construct the network.
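The batch_size convention above (0 means the whole dataset in a single batch) can be illustrated with a small index generator. This is a sketch under assumptions: iter_minibatches is a hypothetical helper, and any shuffling or ordering the library may apply is deliberately left out.

```python
def iter_minibatches(n_samples, batch_size=0):
    """Yield (start, stop) index pairs that cover n_samples in order.

    Hypothetical helper: batch_size == 0 is read as "one batch with everything".
    """
    if n_samples == 0:
        return
    if batch_size == 0:
        batch_size = n_samples
    for start in range(0, n_samples, batch_size):
        # The last batch may be shorter than batch_size.
        yield start, min(start + batch_size, n_samples)


print(list(iter_minibatches(10)))                 # [(0, 10)]
print(list(iter_minibatches(10, batch_size=4)))   # [(0, 4), (4, 8), (8, 10)]
```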

predict(*args, output_tensor=False, **kwargs)
Predict.
Parameters:  *args (list) – input;
 output_tensor (bool, False) – whether to return the output as tensor or not;
 **kwargs (dict) – other parameters used by the predict method of the regressor.
Returns: The predictions of the model.

fit(*args, n_epochs=None, weights=None, epsilon=None, patience=1, validation_split=1.0, **kwargs)
Fit the model.
Parameters:  *args (list) – input, where the last n_fit_targets elements are considered as the target, while the others are considered as input;
 n_epochs (int, None) – the number of training epochs;
 weights (np.ndarray, None) – the weights of each sample in the computation of the loss;
 epsilon (float, None) – the coefficient used for early stopping;
 patience (float, 1.) – the number of epochs to wait before stopping the learning if there is no improvement;
 validation_split (float, 1.) – the fraction of the dataset to use as training set;
 **kwargs (dict) – other parameters used by the fit method of the regressor.
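The way fit interprets its positional arguments, with the last n_fit_targets elements taken as targets and the rest as inputs, can be sketched as follows. split_inputs_targets is a hypothetical helper written for illustration, and the strings stand in for arrays.

```python
def split_inputs_targets(args, n_fit_targets=1):
    """Split positional fit arguments into (inputs, targets).

    Hypothetical helper: the last n_fit_targets items are the targets.
    """
    if not 0 < n_fit_targets < len(args):
        raise ValueError("need at least one input and n_fit_targets targets")
    return list(args[:-n_fit_targets]), list(args[-n_fit_targets:])


# Placeholders for arrays: a state batch, an action batch, a target batch.
inputs, targets = split_inputs_targets(("state", "action", "q_target"))
print(inputs)    # ['state', 'action']
print(targets)   # ['q_target']
```

With n_fit_targets=2 the same call would treat both "action" and "q_target" as targets and only "state" as input.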

weights_size
Returns: The size of the array of weights.
