Distributions

class Distribution(context_shape=None)

Bases: Serializable

Interface representing a generic probability distribution. Probability distributions are often used by black-box optimization algorithms to perform exploration in parameter space. In the literature, they are also known as high-level policies.

__init__(context_shape=None)

sample(context=None)

Draw a sample from the distribution.

Parameters:

context (Array, None) – context variables to condition the distribution.

Returns:

A random vector sampled from the distribution.

log_pdf(theta, context=None)

Compute the logarithm of the probability density function in the specified point.

Parameters:
  • theta (np.ndarray) – the point where the log pdf is calculated;

  • context (Array, None) – context variables to condition the distribution.

Returns:

The value of the log pdf in the specified point.

__call__(theta, context=None)

Compute the probability density function in the specified point.

Parameters:
  • theta (np.ndarray) – the point where the pdf is calculated;

  • context (Array, None) – context variables to condition the distribution.

Returns:

The value of the pdf in the specified point.

entropy(context=None)

Compute the entropy of the distribution.

Parameters:

context (Array, None) – context variables to condition the distribution.

Returns:

The value of the entropy of the distribution.

mle(theta, weights=None)

Compute the (weighted) maximum likelihood estimate of the points, and update the distribution accordingly.

Parameters:
  • theta (np.ndarray) – a set of points, every row is a sample;

  • weights (np.ndarray, None) – a vector of weights. If specified, the weighted maximum likelihood estimate is computed instead of the plain maximum likelihood estimate. The number of elements of this vector must be equal to the number of rows of the theta matrix.

diff_log(theta, context=None)

Compute the derivative of the logarithm of the probability density function in the specified point.

Parameters:
  • theta (np.ndarray) – the point where the gradient of the log pdf is computed;

  • context (Array, None) – context variables to condition the distribution.

Returns:

The gradient of the log pdf in the specified point.

diff(theta, context=None)

Compute the derivative of the probability density function in the specified point. Normally it is obtained from the derivative of the logarithm of the probability density function, exploiting the likelihood ratio trick, i.e.:

\[\nabla_{\rho}p(\theta)=p(\theta)\nabla_{\rho}\log p(\theta)\]

Parameters:
  • theta (np.ndarray) – the point where the gradient of the pdf is calculated;

  • context (Array, None) – context variables to condition the distribution.

Returns:

The gradient of the pdf in the specified point.
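
The likelihood ratio identity above can be checked numerically. The following is a minimal sketch, assuming a one-dimensional Gaussian whose only parameter is its mean; the function names are illustrative and are not part of the library.

```python
import math


def gaussian_pdf(theta, mu, sigma):
    """Density of a 1-D Gaussian with mean mu and standard deviation sigma."""
    z = (theta - mu) / sigma
    return math.exp(-0.5 * z ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def gaussian_diff_log(theta, mu, sigma):
    """Derivative of log p(theta) with respect to the parameter rho = mu."""
    return (theta - mu) / sigma ** 2


def gaussian_diff(theta, mu, sigma):
    """Likelihood ratio trick: grad p = p * grad log p."""
    return gaussian_pdf(theta, mu, sigma) * gaussian_diff_log(theta, mu, sigma)


theta, mu, sigma = 0.7, 0.2, 1.5
eps = 1e-6

# Central finite difference of p(theta) with respect to mu
numeric = (gaussian_pdf(theta, mu + eps, sigma)
           - gaussian_pdf(theta, mu - eps, sigma)) / (2.0 * eps)
analytic = gaussian_diff(theta, mu, sigma)
```

The two values should agree up to finite-difference error, which illustrates why a subclass only needs to provide diff_log for diff to be available.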

get_parameters()

Getter.

Returns:

The current distribution parameters.

set_parameters(rho)

Setter.

Parameters:

rho (np.ndarray) – the vector of the new parameters to be used by the distribution.

property parameters_size

Property.

Returns:

The size of the distribution parameters.

_add_save_attr(**attr_dict)

Add attributes that should be saved for an agent. For every attribute, it is necessary to specify the method to be used to save and load it. Available methods are: numpy, mushroom, torch, json, pickle, primitive and none. The primitive method can be used to store primitive attributes, while the none method always skips the attribute, but ensures that it is initialized to None after the load. The mushroom method can be used with classes that implement the Serializable interface. All the other methods use the library of the same name. If a “!” character is added at the end of the method, the field will be saved only if full_save is set to True.

Parameters:

**attr_dict – dictionary of attributes mapped to the method that should be used to save and load them.
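
The rules for the method strings can be illustrated with a small standalone sketch. The helper select_attributes and the attribute names below are hypothetical and are not part of the library; the code only mirrors the behaviour described above for the none method and the “!” suffix.

```python
def select_attributes(attr_methods, full_save):
    """Return {attribute: method} for the attributes that would be written.

    Hypothetical helper: it reproduces only the documented selection rules,
    not the actual serialization logic.
    """
    selected = {}
    for name, method in attr_methods.items():
        only_full = method.endswith('!')
        base_method = method[:-1] if only_full else method

        if base_method == 'none':
            # Never written; such attributes are reset to None on load
            continue
        if only_full and not full_save:
            # Fields marked with "!" are written only on a full save
            continue
        selected[name] = base_method
    return selected


# Illustrative attribute names, chosen only for this example
attr_methods = {'_mu': 'numpy', '_cache': 'none', '_dataset': 'pickle!'}

light = select_attributes(attr_methods, full_save=False)
full = select_attributes(attr_methods, full_save=True)
```

With these inputs, light contains only _mu, while full also contains _dataset.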

_post_load()

This method can be overridden to implement logic that is executed after the loading of the agent.

copy()

Returns:

A deepcopy of the agent.

classmethod load(path)

Load and deserialize the agent from the given location on disk.

Parameters:

path (Path, string) – Relative or absolute path to the agent's save location.

Returns:

The loaded agent.

save(path, full_save=False)

Serialize and save the object to the given path on disk.

Parameters:
  • path (Path, str) – Relative or absolute path to the object save location;

  • full_save (bool) – Flag to specify the amount of data to save for MushroomRL data structures.

save_zip(zip_file, full_save, folder='')

Serialize and save the agent to the given path on disk.

Parameters:
  • zip_file (ZipFile) – ZipFile where the object needs to be saved;

  • full_save (bool) – flag to specify the amount of data to save for MushroomRL data structures;

  • folder (string, '') – subfolder to be used by the save method.

Gaussian

class GaussianDistribution(mu, sigma)

Bases: Distribution

Gaussian distribution with fixed covariance matrix. The parameters vector represents only the mean.

__init__(mu, sigma)

Constructor.

Parameters:
  • mu (np.ndarray) – initial mean of the distribution;

  • sigma (np.ndarray) – covariance matrix of the distribution.
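
To make the role of the two constructor arguments concrete, here is a minimal sketch of a Gaussian with fixed covariance whose parameter vector is the mean only. The class FixedCovarianceGaussian is an illustrative stand-in written for this example, not the library class, and its sample method takes an explicit random generator as a simplifying assumption.

```python
import numpy as np


class FixedCovarianceGaussian:
    """Illustrative stand-in: the covariance is fixed, the mean is the parameter."""

    def __init__(self, mu, sigma):
        self._mu = np.asarray(mu, dtype=float)
        self._sigma = np.asarray(sigma, dtype=float)
        # Lower-triangular factor such that sigma = chol @ chol.T
        self._chol = np.linalg.cholesky(self._sigma)
        self._inv_sigma = np.linalg.inv(self._sigma)

    def sample(self, rng):
        # Affine transformation of standard normal noise
        z = rng.standard_normal(self._mu.size)
        return self._mu + self._chol @ z

    def log_pdf(self, theta):
        n = self._mu.size
        delta = theta - self._mu
        _, logdet = np.linalg.slogdet(self._sigma)
        quad = delta @ self._inv_sigma @ delta
        return -0.5 * (n * np.log(2.0 * np.pi) + logdet + quad)

    def get_parameters(self):
        return self._mu.copy()

    def set_parameters(self, rho):
        self._mu = np.asarray(rho, dtype=float).copy()

    @property
    def parameters_size(self):
        return self._mu.size


rng = np.random.default_rng(0)
dist = FixedCovarianceGaussian(mu=[0.0, 1.0], sigma=[[1.0, 0.3], [0.3, 2.0]])
theta = dist.sample(rng)
log_p = dist.log_pdf(theta)
```

In this sketch set_parameters replaces the mean and leaves the covariance untouched, which matches the description of the parameter vector given for this class.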

sample(context=None)

Draw a sample from the distribution.

Parameters:

context (Array, None) – context variables to condition the distribution.

Returns:

A random vector sampled from the distribution.

log_pdf(theta, context=None)

Compute the logarithm of the probability density function in the specified point.

Parameters:
  • theta (np.ndarray) – the point where the log pdf is calculated;

  • context (Array, None) – context variables to condition the distribution.

Returns:

The value of the log pdf in the specified point.

__call__(theta, context=None)

Compute the probability density function in the specified point.

Parameters:
  • theta (np.ndarray) – the point where the pdf is calculated;

  • context (Array, None) – context variables to condition the distribution.

Returns:

The value of the pdf in the specified point.

entropy(context=None)

Compute the entropy of the distribution.

Parameters:

context (Array, None) – context variables to condition the distribution.

Returns:

The value of the entropy of the distribution.
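
For reference, a Gaussian with covariance matrix \(\Sigma\) in \(n\) dimensions has the standard closed-form differential entropy below. This is a textbook identity, not a statement about how the method is implemented; \(n\) and \(\Sigma\) are symbols introduced here for illustration.

\[H(p)=\frac{1}{2}\log\det(2\pi e\Sigma)=\frac{n}{2}\left(1+\log 2\pi\right)+\frac{1}{2}\log\det\Sigma\]

The expression does not depend on the mean, so with a fixed covariance matrix the entropy is unaffected by changes to the parameter vector.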

mle(theta, weights=None)

Compute the (weighted) maximum likelihood estimate of the points, and update the distribution accordingly.

Parameters:
  • theta (np.ndarray) – a set of points, every row is a sample;

  • weights (np.ndarray, None) – a vector of weights. If specified, the weighted maximum likelihood estimate is computed instead of the plain maximum likelihood estimate. The number of elements of this vector must be equal to the number of rows of the theta matrix.
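
When the covariance is fixed, the maximum likelihood estimate reduces to a (weighted) average of the samples. The sketch below shows that computation under this assumption; weighted_mean_mle is a hypothetical helper and may differ from the library's actual update.

```python
import numpy as np


def weighted_mean_mle(theta, weights=None):
    """(Weighted) maximum likelihood estimate of a Gaussian mean.

    theta has one sample per row; weights, if given, has one entry per row.
    """
    theta = np.asarray(theta, dtype=float)
    if weights is None:
        return theta.mean(axis=0)

    weights = np.asarray(weights, dtype=float)
    if weights.shape[0] != theta.shape[0]:
        raise ValueError('weights must have one element per row of theta')
    return weights @ theta / weights.sum()


theta = np.array([[0.0, 0.0],
                  [2.0, 4.0],
                  [4.0, 8.0]])

plain = weighted_mean_mle(theta)
weighted = weighted_mean_mle(theta, weights=[0.0, 1.0, 3.0])
```

A sample with zero weight has no influence on the estimate, and larger weights pull the mean toward the corresponding rows.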

diff_log(theta, context=None)

Compute the derivative of the logarithm of the probability density function in the specified point.

Parameters:
  • theta (np.ndarray) – the point where the gradient of the log pdf is computed;

  • context (Array, None) – context variables to condition the distribution.

Returns:

The gradient of the log pdf in the specified point.
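
For a Gaussian parametrized by its mean only, the gradient of the log pdf with respect to the mean is the standard expression \(\Sigma^{-1}(\theta-\mu)\). The sketch below compares it with a finite-difference estimate; the functions are written for this example and are an assumption about the mathematics, not a copy of the library code.

```python
import numpy as np


def log_pdf(theta, mu, sigma):
    """Log density of a multivariate Gaussian with mean mu and covariance sigma."""
    n = mu.size
    delta = theta - mu
    _, logdet = np.linalg.slogdet(sigma)
    quad = delta @ np.linalg.solve(sigma, delta)
    return -0.5 * (n * np.log(2.0 * np.pi) + logdet + quad)


def diff_log(theta, mu, sigma):
    """Gradient of the log density with respect to the mean."""
    return np.linalg.solve(sigma, theta - mu)


mu = np.array([0.5, -1.0])
sigma = np.array([[1.0, 0.2],
                  [0.2, 0.5]])
theta = np.array([1.0, 0.0])

eps = 1e-6
# Central finite differences along each coordinate of the mean
numeric = np.array([
    (log_pdf(theta, mu + eps * e, sigma) - log_pdf(theta, mu - eps * e, sigma))
    / (2.0 * eps)
    for e in np.eye(mu.size)
])
analytic = diff_log(theta, mu, sigma)
```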

get_parameters()

Getter.

Returns:

The current distribution parameters.

set_parameters(rho)

Setter.

Parameters:

rho (np.ndarray) – the vector of the new parameters to be used by the distribution.

property parameters_size

Property.

Returns:

The size of the distribution parameters.

_add_save_attr(**attr_dict)

Add attributes that should be saved for an agent. For every attribute, it is necessary to specify the method to be used to save and load it. Available methods are: numpy, mushroom, torch, json, pickle, primitive and none. The primitive method can be used to store primitive attributes, while the none method always skips the attribute, but ensures that it is initialized to None after the load. The mushroom method can be used with classes that implement the Serializable interface. All the other methods use the library of the same name. If a “!” character is added at the end of the method, the field will be saved only if full_save is set to True.

Parameters:

**attr_dict – dictionary of attributes mapped to the method that should be used to save and load them.

_post_load()

This method can be overridden to implement logic that is executed after the loading of the agent.

copy()

Returns:

A deepcopy of the agent.

diff(theta, context=None)

Compute the derivative of the probability density function in the specified point. Normally it is obtained from the derivative of the logarithm of the probability density function, exploiting the likelihood ratio trick, i.e.:

\[\nabla_{\rho}p(\theta)=p(\theta)\nabla_{\rho}\log p(\theta)\]

Parameters:
  • theta (np.ndarray) – the point where the gradient of the pdf is calculated;

  • context (Array, None) – context variables to condition the distribution.

Returns:

The gradient of the pdf in the specified point.

classmethod load(path)

Load and deserialize the agent from the given location on disk.

Parameters:

path (Path, string) – Relative or absolute path to the agent's save location.

Returns:

The loaded agent.

save(path, full_save=False)

Serialize and save the object to the given path on disk.

Parameters:
  • path (Path, str) – Relative or absolute path to the object save location;

  • full_save (bool) – Flag to specify the amount of data to save for MushroomRL data structures.

save_zip(zip_file, full_save, folder='')

Serialize and save the agent to the given path on disk.

Parameters:
  • zip_file (ZipFile) – ZipFile where the object needs to be saved;

  • full_save (bool) – flag to specify the amount of data to save for MushroomRL data structures;

  • folder (string, '') – subfolder to be used by the save method.

class GaussianDiagonalDistribution(mu, std)

Bases: Distribution

Gaussian distribution with diagonal covariance matrix. The parameters vector represents the mean and the standard deviation for each dimension.

__init__(mu, std)

Constructor.

Parameters:
  • mu (np.ndarray) – initial mean of the distribution;

  • std (np.ndarray) – initial vector of standard deviations for each variable of the distribution.
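
Since the parameter vector holds both the mean and the standard deviations, its size is twice the dimensionality of the distribution. The following sketch assumes the layout "mean first, then standard deviations", which is an assumption made for illustration; the helpers pack, unpack and log_pdf are hypothetical.

```python
import numpy as np


def pack(mu, std):
    """Build the parameter vector rho (assumed layout: mean, then std)."""
    return np.concatenate([mu, std])


def unpack(rho):
    """Split rho back into mean and standard deviations."""
    n = rho.size // 2
    return rho[:n], rho[n:]


def log_pdf(theta, rho):
    """Log density of a Gaussian with diagonal covariance diag(std ** 2)."""
    mu, std = unpack(rho)
    z = (theta - mu) / std
    return (-0.5 * np.sum(z ** 2)
            - np.sum(np.log(std))
            - 0.5 * mu.size * np.log(2.0 * np.pi))


mu = np.array([0.0, 1.0, 2.0])
std = np.array([1.0, 0.5, 2.0])

rho = pack(mu, std)
log_p_at_mean = log_pdf(mu, rho)
```

Because the covariance is diagonal, the log density is a sum of independent one-dimensional terms and no matrix inversion is needed.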

sample(context=None)

Draw a sample from the distribution.

Parameters:

context (Array, None) – context variables to condition the distribution.

Returns:

A random vector sampled from the distribution.

log_pdf(theta, context=None)

Compute the logarithm of the probability density function in the specified point.

Parameters:
  • theta (np.ndarray) – the point where the log pdf is calculated;

  • context (Array, None) – context variables to condition the distribution.

Returns:

The value of the log pdf in the specified point.

__call__(theta, context=None)

Compute the probability density function in the specified point.

Parameters:
  • theta (np.ndarray) – the point where the pdf is calculated;

  • context (Array, None) – context variables to condition the distribution.

Returns:

The value of the pdf in the specified point.

entropy(context=None)

Compute the entropy of the distribution.

Parameters:

context (Array, None) – context variables to condition the distribution.

Returns:

The value of the entropy of the distribution.

mle(theta, weights=None)

Compute the (weighted) maximum likelihood estimate of the points, and update the distribution accordingly.

Parameters:
  • theta (np.ndarray) – a set of points, every row is a sample;

  • weights (np.ndarray, None) – a vector of weights. If specified, the weighted maximum likelihood estimate is computed instead of the plain maximum likelihood estimate. The number of elements of this vector must be equal to the number of rows of the theta matrix.

diff_log(theta, context=None)

Compute the derivative of the logarithm of the probability density function in the specified point.

Parameters:
  • theta (np.ndarray) – the point where the gradient of the log pdf is computed;

  • context (Array, None) – context variables to condition the distribution.

Returns:

The gradient of the log pdf in the specified point.

get_parameters()

Getter.

Returns:

The current distribution parameters.

set_parameters(rho)

Setter.

Parameters:

rho (np.ndarray) – the vector of the new parameters to be used by the distribution.

property parameters_size

Property.

Returns:

The size of the distribution parameters.

_add_save_attr(**attr_dict)

Add attributes that should be saved for an agent. For every attribute, it is necessary to specify the method to be used to save and load it. Available methods are: numpy, mushroom, torch, json, pickle, primitive and none. The primitive method can be used to store primitive attributes, while the none method always skips the attribute, but ensures that it is initialized to None after the load. The mushroom method can be used with classes that implement the Serializable interface. All the other methods use the library of the same name. If a “!” character is added at the end of the method, the field will be saved only if full_save is set to True.

Parameters:

**attr_dict – dictionary of attributes mapped to the method that should be used to save and load them.

_post_load()

This method can be overridden to implement logic that is executed after the loading of the agent.

copy()

Returns:

A deepcopy of the agent.

diff(theta, context=None)

Compute the derivative of the probability density function in the specified point. Normally it is obtained from the derivative of the logarithm of the probability density function, exploiting the likelihood ratio trick, i.e.:

\[\nabla_{\rho}p(\theta)=p(\theta)\nabla_{\rho}\log p(\theta)\]

Parameters:
  • theta (np.ndarray) – the point where the gradient of the pdf is calculated;

  • context (Array, None) – context variables to condition the distribution.

Returns:

The gradient of the pdf in the specified point.

classmethod load(path)

Load and deserialize the agent from the given location on disk.

Parameters:

path (Path, string) – Relative or absolute path to the agent's save location.

Returns:

The loaded agent.

save(path, full_save=False)

Serialize and save the object to the given path on disk.

Parameters:
  • path (Path, str) – Relative or absolute path to the object save location;

  • full_save (bool) – Flag to specify the amount of data to save for MushroomRL data structures.

save_zip(zip_file, full_save, folder='')

Serialize and save the agent to the given path on disk.

Parameters:
  • zip_file (ZipFile) – ZipFile where the object needs to be saved;

  • full_save (bool) – flag to specify the amount of data to save for MushroomRL data structures;

  • folder (string, '') – subfolder to be used by the save method.

class GaussianCholeskyDistribution(mu, sigma)

Bases: Distribution

Gaussian distribution with full covariance matrix. The parameters vector represents the mean and the Cholesky decomposition of the covariance matrix. This parametrization ensures that the covariance matrix is positive definite.

__init__(mu, sigma)

Constructor.

Parameters:
  • mu (np.ndarray) – initial mean of the distribution;

  • sigma (np.ndarray) – initial covariance matrix of the distribution.
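
The benefit of the Cholesky parametrization can be seen in a short sketch: any parameter vector maps back to a symmetric matrix of the form L Lᵀ, which is positive definite whenever the diagonal of L is nonzero. The layout used below (mean followed by the lower-triangular entries in row-major order) and the helpers pack and unpack are assumptions made for illustration, not the library's documented layout.

```python
import numpy as np


def pack(mu, sigma):
    """Build rho from the mean and the lower-triangular Cholesky factor."""
    chol = np.linalg.cholesky(sigma)
    return np.concatenate([mu, chol[np.tril_indices(mu.size)]])


def unpack(rho, n):
    """Recover the mean and the covariance sigma = chol @ chol.T from rho."""
    mu = rho[:n]
    chol = np.zeros((n, n))
    chol[np.tril_indices(n)] = rho[n:]
    return mu, chol @ chol.T


mu = np.array([1.0, -1.0])
sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])

rho = pack(mu, sigma)  # n + n * (n + 1) / 2 = 5 entries for n = 2

# Perturb every parameter freely, as a gradient step would
rho_new = rho + np.array([0.1, -0.2, 0.3, -0.5, 0.4])
mu_new, sigma_new = unpack(rho_new, mu.size)

# The reconstructed covariance is still symmetric with positive eigenvalues
eigvals = np.linalg.eigvalsh(sigma_new)
```

Updating the entries of the covariance matrix directly offers no such guarantee, since an unconstrained step can produce a matrix that is not a valid covariance.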

sample(context=None)

Draw a sample from the distribution.

Parameters:

context (Array, None) – context variables to condition the distribution.

Returns:

A random vector sampled from the distribution.

log_pdf(theta, context=None)

Compute the logarithm of the probability density function in the specified point.

Parameters:
  • theta (np.ndarray) – the point where the log pdf is calculated;

  • context (Array, None) – context variables to condition the distribution.

Returns:

The value of the log pdf in the specified point.

__call__(theta, context=None)

Compute the probability density function in the specified point.

Parameters:
  • theta (np.ndarray) – the point where the pdf is calculated;

  • context (Array, None) – context variables to condition the distribution.

Returns:

The value of the pdf in the specified point.

entropy(context=None)

Compute the entropy of the distribution.

Parameters:

context (Array, None) – context variables to condition the distribution.

Returns:

The value of the entropy of the distribution.

mle(theta, weights=None)

Compute the (weighted) maximum likelihood estimate of the points, and update the distribution accordingly.

Parameters:
  • theta (np.ndarray) – a set of points, every row is a sample;

  • weights (np.ndarray, None) – a vector of weights. If specified, the weighted maximum likelihood estimate is computed instead of the plain maximum likelihood estimate. The number of elements of this vector must be equal to the number of rows of the theta matrix.

diff_log(theta, context=None)

Compute the derivative of the logarithm of the probability density function in the specified point.

Parameters:
  • theta (np.ndarray) – the point where the gradient of the log pdf is computed;

  • context (Array, None) – context variables to condition the distribution.

Returns:

The gradient of the log pdf in the specified point.

get_parameters()

Getter.

Returns:

The current distribution parameters.

set_parameters(rho)

Setter.

Parameters:

rho (np.ndarray) – the vector of the new parameters to be used by the distribution.

property parameters_size

Property.

Returns:

The size of the distribution parameters.

_add_save_attr(**attr_dict)

Add attributes that should be saved for an agent. For every attribute, it is necessary to specify the method to be used to save and load it. Available methods are: numpy, mushroom, torch, json, pickle, primitive and none. The primitive method can be used to store primitive attributes, while the none method always skips the attribute, but ensures that it is initialized to None after the load. The mushroom method can be used with classes that implement the Serializable interface. All the other methods use the library of the same name. If a “!” character is added at the end of the method, the field will be saved only if full_save is set to True.

Parameters:

**attr_dict – dictionary of attributes mapped to the method that should be used to save and load them.

_post_load()

This method can be overridden to implement logic that is executed after the loading of the agent.

copy()

Returns:

A deepcopy of the agent.

diff(theta, context=None)

Compute the derivative of the probability density function in the specified point. Normally it is obtained from the derivative of the logarithm of the probability density function, exploiting the likelihood ratio trick, i.e.:

\[\nabla_{\rho}p(\theta)=p(\theta)\nabla_{\rho}\log p(\theta)\]

Parameters:
  • theta (np.ndarray) – the point where the gradient of the pdf is calculated;

  • context (Array, None) – context variables to condition the distribution.

Returns:

The gradient of the pdf in the specified point.

classmethod load(path)

Load and deserialize the agent from the given location on disk.

Parameters:

path (Path, string) – Relative or absolute path to the agent's save location.

Returns:

The loaded agent.

save(path, full_save=False)

Serialize and save the object to the given path on disk.

Parameters:
  • path (Path, str) – Relative or absolute path to the object save location;

  • full_save (bool) – Flag to specify the amount of data to save for MushroomRL data structures.

save_zip(zip_file, full_save, folder='')

Serialize and save the agent to the given path on disk.

Parameters:
  • zip_file (ZipFile) – ZipFile where the object needs to be saved;

  • full_save (bool) – flag to specify the amount of data to save for MushroomRL data structures;

  • folder (string, '') – subfolder to be used by the save method.