Distributions¶

class mushroom_rl.distributions.distribution.Distribution[source]¶

Bases: object

Interface representing a generic probability distribution. Black-box optimization algorithms often use probability distributions to explore the parameter space. In the literature, they are also known as high-level policies.

sample()[source]¶

Draw a sample from the distribution.

Returns:	A random vector sampled from the distribution.

log_pdf(theta)[source]¶

Compute the logarithm of the probability density function at the specified point.

Parameters:	theta (np.ndarray) – the point where the log pdf is calculated.
Returns:	The value of the log pdf at the specified point.

__call__(theta)[source]¶

Compute the probability density function at the specified point.

Parameters:	theta (np.ndarray) – the point where the pdf is calculated.
Returns:	The value of the pdf at the specified point.

mle(theta, weights=None)[source]¶

Compute the (weighted) maximum likelihood estimate of the points, and update the distribution accordingly.

Parameters:	theta (np.ndarray) – a set of points, where every row is a sample;
	weights (np.ndarray, None) – a vector of weights. If specified, the weighted maximum likelihood estimate is computed instead of the plain maximum likelihood estimate. The number of elements of this vector must be equal to the number of rows of the theta matrix.
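As an illustration of what a weighted maximum likelihood update computes, here is a minimal NumPy sketch for a Gaussian with full covariance. The function name `weighted_gaussian_mle` is hypothetical and not part of the library, and the covariance is normalized by the sum of the weights, which is an assumption: the library's own implementation may normalize differently.

```python
import numpy as np


def weighted_gaussian_mle(theta, weights=None):
    """Return the (weighted) ML mean and covariance of the rows of theta.

    Hypothetical helper for illustration only, not the library's code.
    """
    theta = np.asarray(theta, dtype=float)
    n_samples = theta.shape[0]
    if weights is None:
        # Plain maximum likelihood: every sample counts equally.
        weights = np.ones(n_samples)
    weights = np.asarray(weights, dtype=float)
    if weights.shape != (n_samples,):
        raise ValueError("weights must have one entry per row of theta")

    w_sum = weights.sum()
    mu = weights @ theta / w_sum
    delta = theta - mu
    # Weighted covariance, normalized by the sum of the weights (assumption).
    sigma = (delta.T * weights) @ delta / w_sum
    return mu, sigma


theta = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]])
mu_plain, sigma_plain = weighted_gaussian_mle(theta)
mu_weighted, _ = weighted_gaussian_mle(theta, weights=np.array([3.0, 1.0, 0.0, 0.0]))
```

With equal weights the estimate is the sample mean `[1, 1]` and the identity covariance; with the weights above, the mean moves toward the first sample.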

diff_log(theta)[source]¶

Compute the gradient of the logarithm of the probability density function at the specified point.

Parameters:	theta (np.ndarray) – the point where the gradient of the log pdf is calculated.
Returns:	The gradient of the log pdf at the specified point.

diff(theta)[source]¶

Compute the gradient of the probability density function, with respect to the distribution parameters, at the specified point. It is normally computed from the gradient of the logarithm of the probability density function by exploiting the likelihood ratio trick, i.e.:

\[\nabla_{\rho}p(\theta)=p(\theta)\nabla_{\rho}\log p(\theta)\]

Parameters:	theta (np.ndarray) – the point where the gradient of the pdf is calculated.
Returns:	The gradient of the pdf at the specified point.
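The likelihood ratio identity can be checked numerically. The sketch below assumes a Gaussian with fixed covariance whose parameter vector is the mean, so that the gradient of the log pdf with respect to the mean is Σ⁻¹(θ − μ). All function names are hypothetical helpers written for this example, not library calls.

```python
import numpy as np


def pdf(theta, mu, sigma):
    """Gaussian density N(theta; mu, sigma)."""
    d = theta - mu
    norm = np.sqrt((2 * np.pi) ** mu.size * np.linalg.det(sigma))
    return np.exp(-0.5 * d @ np.linalg.solve(sigma, d)) / norm


def grad_log_pdf(theta, mu, sigma):
    """Gradient of log p with respect to the mean."""
    return np.linalg.solve(sigma, theta - mu)


def grad_pdf(theta, mu, sigma):
    """Likelihood ratio trick: grad p = p * grad log p."""
    return pdf(theta, mu, sigma) * grad_log_pdf(theta, mu, sigma)


mu = np.array([0.5, -1.0])
sigma = np.array([[1.0, 0.3], [0.3, 2.0]])
theta = np.array([1.0, 0.0])

# Central finite differences of p with respect to each component of the mean.
eps = 1e-6
numeric = np.array([
    (pdf(theta, mu + eps * e, sigma) - pdf(theta, mu - eps * e, sigma)) / (2 * eps)
    for e in np.eye(mu.size)
])
analytic = grad_pdf(theta, mu, sigma)
```

The two vectors `numeric` and `analytic` agree up to finite-difference error, which is what the identity predicts.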

get_parameters()[source]¶

Getter.

Returns:	The current distribution parameters.

set_parameters(rho)[source]¶

Setter.

Parameters:	rho (np.ndarray) – the vector of the new parameters to be used by the distribution

parameters_size¶

Property.

Returns:	The size of the distribution parameters.

Gaussian¶

class mushroom_rl.distributions.gaussian.GaussianDistribution(mu, sigma)[source]¶

Bases: mushroom_rl.distributions.distribution.Distribution

Gaussian distribution with fixed covariance matrix. The parameter vector represents only the mean.
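To make the "mean-only" parameter vector concrete, the following is a minimal NumPy stand-in for such a distribution. The class `FixedCovarianceGaussian` and its `seed` argument are hypothetical and written for illustration; it mirrors the method names documented here but is not the library's implementation.

```python
import numpy as np


class FixedCovarianceGaussian:
    """Hypothetical stand-in: Gaussian whose only parameters are the mean."""

    def __init__(self, mu, sigma, seed=0):
        self._mu = np.asarray(mu, dtype=float)
        self._sigma = np.asarray(sigma, dtype=float)  # fixed, never updated
        self._rng = np.random.default_rng(seed)

    def sample(self):
        return self._rng.multivariate_normal(self._mu, self._sigma)

    def log_pdf(self, theta):
        d = theta - self._mu
        _, logdet = np.linalg.slogdet(self._sigma)
        maha = d @ np.linalg.solve(self._sigma, d)
        return -0.5 * (self._mu.size * np.log(2 * np.pi) + logdet + maha)

    def get_parameters(self):
        return self._mu.copy()

    def set_parameters(self, rho):
        # Only the mean changes; the covariance stays as given at construction.
        self._mu = np.asarray(rho, dtype=float).copy()

    @property
    def parameters_size(self):
        return self._mu.size


dist = FixedCovarianceGaussian(mu=[0.0, 0.0], sigma=np.eye(2))
dist.set_parameters(np.array([1.0, -1.0]))
log_p_at_mean = dist.log_pdf(np.array([1.0, -1.0]))  # equals -log(2*pi) here
```

In this sketch `parameters_size` equals the dimension of the mean, since the covariance is not part of the parameter vector.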

__init__(mu, sigma)[source]¶: Initialize self. See help(type(self)) for accurate signature.

sample()[source]¶

Draw a sample from the distribution.

Returns:	A random vector sampled from the distribution.

log_pdf(theta)[source]¶

Compute the logarithm of the probability density function at the specified point.

Parameters:	theta (np.ndarray) – the point where the log pdf is calculated.
Returns:	The value of the log pdf at the specified point.

__call__(theta)[source]¶

Compute the probability density function at the specified point.

Parameters:	theta (np.ndarray) – the point where the pdf is calculated.
Returns:	The value of the pdf at the specified point.

mle(theta, weights=None)[source]¶

Compute the (weighted) maximum likelihood estimate of the points, and update the distribution accordingly.

Parameters:	theta (np.ndarray) – a set of points, where every row is a sample;
	weights (np.ndarray, None) – a vector of weights. If specified, the weighted maximum likelihood estimate is computed instead of the plain maximum likelihood estimate. The number of elements of this vector must be equal to the number of rows of the theta matrix.

diff_log(theta)[source]¶

Compute the gradient of the logarithm of the probability density function at the specified point.

Parameters:	theta (np.ndarray) – the point where the gradient of the log pdf is calculated.
Returns:	The gradient of the log pdf at the specified point.

get_parameters()[source]¶

Getter.

Returns:	The current distribution parameters.

set_parameters(rho)[source]¶

Setter.

Parameters:	rho (np.ndarray) – the vector of the new parameters to be used by the distribution

parameters_size¶

Property.

Returns:	The size of the distribution parameters.

diff(theta)¶

Compute the gradient of the probability density function, with respect to the distribution parameters, at the specified point. It is normally computed from the gradient of the logarithm of the probability density function by exploiting the likelihood ratio trick, i.e.:

\[\nabla_{\rho}p(\theta)=p(\theta)\nabla_{\rho}\log p(\theta)\]

Parameters:	theta (np.ndarray) – the point where the gradient of the pdf is calculated.
Returns:	The gradient of the pdf at the specified point.

class mushroom_rl.distributions.gaussian.GaussianDiagonalDistribution(mu, std)[source]¶

Bases: mushroom_rl.distributions.distribution.Distribution

Gaussian distribution with diagonal covariance matrix. The parameter vector represents the mean and the standard deviation for each dimension.
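A small NumPy sketch of how a parameter vector holding both means and standard deviations can be used. The layout `[mu, std]` (means first, then standard deviations) is an assumption made for this example, and the helper names are hypothetical, not library functions.

```python
import numpy as np


def split_parameters(rho):
    """Split rho into (mu, std), assuming the layout [mu_1..mu_n, std_1..std_n]."""
    rho = np.asarray(rho, dtype=float)
    n_dims = rho.size // 2
    return rho[:n_dims], rho[n_dims:]


def diagonal_log_pdf(theta, rho):
    """Log density of a Gaussian with independent dimensions."""
    mu, std = split_parameters(rho)
    z = (theta - mu) / std
    return (-0.5 * np.sum(z ** 2)
            - np.sum(np.log(std))
            - 0.5 * mu.size * np.log(2 * np.pi))


def diagonal_grad_log_pdf(theta, rho):
    """Gradient of the log density with respect to [mu, std]."""
    mu, std = split_parameters(rho)
    d = theta - mu
    g_mu = d / std ** 2
    g_std = d ** 2 / std ** 3 - 1.0 / std
    return np.concatenate([g_mu, g_std])


rho = np.array([0.0, 1.0, 1.0, 2.0])  # mu = [0, 1], std = [1, 2]
theta = np.array([0.5, 0.0])
grad = diagonal_grad_log_pdf(theta, rho)  # [0.5, -0.25, -0.75, -0.375]
```

Under this layout the gradient has the same size as the parameter vector, twice the dimension of `theta`.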

__init__(mu, std)[source]¶: Initialize self. See help(type(self)) for accurate signature.

sample()[source]¶

Draw a sample from the distribution.

Returns:	A random vector sampled from the distribution.

log_pdf(theta)[source]¶

Compute the logarithm of the probability density function at the specified point.

Parameters:	theta (np.ndarray) – the point where the log pdf is calculated.
Returns:	The value of the log pdf at the specified point.

__call__(theta)[source]¶

Compute the probability density function at the specified point.

Parameters:	theta (np.ndarray) – the point where the pdf is calculated.
Returns:	The value of the pdf at the specified point.

mle(theta, weights=None)[source]¶

Compute the (weighted) maximum likelihood estimate of the points, and update the distribution accordingly.

Parameters:	theta (np.ndarray) – a set of points, where every row is a sample;
	weights (np.ndarray, None) – a vector of weights. If specified, the weighted maximum likelihood estimate is computed instead of the plain maximum likelihood estimate. The number of elements of this vector must be equal to the number of rows of the theta matrix.

diff_log(theta)[source]¶

Compute the gradient of the logarithm of the probability density function at the specified point.

Parameters:	theta (np.ndarray) – the point where the gradient of the log pdf is calculated.
Returns:	The gradient of the log pdf at the specified point.

get_parameters()[source]¶

Getter.

Returns:	The current distribution parameters.

set_parameters(rho)[source]¶

Setter.

Parameters:	rho (np.ndarray) – the vector of the new parameters to be used by the distribution

parameters_size¶

Property.

Returns:	The size of the distribution parameters.

diff(theta)¶

Compute the gradient of the probability density function, with respect to the distribution parameters, at the specified point. It is normally computed from the gradient of the logarithm of the probability density function by exploiting the likelihood ratio trick, i.e.:

\[\nabla_{\rho}p(\theta)=p(\theta)\nabla_{\rho}\log p(\theta)\]

Parameters:	theta (np.ndarray) – the point where the gradient of the pdf is calculated.
Returns:	The gradient of the pdf at the specified point.

class mushroom_rl.distributions.gaussian.GaussianCholeskyDistribution(mu, sigma)[source]¶

Bases: mushroom_rl.distributions.distribution.Distribution

Gaussian distribution with full covariance matrix. The parameter vector represents the mean and the Cholesky decomposition of the covariance matrix. This parametrization enforces the covariance matrix to be positive definite.
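The sketch below illustrates why a Cholesky parametrization keeps the covariance valid under unconstrained updates. It assumes the parameter vector stores the mean followed by the lower-triangular entries of the Cholesky factor in row-major order; that layout and the helper names `pack` and `unpack` are assumptions for this example, not the library's API.

```python
import numpy as np


def pack(mu, sigma):
    """Build rho = [mu, lower-triangular entries of chol(sigma)]."""
    chol = np.linalg.cholesky(sigma)
    return np.concatenate([mu, chol[np.tril_indices(len(mu))]])


def unpack(rho, n_dims):
    """Recover the mean and the lower-triangular Cholesky factor from rho."""
    rho = np.asarray(rho, dtype=float)
    mu = rho[:n_dims]
    chol = np.zeros((n_dims, n_dims))
    chol[np.tril_indices(n_dims)] = rho[n_dims:]
    return mu, chol


mu = np.array([1.0, 2.0])
sigma = np.array([[4.0, 2.0], [2.0, 3.0]])
rho = pack(mu, sigma)  # 2 mean entries + 3 triangular entries

# An arbitrary step on the parameters, e.g. from a gradient update.
rho_new = rho + np.array([0.0, 0.0, 0.5, -1.0, 0.25])
_, chol_new = unpack(rho_new, n_dims=2)

# L @ L.T is symmetric positive semi-definite for any L, and positive
# definite as long as the diagonal entries of L are nonzero.
sigma_new = chol_new @ chol_new.T
```

Updating the entries of `sigma` directly could produce a matrix that is not a valid covariance, whereas any update of the triangular factor with a nonzero diagonal yields one.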

__init__(mu, sigma)[source]¶: Initialize self. See help(type(self)) for accurate signature.

sample()[source]¶

Draw a sample from the distribution.

Returns:	A random vector sampled from the distribution.

log_pdf(theta)[source]¶

Compute the logarithm of the probability density function at the specified point.

Parameters:	theta (np.ndarray) – the point where the log pdf is calculated.
Returns:	The value of the log pdf at the specified point.

__call__(theta)[source]¶

Compute the probability density function at the specified point.

Parameters:	theta (np.ndarray) – the point where the pdf is calculated.
Returns:	The value of the pdf at the specified point.

mle(theta, weights=None)[source]¶

Compute the (weighted) maximum likelihood estimate of the points, and update the distribution accordingly.

Parameters:	theta (np.ndarray) – a set of points, where every row is a sample;
	weights (np.ndarray, None) – a vector of weights. If specified, the weighted maximum likelihood estimate is computed instead of the plain maximum likelihood estimate. The number of elements of this vector must be equal to the number of rows of the theta matrix.

diff_log(theta)[source]¶

Compute the gradient of the logarithm of the probability density function at the specified point.

Parameters:	theta (np.ndarray) – the point where the gradient of the log pdf is calculated.
Returns:	The gradient of the log pdf at the specified point.

get_parameters()[source]¶

Getter.

Returns:	The current distribution parameters.

set_parameters(rho)[source]¶

Setter.

Parameters:	rho (np.ndarray) – the vector of the new parameters to be used by the distribution

parameters_size¶

Property.

Returns:	The size of the distribution parameters.

diff(theta)¶

Compute the gradient of the probability density function, with respect to the distribution parameters, at the specified point. It is normally computed from the gradient of the logarithm of the probability density function by exploiting the likelihood ratio trick, i.e.:

\[\nabla_{\rho}p(\theta)=p(\theta)\nabla_{\rho}\log p(\theta)\]

Parameters:	theta (np.ndarray) – the point where the gradient of the pdf is calculated.
Returns:	The gradient of the pdf at the specified point.