Supported distributions

The lhs package allows distributions to be defined in several ways. Each of the following examples defines the uniform distribution on [8, 10]:

  1. Identifying the distribution by name (see below for supported distributions):

    dist_a = {'name': 'uniform', 'args': {'loc': 8, 'scale': 2}}
    
  2. Providing a frozen scipy.stats distribution:

    import scipy.stats
    dist_b = {'distribution': scipy.stats.uniform(loc=8, scale=2)}
    
  3. Providing the distribution’s percent point function (inverse of the CDF):

    dist_c = {'ppf': lambda p: 8 + 2 * p}
    # Alternatively, via the scipy.stats distribution.
    dist_d = {'ppf': scipy.stats.uniform(loc=8, scale=2).ppf}
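
As a quick sanity check that the second and third forms describe the same distribution, the sketch below compares the hand-written percent point function with the one obtained from scipy.stats. It uses only NumPy and SciPy and does not call lhs; the probabilities in `p` are arbitrary illustrative values.

```python
import numpy as np
import scipy.stats

# Arbitrary probabilities at which to evaluate the percent point functions.
p = np.array([0.0, 0.25, 0.5, 0.75, 1.0])

# Hand-written PPF for the uniform distribution on [8, 10].
manual_ppf = lambda q: 8 + 2 * q
# The same distribution, as a frozen scipy.stats distribution.
frozen = scipy.stats.uniform(loc=8, scale=2)

manual_values = manual_ppf(p)
scipy_values = frozen.ppf(p)

# Both approaches should map [0, 1] onto [8, 10] in the same way.
assert np.allclose(manual_values, scipy_values)
```

Note that scipy.stats.uniform is parameterised by loc (the lower bound) and scale (the width of the interval), rather than by the lower and upper bounds.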
    

Distributions provided by lhs.dist

The lhs.dist module defines the following distributions:

lhs.dist.constant(samples, value)

The constant distribution, which always returns value.

Examples

dist_R0 = {'name': 'constant', 'args': {'value': 2.53}}

lhs.dist.inverse_uniform(samples, **kwargs)

The continuous inverse-uniform distribution, where \(X \sim \left[ \mathcal{U}(a, b) \right]^{-1}\).

The lower and upper bounds may be defined in terms of the uniform distribution parameters (low and high), or in terms of their reciprocals (inv_low and inv_high). Any combination of these parameters may be used; see the examples below.

Examples

All four combinations of the uniform/reciprocal parameters produce identical results:

>>> from lhs.dist import inverse_uniform
>>> from lhs.sample import lhs_values
>>> # Define the sample locations.
>>> samples = [0, 0.5, 1]
>>> param_name = 'alpha'
>>> param_dist = {'name': 'inverse_uniform'}
>>> # Define the distribution in terms of 'low' and 'high'.
>>> param_dist['args'] = {'low': 5, 'high': 10}
>>> lhs_values(param_name, param_dist, samples)
array([0.1       , 0.13333333, 0.2       ])
>>> # Define the distribution in terms of 'inv_low' and 'high'.
>>> param_dist['args'] = {'inv_low': 0.2, 'high': 10}
>>> lhs_values(param_name, param_dist, samples)
array([0.1       , 0.13333333, 0.2       ])
>>> # Define the distribution in terms of 'low' and 'inv_high'.
>>> param_dist['args'] = {'low': 5, 'inv_high': 0.1}
>>> lhs_values(param_name, param_dist, samples)
array([0.1       , 0.13333333, 0.2       ])
>>> # Define the distribution in terms of 'inv_low' and 'inv_high'.
>>> param_dist['args'] = {'inv_low': 0.2, 'inv_high': 0.1}
>>> lhs_values(param_name, param_dist, samples)
array([0.1       , 0.13333333, 0.2       ])

Samples are drawn between the values of the lower and upper bounds, even when a lower bound is greater than the corresponding upper bound:

>>> import numpy as np
>>> from lhs.dist import inverse_uniform
>>> inverse_uniform(0.5, low=4, high=[6, 4, 2])
array([0.2       , 0.25      , 0.33333333])
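
The values shown in the examples above are consistent with the quantile function of \(X = 1/U\), where \(U \sim \mathcal{U}(a, b)\). The following is a minimal sketch of that calculation using only NumPy. The helper `inverse_uniform_ppf` is hypothetical and is not part of lhs; it is an illustration under the assumption that low and high are positive, and it is not a description of how lhs.dist.inverse_uniform is implemented.

```python
import numpy as np


def inverse_uniform_ppf(p, low, high):
    """Hypothetical helper: quantiles of X = 1/U, where U ~ Uniform(low, high)."""
    p = np.asarray(p, dtype=float)
    low = np.asarray(low, dtype=float)
    high = np.asarray(high, dtype=float)
    # X = 1/U decreases as U increases, so the p-th quantile of X is the
    # reciprocal of the (1 - p)-th quantile of U.
    u = low + (1 - p) * (high - low)
    return 1 / u


# Reproduce the sample locations and bounds used in the examples above.
values = inverse_uniform_ppf([0, 0.5, 1], low=5, high=10)
medians = inverse_uniform_ppf(0.5, low=4, high=[6, 4, 2])
```

With these inputs, `values` and `medians` should agree with the arrays printed in the two examples above, up to rounding.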

Distributions provided by scipy.stats

All of the scipy.stats distributions can be identified by name, as well as by passing a distribution or its percent point function (inverse of the CDF).

Note

Check the documentation for each scipy.stats distribution to identify the relevant distribution parameters and how their values are interpreted.

Consider the Beta distribution provided by scipy.stats.beta, which accepts the following parameters:

  • Shape parameters a and b, also known as \(\alpha\) and \(\beta\);

  • Shift parameter loc, which defines the lower bound (default: 0); and

  • Scale parameter scale, which defines the width of the interval, max - loc (default: 1).

We can define a Beta prior for the parameter a with shape parameters \(\alpha = 2\) and \(\beta = 5\), over the interval \([10, 15]\), with the following:

# Define the distribution for "a".
a_dist = {'name': 'beta', 'args': {'a': 2, 'b': 5, 'loc': 10, 'scale': 5}}
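
To confirm how loc and scale position this distribution, the sketch below builds the equivalent frozen scipy.stats distribution and inspects its bounds and mean. It uses only SciPy and does not call lhs; the variable names are illustrative.

```python
import scipy.stats

# Frozen Beta distribution with the same arguments as a_dist['args'].
a_frozen = scipy.stats.beta(a=2, b=5, loc=10, scale=5)

# The PPF at 0 and 1 gives the lower and upper bounds of the support.
lower, upper = a_frozen.ppf([0, 1])

# The mean of a shifted and scaled Beta is loc + scale * a / (a + b).
mean = a_frozen.mean()
expected_mean = 10 + 5 * 2 / (2 + 5)
```

Here `lower` and `upper` should be 10 and 15, matching the intended interval, and `mean` should agree with `expected_mean`.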

Consider the discrete distribution scipy.stats.randint, which returns integers in the range [low, high - 1] with uniform probability. We can define a prior for the parameter b that takes values between 1 and 10 with the following:

# Define the distribution for "b".
b_dist = {'name': 'randint', 'args': {'low': 1, 'high': 11}}
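
Because high is an exclusive upper bound, it is easy to be off by one. The following sketch checks the probability mass function of the equivalent frozen scipy.stats distribution directly. It uses only NumPy and SciPy and does not call lhs; the variable names are illustrative.

```python
import numpy as np
import scipy.stats

# Frozen discrete uniform distribution with the same arguments as b_dist['args'].
b_frozen = scipy.stats.randint(low=1, high=11)

# Evaluate the probability mass at the integers 0 to 11 (inclusive).
k = np.arange(0, 12)
pmf = b_frozen.pmf(k)

# The integers 1 to 10 each have probability 1/10; 0 and 11 have none.
assert np.allclose(pmf[1:11], 0.1)
assert pmf[0] == 0 and pmf[11] == 0
```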

We can then sample values from this distribution with lhs.sample.lhs_values() or lhs.dist.sample_from():

>>> # Define the distribution for "b".
>>> b_dist = {'name': 'randint', 'args': {'low': 1, 'high': 11}}
>>> # Define the sample locations.
>>> import numpy as np
>>> samples = np.linspace(0.05, 0.95, 10)
>>> # Obtain values with lhs.sample.lhs_values().
>>> import lhs
>>> values_1 = lhs.sample.lhs_values('b', b_dist, samples)
>>> # Obtain values with lhs.dist.sample_from().
>>> values_2 = lhs.dist.sample_from(samples, 'randint', b_dist['args'])
>>> # These two arrays should contain identical values.
>>> assert np.allclose(values_1, values_2)
>>> # Both arrays should contain the integers 1 to 10 (inclusive).
>>> assert np.allclose(values_1, np.arange(1, 11))