Drawing samples

lhs.draw(rng, n, params, dep_params=None, dep_fn=None, values=None)

Return samples from the provided parameter distributions.

Parameters:

rng – A random number generator.
n – The number of subsets to sample (one sample is drawn from each subset).
params – The independent parameter distributions.
dep_params – The (optional) dependent parameter details.
dep_fn – The (optional) function that defines dependent parameter distributions, given the sample values for each independent parameter.
values – An (optional) table of parameter values that have already been sampled. This can be useful when some parameters are dependent on parameters whose sample values are read from, e.g., external data files.

Raises:

ValueError – if exactly one of dep_params and dep_fn is None (they must be provided together, or both omitted).
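As a rough illustration of what "dependent" means here, the sketch below draws an independent parameter X ~ U(0, 1) and then a parameter Y ~ U(0, X), whose upper bound depends on the value sampled for X. It uses only the standard library and does not call lhs; the helper name stratified is hypothetical, and the sketch makes no claim about the actual signature of dep_fn or the structure of dep_params.

```python
import random


def stratified(rng, n):
    """Hypothetical helper: one point from each of n equal-width
    subsets of [0, 1), returned in shuffled order."""
    points = [(i + rng.random()) / n for i in range(n)]
    rng.shuffle(points)
    return points


rng = random.Random(12345)
n = 10

# Independent parameter: X ~ U(0, 1).
x_values = stratified(rng, n)

# Dependent parameter: Y ~ U(0, X). The distribution of Y can only be
# defined once the sample values for X are known. For a uniform
# distribution the quantile function is loc + scale * u.
u_values = stratified(rng, n)
y_values = [0.0 + x * u for x, u in zip(x_values, u_values)]
```

Each value of Y lies between 0 and the corresponding value of X, which is the kind of relationship that cannot be expressed with independent distributions alone.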

Note

The order in which the distributions are defined matters. Samples are drawn for each distribution in turn. If you want to ensure that certain parameters have reproducible samples when drawing values for multiple simulations, the parameter ordering must be consistent. This means that additional parameters that are only defined for certain simulations should be defined after all of the common parameters. See the example below for a demonstration.

Examples:

>>> import lhs
>>> import numpy as np
>>> # Define X ~ U(0, 1).
>>> dist_x = {'x': {'name': 'uniform', 'args': {'loc': 0, 'scale': 1}}}
>>> # Define X ~ U(0, 1) and Y ~ U(0, 1).
>>> dist_xy = {
...     'x': {'name': 'uniform', 'args': {'loc': 0, 'scale': 1}},
...     'y': {'name': 'uniform', 'args': {'loc': 0, 'scale': 1}},
... }
>>> # Define Y ~ U(0, 1) and X ~ U(0, 1).
>>> dist_yx = {
...     'y': {'name': 'uniform', 'args': {'loc': 0, 'scale': 1}},
...     'x': {'name': 'uniform', 'args': {'loc': 0, 'scale': 1}},
... }
>>> n = 10
>>> # Draw samples for X.
>>> rand = np.random.default_rng(seed=12345)
>>> samples_x = lhs.draw(rand, n, dist_x)
>>> # Draw samples for X and Y; we should obtain identical samples for X.
>>> rand = np.random.default_rng(seed=12345)
>>> samples_xy = lhs.draw(rand, n, dist_xy)
>>> assert np.array_equal(samples_x['x'], samples_xy['x'])
>>> # Draw samples for Y and X; we should obtain different samples for X.
>>> rand = np.random.default_rng(seed=12345)
>>> samples_yx = lhs.draw(rand, n, dist_yx)
>>> assert not np.array_equal(samples_x['x'], samples_yx['x'])
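The reason ordering matters can be shown without lhs at all: every parameter consumes values from the same random number stream, so a parameter's samples depend on how many values were consumed before it. The following minimal sketch uses the standard library's random module as a stand-in for the NumPy generator; draw_in_order is a hypothetical helper, not part of lhs.

```python
import random


def draw_in_order(seed, names, n):
    """Draw n values for each name in turn from one seeded generator."""
    rng = random.Random(seed)
    return {name: [rng.random() for _ in range(n)] for name in names}


only_x = draw_in_order(12345, ['x'], 3)
x_then_y = draw_in_order(12345, ['x', 'y'], 3)
y_then_x = draw_in_order(12345, ['y', 'x'], 3)

# 'x' is drawn first in both cases, so it receives the same values.
assert only_x['x'] == x_then_y['x']
# Here 'y' is drawn first and receives the values 'x' had before.
assert only_x['x'] == y_then_x['y']
assert only_x['x'] != y_then_x['x']
```

This is why parameters that exist only in some simulations should be defined last: they then consume values only after the common parameters have been sampled.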

Support functions

lhs.sample.sample_subspace(rng, n, shape=None, broadcast=None)

Return one sample from each of n equal subsets of the unit interval (or of each of many unit intervals).

Parameters:

rng – A random number generator.
n – The number of subsets to sample.
shape – The (optional) number of unit intervals to sample. This can be an integer, or a sequence of integers (list or tuple).
broadcast – An (optional) list of integers used to broadcast the samples to additional dimensions. Values greater than zero indicate a new dimension of that size; values less than or equal to zero indicate an existing dimension. See the broadcasting example below.

Examples:

>>> import lhs.sample
>>> import numpy as np
>>> # Draw 10 samples from the unit interval.
>>> rng = np.random.default_rng(seed=20201217)
>>> n = 10
>>> samples = lhs.sample.sample_subspace(rng, n)
>>> # Ensure there is one sample in each of [0, 0.1], [0.1, 0.2], etc.
>>> lower_bounds = np.linspace(0, 1 - 1/n, num=n)
>>> upper_bounds = np.linspace(1/n, 1, num=n)
>>> for (lower, upper) in zip(lower_bounds, upper_bounds):
...     in_interval = np.logical_and(samples >= lower, samples <= upper)
...     assert sum(in_interval) == 1
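The property checked above (exactly one sample per subset) is what distinguishes Latin hypercube sampling from plain random sampling. A minimal sketch of one way to obtain it, using only the standard library, is given below. It is not the implementation used by lhs, and the name sample_unit_interval is hypothetical.

```python
import random


def sample_unit_interval(rng, n):
    """Draw one point from each of n equal-width subsets of [0, 1)."""
    # The i-th subset is [i / n, (i + 1) / n); pick a random offset in it.
    points = [(i + rng.random()) / n for i in range(n)]
    # Shuffle so that the sample order carries no information.
    rng.shuffle(points)
    return points


rng = random.Random(20201217)
n = 10
samples = sample_unit_interval(rng, n)

# After sorting, the i-th sample must fall in the i-th subset.
for i, sample in enumerate(sorted(samples)):
    assert i / n <= sample < (i + 1) / n
```

Shuffling matters when several parameters are sampled: without it, the smallest value of one parameter would always be paired with the smallest value of every other parameter.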

Example of broadcasting samples to additional dimensions:

>>> import lhs
>>> import numpy as np
>>> alpha_dist = {
...     'name': 'beta',
...     'args': {'a': 1, 'b': 1},
...     'shape': 2,
...     # Broadcast from (10 x 2) to (10 x 3 x 2 x 4).
...     'broadcast': [3, 0, 4],
... }
>>> rng = np.random.default_rng(12345)
>>> num_samples = 10
>>> samples = lhs.draw(rng, num_samples, {'alpha': alpha_dist})
>>> assert samples['alpha'].shape == (10, 3, 2, 4)
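The shape arithmetic in this example can be traced with a small helper. The sketch below is an interpretation of the broadcast description, not code from lhs: it assumes that each non-positive entry consumes the next existing dimension in order, and that the number of non-positive entries matches the number of existing dimensions. The name broadcast_shape is hypothetical.

```python
def broadcast_shape(n, shape, broadcast):
    """Return the array shape after broadcasting n samples of `shape`."""
    # Accept an integer or a sequence of integers, as for `shape` above.
    dims = [shape] if isinstance(shape, int) else list(shape)
    existing = iter(dims)
    result = [n]  # The first dimension always indexes the n samples.
    for size in broadcast:
        if size > 0:
            # A positive entry adds a new dimension of that size.
            result.append(size)
        else:
            # A non-positive entry keeps the next existing dimension
            # (assumed to be consumed in order).
            result.append(next(existing))
    return tuple(result)


# The example above: (10 x 2) broadcast with [3, 0, 4].
assert broadcast_shape(10, 2, [3, 0, 4]) == (10, 3, 2, 4)
```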

lhs.sample.sample_subspaces(rng, n, param_shapes)

Return samples from n equal subsets of the unit interval (or for many unit intervals) for each parameter in param_shapes.

Parameters:

rng – A random number generator.
n – The number of subsets to sample.
param_shapes – A sequence of (name, shape) and/or (name, shape, broadcast) tuples.

lhs.sample.lhs_values(name, dist, samples)

Return the values obtained by evaluating the inverse CDF of dist at each sample in the unit interval.

Parameters:

name – The parameter name (used for error messages).
dist – The sampling distribution details.
samples – Samples from the unit interval.

Examples:

>>> import lhs.sample
>>> import numpy as np
>>> import scipy.stats
>>> # Define the sample locations.
>>> samples = np.array([0.0, 0.25, 0.50, 0.75, 1.0])
>>> # Identify the sampling distribution by name.
>>> dist_1 = {
...     'name': 'beta',
...     'args': {'a': 2, 'b': 5},
... }
>>> # Provide the sampling distribution object.
>>> dist_2 = {
...     'distribution': scipy.stats.beta(a=2, b=5),
... }
>>> # Provide the percent point function (inverse CDF) of the sampling distribution.
>>> dist_3 = {
...     'ppf': scipy.stats.beta(a=2, b=5).ppf,
... }
>>> # Ensure we obtain the same values from all three distributions.
>>> values_1 = lhs.sample.lhs_values('dist_1', dist_1, samples)
>>> values_2 = lhs.sample.lhs_values('dist_2', dist_2, samples)
>>> values_3 = lhs.sample.lhs_values('dist_3', dist_3, samples)
>>> assert np.allclose(values_1, values_2)
>>> assert np.allclose(values_2, values_3)
>>> assert np.allclose(values_1, values_3)
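The underlying technique is inverse transform sampling: a point u in the unit interval is mapped to the value whose cumulative probability is u. The sketch below shows this with the standard library's statistics.NormalDist rather than SciPy or lhs; the standard normal distribution and the use of subset midpoints are arbitrary choices made for illustration.

```python
from statistics import NormalDist

n = 5
# Midpoints of 5 equal subsets of the unit interval: 0.1, 0.3, ..., 0.9.
samples = [(i + 0.5) / n for i in range(n)]

dist = NormalDist(mu=0.0, sigma=1.0)

# Evaluate the inverse CDF (quantile function) at each sample.
values = [dist.inv_cdf(u) for u in samples]

# Applying the CDF recovers the original unit-interval samples.
recovered = [dist.cdf(v) for v in values]
assert all(abs(r - u) < 1e-9 for r, u in zip(recovered, samples))
```

Because the inverse CDF is monotonic, one sample per subset of the unit interval becomes one value per equal-probability region of the target distribution.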

lhs.dist.sample_from(samples, dist_name, dist_kwargs)

Sample from a distribution by evaluating the quantile function.

Parameters:

samples – The values for which to evaluate the quantile function.
dist_name – The name of the distribution to sample.
dist_kwargs – The (distribution-specific) parameters, such as loc and scale.

Returns:

The sample values as a numpy.ndarray that has the same shape as samples.

Raises:

ValueError – if the distribution dist_name is not defined.

Examples:

>>> import lhs.dist
>>> import numpy as np
>>> samples = np.array([0.1, 0.5, 0.9])
>>> kwargs = {'loc': 10, 'scale': 5}
>>> values = lhs.dist.sample_from(samples, 'uniform', kwargs)
>>> print(values)
[10.5 12.5 14.5]
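The output above can be checked by hand, since the quantile function of the uniform distribution is loc + scale * u. The sketch below reproduces it with a small name-to-quantile-function lookup that raises ValueError for unknown names. This is only an assumed design for illustration: the names QUANTILE_FUNCTIONS, uniform_ppf, exponential_ppf and sample_from_sketch are hypothetical, and the set of distributions that lhs.dist actually supports is not shown here.

```python
import math


def uniform_ppf(u, loc=0.0, scale=1.0):
    """Quantile function of U(loc, loc + scale)."""
    return loc + scale * u


def exponential_ppf(u, loc=0.0, scale=1.0):
    """Quantile function of a shifted exponential distribution."""
    return loc - scale * math.log1p(-u)


# Hypothetical lookup table from distribution names to quantile functions.
QUANTILE_FUNCTIONS = {
    'uniform': uniform_ppf,
    'exponential': exponential_ppf,
}


def sample_from_sketch(samples, dist_name, dist_kwargs):
    """Evaluate the named quantile function at each sample."""
    try:
        ppf = QUANTILE_FUNCTIONS[dist_name]
    except KeyError:
        raise ValueError(f'unknown distribution: {dist_name!r}') from None
    return [ppf(u, **dist_kwargs) for u in samples]


values = sample_from_sketch([0.1, 0.5, 0.9], 'uniform',
                            {'loc': 10, 'scale': 5})
expected = [10.5, 12.5, 14.5]
assert all(math.isclose(v, e) for v, e in zip(values, expected))
```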