Latin hypercube sampler

Welcome to the lhs documentation. This package implements Latin hypercube sampling in order to draw near-random samples of parameter values from multi-dimensional distributions. This package is primarily intended for scenario modelling.

License

The code is distributed under the terms of the BSD 3-Clause license (see LICENSE), and the documentation is distributed under the terms of the Creative Commons BY-SA 4.0 license.

A simple example

>>> import lhs
>>> import numpy as np
>>> import scipy.stats
>>> # Define normal distributions for alpha, beta, and gamma.
>>> alpha = {'name': 'norm', 'args': {'loc': 5, 'scale': 1}}
>>> beta = {'name': 'norm', 'args': {'loc': 10, 'scale': 2}}
>>> gamma = {'distribution': scipy.stats.norm(loc=0, scale=0.1)}
>>> # Collect the distributions in a dictionary.
>>> distributions = {'alpha': alpha, 'beta': beta, 'gamma': gamma}
>>> # Define the number of samples to draw.
>>> num_samples = 100
>>> # Create a pseudo-random number generator.
>>> seed = 12345
>>> rand = np.random.default_rng(seed=seed)
>>> # Draw samples for all three parameters.
>>> # This returns a dictionary of 1-D NumPy arrays.
>>> samples = lhs.draw(rand, num_samples, distributions)
>>> # Print the first sampled value for alpha.
>>> print(samples['alpha'][0])
4.538220677728063
>>> # Collect the samples for each parameter into a 2-D array.
>>> table = np.column_stack([values for values in samples.values()])
>>> # Print the first five samples.
>>> print(table[:5])
[[ 4.53822068e+00  1.14024721e+01  6.26798628e-02]
 [ 3.48555342e+00  1.15887999e+01 -8.20221380e-02]
 [ 6.91828587e+00  1.34933849e+01  3.75658088e-04]
 [ 6.02953073e+00  1.44184693e+01 -1.67071425e-01]
 [ 2.08898887e+00  9.06176782e+00  2.83558279e-02]]
>>> # Save the samples table to a text file.
>>> # np.savetxt('samples.out', table)

A more detailed example

  1. Define a distribution for each independent parameter (in this example, R0 and eta):

    def independent_params():
        return {
            # Basic reproduction number.
            'R0': {
                'name': 'constant',
                'args': {'value': 2.53},
                'shape': 1,
            },
            # Proportion of infections that require hospitalisation ("severe").
            # See doi:10.1371/journal.pone.0014505 for details.
            'eta': {
                'distribution': scipy.stats.loguniform(a=0.05, b=0.1),
                'shape': 1,
            },
        }
    

    Note

    See Supported distributions for details about defining distributions.

  2. Define the name and size of each dependent parameter (if any; in this example, alpha_m):

    def dependent_params():
        return {
            # Proportion of non-severe infections that seek clinical care.
            # See doi:10.1371/journal.pone.0014505 for details.
            'alpha_m': {'shape': 1},
        }
    
  3. Define the distribution for each dependent parameter (if any), given the sampled values for every independent parameter, in a separate function:

    def dependent_dists(indep_values, dep_params):
        # Scale eta from [0.001, 0.1] to [0.2, 1.0]
        eta = indep_values['eta']
        scale = (100 * eta - 0.1) * 0.8 / 9.9 + 0.2
        # Calculate the mean and bounds for alpha_m.
        mean_val = scale * 0.5
        min_val = scale * 0.25
        max_val = scale * 0.75
        # Calculate the mean and variance for alpha_m.
        varn = scale * 0.1
        span = max_val - min_val
        scaled_mean = (mean_val - min_val) / span
        scaled_var = varn / span
        # Calculate the shape parameters for alpha_m.
        a = np.square(scaled_mean / scaled_var) * (1 - scaled_mean) - scaled_mean
        b = a * (1 - scaled_mean) / scaled_mean
    
        dist = scipy.stats.beta(a=a, b=b, loc=min_val, scale=span)
    
        return {
            'alpha_m': {'distribution': dist},
        }
    
  4. Draw samples for all of the model parameters:

    def draw_samples(seed, num_samples):
        rng = np.random.default_rng(seed=seed)
        samples = lhs.draw(rng, num_samples, independent_params(),
                           dependent_params(), dependent_dists)
        return samples
    

    Note

    If there are no dependent parameters, you can omit the dependent parameter arguments and simply call:

    samples = lhs.draw(rng, num_samples, independent_params())
    

    See Drawing samples for further details about drawing samples.

  5. You can then use these samples to, e.g., define initial model parameters.

    def test_draw_samples():
        samples = draw_samples(seed=12345, num_samples=1000)
        # Ensure that R0 is constant.
        assert(all(samples['R0'] == 2.53))
        # Ensure that all eta values lie within the interval bounds.
        assert(all(samples['eta'] > 0.05))
        assert(all(samples['eta'] < 0.10))
        # Ensure that between 5% and 75% of infections lead to non-severe clinical
        # presentations.
        assert(all(samples['alpha_m'] > 0.05))
        assert(all(samples['alpha_m'] < 0.75))