lhs.design: Functions for accessing latin hypercube sampling designs from package lhs or space-filling designs from package DiceDesign

Description

Functions for comfortably accessing latin hypercube sampling designs from package lhs or space-filling designs from package DiceDesign, which are useful for quantitative factors with many possible levels. In particular, they can be used in computer experiments. Most of the designs are random samples.

Usage

lhs.design(nruns, nfactors, type="optimum", factor.names=NULL, seed=NULL, digits=NULL, 
         nlevels = nruns, default.levels = c(0, 1), randomize = FALSE, ...)
lhs.augment(lhs, m=1, type="optAugment", seed=NULL, ...)

Value

Both functions return a data frame of S3 class design with attributes attached. The data frame contains the experimental settings, recoded to the scale ends defined in factor.names (if given) and rounded to the number of digits given in digits (if given). In the matrix attached as attribute desnum, the columns for the experimental factors contain the design in the unit cube (all experimental factors ranging from 0 to 1) as returned by packages lhs or DiceDesign.
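The relation between the unit cube and the returned data frame can be illustrated in base R. This is a minimal sketch of the arithmetic only, assuming a plain linear map from [0, 1] to each pair of scale ends followed by rounding. It is not the package's internal code, and the names unit, ends and scaled are made up for the illustration.

```r
set.seed(1)

# Hypothetical stand-in for a unit-cube matrix (5 runs, 2 factors)
unit <- matrix(runif(10), nrow = 5,
               dimnames = list(NULL, c("torque", "friction")))

# Scale ends in the style of the factor.names argument
ends <- list(torque = c(10, 14), friction = c(25, 35))
digits <- 2

# Map each column from [0, 1] to its scale ends, then round on the data scale
scaled <- as.data.frame(unit)
for (nm in names(ends)) {
  lo <- ends[[nm]][1]
  hi <- ends[[nm]][2]
  scaled[[nm]] <- round(lo + unit[, nm] * (hi - lo), digits)
}

scaled   # rescaled and rounded settings
unit     # the unit-cube values themselves stay unrounded
```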

Function lhs.augment preserves additional variables (e.g. responses) that have been added to the design lhs before augmenting. Note, however, that the response data are NOT used in deciding about which points to augment the design with.

The attribute run.order is not very useful for most of these designs, as there is no standard order. It therefore is present for formal reasons only and contains three identical columns of 1,2,...,nruns. For designs created with type=fact or type=faure, the standard order is the order in which package DiceDesign creates the design, and the actual run order may be different in case of randomization.

In case of lhs.augment, if the design to be augmented had been reordered before, the augmented design preserves this reorder and also the respective numbering of the design.

The attribute design.info is a list of various design properties, with type resolving to “lhs”. In addition to the standard list elements (cf. design), the subtype element indicates the type of latin hypercube design and possibly additional augmentations, the element quantitative is a vector of nfactors logical TRUEs, and the digits element indicates the digits to which the data were rounded.

For designs created with package DiceDesign, special list elements from this package are also added to design.info.

randomize is always TRUE for designs generated by random sampling, but may be FALSE for designs created with type=fact or type=faure.

coding provides formulae for making the designs comfortably usable with standard second order methodology implemented in package rsm.

replications is always 1 and repeat.only is always FALSE; these elements are only present to fulfill the formal requirements for class design.
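The attributes described above can be inspected with base R's attr(). The following is a hedged sketch: it runs only if package DoE.wrapper is installed, the factor names temp, press and speed are invented for the illustration, and the element names passed to str() are taken from the description above rather than checked against a particular package version.

```r
if (requireNamespace("DoE.wrapper", quietly = TRUE)) {
  library(DoE.wrapper)

  # Small random latin hypercube sample with illustrative factor names
  plan <- lhs.design(10, 3, type = "random", seed = 123,
                     factor.names = list(temp = c(20, 80),
                                         press = c(1, 5),
                                         speed = c(100, 300)),
                     digits = 1)

  unit <- attr(plan, "desnum")        # design in the unit cube
  info <- attr(plan, "design.info")   # list of design properties

  # Elements named in the text; any that are absent simply show up as NULL
  str(info[c("type", "subtype", "quantitative", "digits", "randomize")])

  head(attr(plan, "run.order"))       # present for formal reasons only
}
```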

Arguments

nruns: number of runs in the latin hypercube sample;
for type fact (a full factorial with equally spaced levels), if nlevels is not separately specified, this number is taken to be the common number of levels of all factors, i.e. the resulting design will have nruns^nfactors runs;
alternatively, if nlevels is separately specified as a vector of different numbers of levels, nruns can be missing or can be the correctly-specified number of runs.
nfactors: number of factors in the latin hypercube sample
type: character string indicating the type of design or augmentation method; defaults are “optimum” for lhs.design and “optAugment” for lhs.augment.

Function lhs.design calls
a function named typeLHS from package lhs (types genetic, improved, maximin, optimum, random),
a function named typeDesign from package DiceDesign (types dmax, strauss, fact)
or function runif.faure from package DiceDesign (type faure).

Function lhs.augment calls function typeLHS from package lhs,
where possible choices for type are augment, optSeeded, or optAugment.
For details, see the respective functions from packages lhs and DiceDesign.
seed: seed for random number generation; latin hypercube samples from package lhs are random samples. In early versions of package lhs, specifying a seed made the result reproducible. In more recent versions, results are reproducible within a package version, but reproducibility between package versions cannot be guaranteed.
factor.names: list of scale end values for each factor; names are used as variable names;
the names should not be x1, x2, ..., as this would interfere with usability of standard second order analysis methods on the resulting data (cf. rsmformula);
if the list is not named, the variable names are X1, X2 and so forth; the original unit cube calculated by package lhs (scale ends 0 and 1 for each variable) is rescaled to the values given in factor.names.
digits: digits to which the design columns are rounded; one single value (the same for all factors) or a vector of length nfactors;
note that the rounding is applied after generation of the design on the actual data scale, i.e. the unit cube generated by the functions from packages lhs or DiceDesign is NOT rounded
nlevels: used for type fact only; integer number or numeric vector of nfactors integers; specifies the number of levels for each factor. If all factors have the same number of levels, the number of levels can also be specified through nruns, which is interpreted as the number of levels for type fact if nlevels is not separately specified.
default.levels: scale ends for all factors; convenient, if all factors have the same scaling that deviates from the default 0/1 scale ends.
randomize: logical; the default FALSE prevents randomization. The option has an effect for types fact and faure only; all other types are based on random design generation anyway. Preventing randomization is the default here because these designs are assumed to be used mostly for computer experimentation, where the systematics of the non-randomized design may be beneficial. For hardware experimentation, randomize should be set to TRUE!

If randomization is requested, the following information is relevant:
In R version 3.6.0 and later, the default behavior of function sample has changed. If you work in a new (i.e., >= 3.6.0) R version and want to run code interchangeably on R 3.6.0 and an earlier R version, you have to change the RNGkind setting in the later R version by
RNGkind(sample.kind="Rounding")
before running function lhs.design.
It is recommended to change the setting back to the new recommended way afterwards:
RNGkind(sample.kind="default")
For an example, see the documentation of the example data set VSGFS.
lhs: design generated by function lhs.design (class design, of type lhs)
m: integer number of additional points to add to design lhs (note, however, that optSeeded does not necessarily preserve all original runs!)
...: additional arguments to the functions from packages lhs or DiceDesign. Refer to their documentation.
Functions for generating lhs designs: randomLHS, geneticLHS, improvedLHS, maximinLHS, optimumLHS, dmaxDesign, straussDesign, runif.faure, factDesign;
functions for augmenting lhs designs: augmentLHS, optSeededLHS, optAugmentLHS.
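For type fact, the run count follows directly from the numbers of levels, which is why nruns may be omitted when nlevels is given. The base R sketch below only illustrates this counting with expand.grid. It assumes levels equally spaced on [0, 1] including both scale ends, which may differ in detail from what package DiceDesign produces, and the names level_sets and grid are made up for the illustration.

```r
# Different numbers of levels per factor
nlevels <- c(5, 6, 5, 7)

# Assumption: equally spaced levels on the unit interval, ends included
level_sets <- lapply(nlevels, function(k) seq(0, 1, length.out = k))
names(level_sets) <- paste0("X", seq_along(nlevels))

# Full factorial: every combination of the level sets
grid <- expand.grid(level_sets)
nrow(grid) == prod(nlevels)   # the run count is the product of the level counts

# Common number of levels: nruns = 7 and nfactors = 4 imply 7^4 runs
common <- 7^4
common
```

Checking prod(nlevels) before building a design is a cheap way to avoid accidentally requesting a factorial that is too large for memory.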

Warning

Since R version 3.6.0, the behavior of function sample has changed (correction of a biased previous behavior that should not be relevant for the randomization of designs). For using code that randomizes a design interchangeably between a new R version (3.6.0 or later) and an older one, please follow the steps described with the argument randomize.
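The switch of the sampling method can be written so that the previous setting is restored afterwards. This is a minimal base R sketch, assuming R 3.6.0 or later (where RNGkind() reports three settings). The call to sample() merely stands in for a randomizing call such as lhs.design with randomize=TRUE, and the variable names are made up for the illustration.

```r
# Remember the current settings: c(kind, normal.kind, sample.kind)
old_kind <- RNGkind()

# Switch to the pre-3.6.0 sampling behavior (R warns that it is non-uniform)
suppressWarnings(RNGkind(sample.kind = "Rounding"))
set.seed(4711)
perm_rounding <- sample(8)   # stand-in for a randomizing design call

# Restore the previous sampling method
RNGkind(sample.kind = old_kind[3])
set.seed(4711)
perm_current <- sample(8)

perm_rounding
perm_current
```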

Note also: Package lhs does not promise to keep designs reproducible between package versions. Thus, please make sure to store important designs for the future, if needed (of course, this is always wise anyway!).

Author

Ulrike Groemping

Details

Function lhs.design creates a latin hypercube sample, function lhs.augment augments an existing latin hypercube sample (or in case of type optSeeded takes the existing sample as the starting point but potentially modifies it). In comparison to direct usage of package lhs, the functions add the possibility of recoding lhs samples to a desired range, and they embed the lhs designs into class design.
Range coding is based on the recoding facility from package rsm and the factor.names parameter used analogously to packages DoE.base and FrF2.

The lhs designs are useful for quantitative factors, if it is considered desirable to uniformly distribute design points over a hyperrectangular space. This is e.g. considered interesting for computer experiments, where replications of the same settings are often useless.
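The defining property of a latin hypercube sample, namely that each factor has exactly one point in each of nruns equally wide strata, can be illustrated in base R. This is a sketch of the general idea only, not the algorithm used by package lhs, and the names lhs_sketch and strata are made up for the illustration.

```r
set.seed(42)
n <- 6   # number of runs
k <- 3   # number of factors

# For each factor: permute the strata 1..n, then place one point
# uniformly at random inside each stratum ((i - 1)/n, i/n)
lhs_sketch <- sapply(seq_len(k), function(j) (sample(n) - runif(n)) / n)

# Stratum index of every point, per factor
strata <- ceiling(lhs_sketch * n)

# Each column should hit every stratum exactly once
apply(strata, 2, function(s) all(sort(s) == seq_len(n)))
```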

Supported design types are described in the documentation for packages lhs and DiceDesign.

References

Beachkofski, B., Grandhi, R. (2002) Improved Distributed Hypercube Sampling. American Institute of Aeronautics and Astronautics Paper 1274.

Currin C., Mitchell T., Morris M. and Ylvisaker D. (1991) Bayesian Prediction of Deterministic Functions With Applications to the Design and Analysis of Computer Experiments, Journal of the American Statistical Association 86, 953--963.

Santner T.J., Williams B.J. and Notz W.I. (2003) The Design and Analysis of Computer Experiments, Springer, 121--161.

Shewry, M. C. and Wynn, H. P. (1987) Maximum entropy sampling. Journal of Applied Statistics 14, 165--170.

Fang K.-T., Li R. and Sudjianto A. (2006) Design and Modeling for Computer Experiments, Chapman & Hall.

Stein, M. (1987) Large Sample Properties of Simulations Using Latin Hypercube Sampling. Technometrics 29, 143--151.

Stocki, R. (2005) A method to improve design reliability using optimal Latin hypercube sampling. Computer Assisted Mechanics and Engineering Sciences 12, 87--105.

Examples

   ## maximin design from package lhs
   plan <- lhs.design(20,7,"maximin",digits=2) 
   plan
   plot(plan)
   cor(plan)
   y <- rnorm(20)
   r.plan <- add.response(plan, y)
   
   ## augmenting the design with 10 additional points, default method
   plan2 <- lhs.augment(plan, m=10)
   plot(plan2)
   cor(plan2)
   
   ## purely random design (usually not ideal)
   plan3 <- lhs.design(20,4,"random",
          factor.names=list(c(15,25), c(10,90), c(0,120), c(12,24)), digits=2)
   plot(plan3)
   cor(plan3)
   
   ## optimum design from package lhs (default)
   plan4 <- lhs.design(20,4,"optimum",
        factor.names=list(torque=c(10,14),friction=c(25,35),
              temperature=c(-5,35),pressure=c(20,50)),digits=2)
   plot(plan4)
   cor(plan4)
   
   ## dmax design from package DiceDesign
   ## arguments range and niter_max are required
   ## ?dmaxDesign for more info
   plan5 <- lhs.design(20,4,"dmax",
        factor.names=list(torque=c(10,14),friction=c(25,35),
              temperature=c(-5,35),pressure=c(20,50)),digits=2,
              range=0.2, niter_max=500)
   plot(plan5)
   cor(plan5)
   
   ## Strauss design from package DiceDesign
   ## argument RND is required
   ## ?straussDesign for more info
   plan6 <- lhs.design(20,4,"strauss",
        factor.names=list(torque=c(10,14),friction=c(25,35),
              temperature=c(-5,35),pressure=c(20,50)),digits=2,
              RND = 0.2)
   plot(plan6)
   cor(plan6)
   
   ## full factorial design from package DiceDesign
   ## mini try-out version
   plan7 <- lhs.design(3,4,"fact",
        factor.names=list(torque=c(10,14),friction=c(25,35),
              temperature=c(-5,35),pressure=c(20,50)),digits=2)
   plot(plan7)
   cor(plan7)
   
   if (FALSE) {
   
   ## full factorial design from package DiceDesign
   ## not as many different levels as runs, but only a fixed set of levels
   ##    caution: too many levels can easily bring down the computer
   ##    above design with 7 distinct levels for each factor, 
   ##    implying 2401 runs 
   plan7 <- lhs.design(7,4,"fact",
        factor.names=list(torque=c(10,14),friction=c(25,35),
              temperature=c(-5,35),pressure=c(20,50)),digits=2)
   plot(plan7)
   cor(plan7)
   
   ## equivalent call
   plan7 <- lhs.design(,4,"fact",nlevels=7,
        factor.names=list(torque=c(10,14),friction=c(25,35),
              temperature=c(-5,35),pressure=c(20,50)),digits=2)
   
   ## different number of levels for each factor
   plan8 <- lhs.design(,4,"fact",nlevels=c(5,6,5,7),
        factor.names=list(torque=c(10,14),friction=c(25,35),
              temperature=c(-5,35),pressure=c(20,50)),digits=2)
   plot(plan8)
   cor(plan8)

   ## equivalent call (specifying nruns, not necessary but a good check)
   plan8 <- lhs.design(1050,4,"fact",nlevels=c(5,6,5,7),
        factor.names=list(torque=c(10,14),friction=c(25,35),
              temperature=c(-5,35),pressure=c(20,50)),digits=2)
   }
