Learn R Programming

KrigInv (version 1.4.2)

EGIparallel: Efficient Global Inversion: parallel version to get batchsize locations at each iteration

Description

Sequential sampling based on the optimization of a kriging-based criterion, with model update after each iteration. The criterias aim at identifying an excursion set or one/many level sets. At each iteration batchsize new locations are evaluated. Different criteria are available for selecting experiments. The pointwise criteria are "bichon", "ranjan", "tmse", "tsee" and are fast to compute. These criteria can be used only with batchsize = 1. In addition, integral criteria require numerical integration and can potentially deliver more than one new location per iteration. Available integral criteria are "imse", "timse", "sur", "jn", "vorob", "vorobCons", "vorobVol".

Usage

EGIparallel(T, model, method = NULL, method.param=NULL,
fun, iter, batchsize = 1,
lower, upper, new.noise.var = 0,
optimcontrol = NULL, kmcontrol = NULL, integcontrol = NULL, ...)

Value

A list with components:

par

The added observations ((iter*batchsize) * d matrix)

value

The value of fun at the added observations (size: iter*batchsize)

nsteps

The number of added observations (=iter*batchsize).

lastmodel

The current (last) kriging model of km class.

lastvalue

The value of the criterion at the last added batch of points.

allvalues

If an optimization on a discrete set of points is chosen, the value of the criterion at all these points, for the last iteration, for the last point of the batch.

If method="vorobCons" or method="vorobVol" the list also has components:

current.CE

Conservative estimate computed on lastmodel.

allCE_lvs

The conservative estimate levels computed at each iteration.

Arguments

T

Array containing one or several thresholds. The criteria which can be used with multiple thresholds are "tmse", "timse", "sur", "jn".

model

A Kriging model of km class.

method

Criterion used for choosing observations.

method.param

Optional method parameters. For methods

  • "ranjan", "bichon",
    "tmse" and "timse": the tolerance value (scalar). If not provided, default value is used (1 for ranjan and bichon, 0 for tmse and timse).

  • "vorob": a list containing penalization (scalar, default=1), type I penalization, and typeEx,(character, default=">") either ">" or "<" denoting the type of excursion.

  • "vorobCons" and "vorobVol": a list containing penalization (scalar, default =1), typeEx (character, default = ">"), consLevel (scalar, default=0.95), n_discrete_design (scalar, default=500*model@d), design (data.frame, default=as.data.frame(sobol (n = method.param$n_discrete_design, dim = model@d)) ), pred (result of predict.km on model at design) and consVorbLevel, the conservative estimate Vorob'ev quantile computed from pred. See also the arguments alpha, pred, design, type from the function conservativeEstimate, package anMC, for more details.

batchsize

Number of points to sample simultaneously. The sampling criterion will return batchsize points at each iteration. Some criteria can be used only with batchsize = 1 (see description).

new.noise.var

Optional scalar value of the noise variance of the new observations.

fun

Objective function.

iter

Number of iterations.

lower

Vector containing the lower bounds of the variables to be optimized over.

upper

Vector containing the upper bounds of the variables to be optimized over.

optimcontrol

Optional list of control parameters for the optimization of the sampling criterion. The field method defines which optimization method is used: it can be either "genoud" (default) for an optimisation using the genoud algorithm, or "discrete" for an optimisation over a specified discrete set. If the field method is set to "genoud", one can set some parameters of this algorithm: pop.size (default : 50*d), max.generations (default : 10*d),
wait.generations (2), BFGSburnin (2) and the mutations P1, P2, up to P9 (see genoud). Numbers into brackets are the default values. If the field method is set to "discrete", one can set the field optim.points: p * d matrix corresponding to the p points where the criterion will be evaluated. If nothing is specified, 100*d points are chosen randomly. Finally, one can control the field optim.option in order to decide how to optimize the sampling criterion. If optim.option is set to 2 (default), batchsize sequential optimizations in dimension d are performed to find the optimum. If optim.option is set to 1, only one optimization in dimension batchsize*d is performed. This option is only available with "genoud". This option might provide more global and accurate solutions, but is a lot more expensive.

kmcontrol

Optional list representing the control variables for the re-estimation of the kriging model once new points are sampled. The items are the same as in km.

integcontrol

Optional list specifying the procedure to build the integration points and weights. Many options are possible. A) If nothing is specified, 100*d points are chosen using the Sobol sequence.
B) One can directly set the field integration.points (a p * d matrix) for prespecified integration points. In this case these integration points and the corresponding vector integration.weights will be used for all the iterations of the algorithm. C) If the field integration.points is not set then the integration points are renewed at each iteration. In that case one can control the number of integration points n.points (default: 100*d) and a specific distribution distrib. Possible values for distrib are: "sobol", "MC", "timse", "imse", "sur" and "jn" (default: "sobol"). C.1) The choice "sobol" corresponds to integration points chosen with the Sobol sequence in dimension d (uniform weight). C.2) The choice "MC" corresponds to points chosen randomly, uniformly on the domain. C.3) The choices "timse", "imse", "sur" and "jn" correspond to importance sampling distributions (unequal weights). It is strongly recommended to use the importance sampling distribution corresponding to the chosen sampling criterion. When important sampling procedures are chosen, n.points points are chosen using importance sampling among a discrete set of n.candidates points (default: n.points*10) which are distributed according to a distribution
init.distrib (default: "sobol"). Possible values for init.distrib are the space filling distributions "sobol" and "MC" or an user defined distribution "spec". The "sobol" and "MC" choices correspond to quasi random and random points in the domain. If the "spec" value is chosen the user must fill in manually the field init.distrib.spec to specify himself a n.candidates * d matrix of points in dimension d.

...

Other arguments of the target function fun.

Author

Clement Chevalier (University of Neuchatel, Switzerland)

Victor Picheny (INRA, Toulouse, France)

David Ginsbourger (IDIAP Martigny and University of Bern, Switzerland)

Dario Azzimonti (IDSIA, Switzerland)

Details

The function used to build the integration points and weights (based on the options specified in integcontrol) is the function integration_design

References

Azzimonti, D., Ginsbourger, D., Chevalier, C., Bect, J., and Richet, Y. (2018). Adaptive design of experiments for conservative estimation of excursion sets. Under revision. Preprint at hal-01379642

Chevalier C., Bect J., Ginsbourger D., Vazquez E., Picheny V., Richet Y. (2014), Fast parallel kriging-based stepwise uncertainty reduction with application to the identification of an excursion set, Technometrics, vol. 56(4), pp 455-465

Picheny V., Ginsbourger D., Roustant O., Haftka R.T., (2010) Adaptive designs of experiments for accurate approximation of a target region, J. Mech. Des. vol. 132(7)

Chevalier C. (2013) Fast uncertainty reduction strategies relying on Gaussian process models Ph.D Thesis, University of Bern

See Also

EGI, max_sur_parallel

Examples

Run this code
#EGIparallel

set.seed(9)
N <- 20 #number of observations
T <- c(20,60) #thresholds
testfun <- branin
lower <- c(0,0)
upper <- c(1,1)

#a 20 points initial design
design <- data.frame( matrix(runif(2*N),ncol=2) )
response <- testfun(design)

#km object with matern3_2 covariance
#params estimated by ML from the observations
model <- km(formula=~., design = design,
            response = response,covtype="matern3_2")

optimcontrol <- list(method="genoud",pop.size=50)
integcontrol <- list(distrib="sur",n.points=50)
iter <- 1
batchsize <- 6

if (FALSE) {
  obj <- EGIparallel(T=T,model=model,method="sur",batchsize=batchsize,
                     fun=testfun,iter=iter,lower=lower,upper=upper,
                     optimcontrol=optimcontrol,integcontrol=integcontrol)

  par(mfrow=c(1,2))
  print_uncertainty_2d(model=model,T=T,main="probability of excursion",
                       type="pn",new.points=0,cex.points=2)

  print_uncertainty_2d(model=obj$lastmodel,T=T,
                       main="probability of excursion, parallel sur sampling",
                       type="pn",new.points=iter*batchsize,col.points.end="red",cex.points=2)
}

##############
#same example with noisy initial observations and noisy new observations
branin.noise <- function(x) return(branin(x)+rnorm(n=1,sd=30))

set.seed(9)
N <- 20;T <- c(20,60)
testfun <- branin.noise
lower <- c(0,0);upper <- c(1,1)

design <- data.frame( matrix(runif(2*N),ncol=2) )
response.noise <- apply(design,1,testfun)
response.noise - response

model.noise <- km(formula=~., design = design, response = response.noise,
                  covtype="matern3_2",noise.var=rep(30*30,times=N))

optimcontrol <- list(method="genoud",pop.size=50)
integcontrol <- list(distrib="sur",n.points=50)
iter <- 1
batchsize <- 6

if (FALSE) {
  obj <- EGIparallel(T=T,model=model.noise,method="sur",batchsize=batchsize,
                     fun=testfun,iter=iter,lower=lower,upper=upper,
                     optimcontrol=optimcontrol,integcontrol=integcontrol,
                     new.noise.var=10*10)


  par(mfrow=c(1,2))
  print_uncertainty_2d(model=model.noise,T=T,
                       main="probability of excursion, noisy obs.",
                       type="pn",new.points=0,cex.points=2)

  print_uncertainty_2d(model=obj$lastmodel,T=T,
                       main="probability of excursion, parallel sur sampling, noisy obs.",
                       type="pn",new.points=iter*batchsize,col.points.end="red",cex.points=2)
}


##############
# Conservative estimates with non-noisy initial observations
if (FALSE) {
  testfun <- branin
  # The conservative sampling strategies
  # only work with 1 threshold
  T <- 20
  ## Minimize Type II error sampling

  # The list method.param contains all parameters for the
  # conservative estimate and the conservative sequential
  # strategy. Below are parameters for a type II strategy
  # with conservative estimates at 0.95
  method.param = list(penalization=0, # Type II strategy
                      typeEx=">", consLevel = 0.95,
                      n_discrete_design=500*model@d)
  # If the CE for the initial model is already computed
  # it is possible to pass the conservative Vorob'ev quantile
  # level with method.param$consVorbLevel

  obj_T2 <- EGIparallel(T=T,model=model,method="vorobCons",batchsize=batchsize,
                        fun=testfun,iter=iter,lower=lower,upper=upper,
                        optimcontrol=optimcontrol,
                        integcontrol=integcontrol,method.param=method.param)

  par(mfrow=c(1,2))
  print_uncertainty_2d(model=model,T=T,main="probability of excursion",
                       type="pn",new.points=0,cex.points=2,consQuantile = obj_T2$allCE_lvs[1])

  print_uncertainty_2d(model=obj_T2$lastmodel,T=T,
                       main="probability of excursion, parallel Type II sampling",
                       type="pn",new.points=iter*batchsize,col.points.end="red",
                       cex.points=2,consQuantile = obj_T2$allCE_lvs[2])

  ## Maximize conservative estimate's volume
  # Set up method.param
  # Here we pass the conservative level computed
  # in the previous step for the initial model
  method.param = list(typeEx=">", consLevel = 0.95,
                      n_discrete_design=500*model@d,
                      consVorbLevel=obj_T2$allCE_lvs[1]
  )

  obj_consVol <- EGIparallel(T=T,model=model,method="vorobVol",batchsize=batchsize,
                             fun=testfun,iter=iter,lower=lower,upper=upper,
                             optimcontrol=optimcontrol,
                             integcontrol=integcontrol,method.param=method.param)

  par(mfrow=c(1,2))
  print_uncertainty_2d(model=model,T=T,main="probability of excursion",
                       type="pn",new.points=0,cex.points=2,consQuantile = obj_consVol$allCE_lvs[1])

  print_uncertainty_2d(model=obj_consVol$lastmodel,T=T,
                       main="probability of excursion, parallel consVol sampling",
                       type="pn",new.points=iter*batchsize,col.points.end="red",
                       cex.points=2,consQuantile = obj_consVol$allCE_lvs[2])

}

Run the code above in your browser using DataLab