Learn R Programming

kergp (version 0.5.7)

mle-methods: Maximum Likelihood Estimation of Gaussian Process Model Parameters

Description

Maximum Likelihood estimation of Gaussian Process models which covariance structure is described in a covariance kernel object.

Usage

# S4 method for covAll
mle(object, 
    y, X, F = NULL, beta = NULL,
    parCovIni = coef(object),
    parCovLower = coefLower(object), 
    parCovUpper = coefUpper(object),
    noise = TRUE, varNoiseIni = var(y) / 10,
    varNoiseLower = 0, varNoiseUpper = Inf,
    compGrad = hasGrad(object),
    doOptim = TRUE,
    optimFun = c("nloptr::nloptr", "stats::optim"),
    optimMethod = ifelse(compGrad, "NLOPT_LD_LBFGS", "NLOPT_LN_COBYLA"),
    optimCode = NULL,
    multistart = 1,
    parTrack = FALSE, trace  = 0, checkNames = TRUE,
    ...)

Value

A list with elements hopefully having understandable names.

opt

List of optimization results if it was successful, or an error object otherwise.

coef.kernel

The vector of 'kernel' coefficients. This will include one or several variance parameters.

coef.trend

Estimate of the vector \(\boldsymbol{\beta}\) of the trend coefficients.

parTracked

A matrix with rows giving the successive iterates during optimization if the parTrack argument was set to TRUE.

Arguments

object

An object representing a covariance kernel.

y

Response vector.

X

Spatial (or input) design matrix.

F

Trend matrix.

beta

Vector of trend coefficients if known/fixed.

parCovIni

Vector with named elements or matrix giving the initial values for the parameters. See Examples. When this argument is omitted, the vector of covariance parameters given in object is used if multistart == 1; If multistart > 1, a matrix of parameters is simulated by using simulPar. Remind that you can use the coef and coef<- methods to get and set this slot of the covariance object.

parCovLower

Lower bounds for the parameters. When this argument is omitted, the vector of parameters lower bounds in the covariance given in object is used. You can use coefLower and coefLower<- methods to get and set this slot of the covariance object.

parCovUpper

Upper bounds for the parameters. When this argument is omitted, the vector of parameters lower bounds in the covariance given in object is used. You can use coefUpper and coefUpper<- methods to get and set this slot of the covariance object.

noise

Logical. Should a noise be added to the error term?

varNoiseIni

Initial value for the noise variance.

varNoiseLower

Lower bound for the noise variance. Should be <= varNoiseIni.

varNoiseUpper

Upper bound for the noise variance. Should be >= varNoiseIni.

compGrad

Logical: compute and use the analytical gradient in optimization? This is only possible when object provides the analytical gradient.

doOptim

Logical. If FALSE no optimization is done.

optimFun

Function used for optimization. The two pre-defined choices are nloptr::nloptr (default) and stats::optim, both in combination with a few specific optimization methods. Ignored if optimCode is provided.

optimMethod

Name of the optimization method or algorithm. This is passed as the "algorithm" element of the opts argument when nloptr::nloptr is used (default), or to the method argument when stats::optim is used. When another value of optimFun is given, the value of optimMethod is ignored. Ignored if optimCode is provided. Use optimMethods to obtain the list of usable values.

optimCode

An object with class "expression" or "character" representing a user-written R code to be parsed and performing the log-likelihood maximization. Notice that this argument will bypass optimFun and optimMethod. The expression must define an object named "opt", which is either a list containing optimization results, either an object inheriting from "try-error" to cope with the case where a problem occurred during the optimization.

multistart

Number of optimizations to perform from different starting points (see parCovIni). Parallel backend is encouraged.

parTrack

If TRUE, the parameter vectors used during the optimization are tracked and returned as a matrix.

trace

Integer level of verbosity.

checkNames

if TRUE (default), check the compatibility of X with object, see checkX.

...

Further arguments to be passed to the optimization function, nloptr or optim.

Author

Y. Deville, O. Roustant

Details

The choice of optimization method is as follows.

  • When optimFun is nloptr:nloptr, it is assumed that we are minimizing the negative log-likelihood \(- \log L\). Note that both predefined methods "NLOPT_LD_LBFGS" and "NLOPT_LN_COBYLA" can cope with a non-finite value of the objective, except for the initial value of the parameter. Non-finite values of \(- \log L\) are often met in practice during optimization steps. The method "NLOPT_LD_LBFGS" used when compGrad is TRUE requires that the gradient is provided by/with the covariance object. You can try other values of optimMethod corresponding to the possible choice of the "algorithm" element in the opts argument of nloptr:nloptr. It may be useful to give other options in order to control the optimization and its stopping rule.

  • When optimFun is "stats:optim", it is assumed that we are maximizing the log-likelihood \(\log L\). We suggest to use one of the methods "L-BFGS-B" or "BFGS". Notice that control can be provided in ..., but control$fnscale is forced to be - 1, corresponding to maximization. Note that "L-BFGS-B" uses box constraints, but the optimization stops as soon as the log-likelihood is non-finite or NA. The method "BFGS" does not use constraints but allows the log-likelihood to be non-finite or NA. Both methods can be used without gradient or with gradient if object allows this.

The vectors parCovIni, parCovLower, parCovUpper must have elements corresponding to those of the vector of kernel parameters given by coef(object). These vectors should have suitably named elements.

See Also

gp for various examples, optimMethods to see the possible values of the argument optimMethod.

Examples

Run this code

set.seed(29770)

##=======================================================================
## Example. A 4-dimensional "covMan" kernel
##=======================================================================
d <- 4
myCovMan <- 
      covMan(
         kernel = function(x1, x2, par) { 
         htilde <- (x1 - x2) / par[1]
         SS2 <- sum(htilde^2)
         d2 <- exp(-SS2)
         kern <- par[2] * d2
         d1 <- 2 * kern * SS2 / par[1]            
         attr(kern, "gradient") <- c(theta = d1,  sigma2 = d2)
         return(kern)
      },
      label = "myGauss",
      hasGrad = TRUE,
      d = 4,    
      parLower = c(theta = 0.0, sigma2 = 0.0),
      parUpper = c(theta = +Inf, sigma2 = +Inf),
      parNames = c("theta", "sigma2"),
      par = c(NA, NA)
      )
kernCode <- "
       SEXP kern, dkern;
       int nprotect = 0, d;
       double SS2 = 0.0, d2, z, *rkern, *rdkern;

       d = LENGTH(x1);
       PROTECT(kern = allocVector(REALSXP, 1)); nprotect++;
       PROTECT(dkern = allocVector(REALSXP, 2)); nprotect++;
       rkern = REAL(kern);
       rdkern = REAL(dkern);

       for (int i = 0; i < d; i++) {
         z = ( REAL(x1)[i] - REAL(x2)[i] ) / REAL(par)[0];
         SS2 += z * z; 
       }

       d2 = exp(-SS2);
       rkern[0] = REAL(par)[1] * d2;
       rdkern[1] =  d2; 
       rdkern[0] =  2 * rkern[0] * SS2 / REAL(par)[0];

       SET_ATTR(kern, install(\"gradient\"), dkern);
       UNPROTECT(nprotect);
       return kern;
     "

## inline the C function into an R function: MUCH MORE EFFICIENT!!!
if (FALSE) {
require(inline)
kernC <- cfunction(sig = signature(x1 = "numeric", x2 = "numeric",
                                   par = "numeric"),
                    body = kernCode)
myCovMan <- covMan(kernel = kernC, hasGrad = TRUE, label = "myGauss", d = 4,
                   parNames = c("theta", "sigma2"),
                   parLower = c("theta" = 0.0, "sigma2" = 0.0),
                   parUpper = c("theta" = Inf, "sigma2" = Inf))
}

##=======================================================================
## Example (continued). Simulate data for covMan and trend
##=======================================================================
n <- 100; 
X <- matrix(runif(n * d), nrow = n)
colnames(X) <- inputNames(myCovMan)

coef(myCovMan) <- myPar <- c(theta = 0.5, sigma2 = 2)
C <- covMat(object = myCovMan, X = X,
            compGrad = FALSE,  index = 1L)

library(MASS)
set.seed(456)
y <- mvrnorm(mu = rep(0, n), Sigma = C)
p <- rpois(1, lambda = 4)
if (p > 0) {
  F <- matrix(runif(n * p), nrow = n, ncol = p)
  beta <- rnorm(p)
  y <- F %*% beta + y
} else F <- NULL
par <- parCovIni <- c("theta" = 0.6, "sigma2" = 4)

##=======================================================================
## Example (continued). ML estimation. Note the 'partrack' argument
##=======================================================================           
est <- mle(object = myCovMan,
           parCovIni = parCovIni,
           y = y, X = X, F = F,
           parCovLower = c(0.05, 0.05), parCovUpper = c(10, 100),
           parTrack = TRUE, noise = FALSE, checkNames = FALSE)
est$opt$value

## change the (constrained) optimization  method
if (FALSE) {
est1 <- mle(object = myCovMan,
            parCovIni = parCovIni,
            optimFun = "stats::optim",
            optimMethod = "L-BFGS-B",
            y = y, X = X, F = F,
            parCovLower = c(0.05, 0.05), parCovUpper = c(10, 100),
            parTrack = TRUE, noise = FALSE, checkNames = FALSE)
est1$opt$value
}

##=======================================================================
## Example (continued). Grid for graphical analysis
##=======================================================================
if (FALSE) {
    theta.grid <- seq(from = 0.1, to = 0.7, by = 0.2)
    sigma2.grid <- seq(from = 0.3, to = 6, by = 0.4)
    par.grid <- expand.grid(theta = theta.grid, sigma2 = sigma2.grid)
    ll <- apply(as.matrix(par.grid), 1, est$logLikFun)
    llmat <- matrix(ll, nrow = length(theta.grid),
                    ncol = length(sigma2.grid))
}                

##=======================================================================
## Example (continued). Explore the surface ?
##=======================================================================
if (FALSE) {
   require(rgl)
   persp3d(x = theta.grid, y = sigma2.grid, z = ll,
           xlab = "theta", ylab = "sigma2", zlab = "logLik",
           col = "SpringGreen3", alpha = 0.6)
}

##=======================================================================
## Example (continued). Draw a contour plot for the log-lik 
##                        and show iterates
##=======================================================================
if (FALSE) {
    contour(x = theta.grid, y = sigma2.grid, z = llmat,
            col = "SpringGreen3", xlab = "theta", ylab = "sigma2",
            main = "log-likelihood contours and iterates",
            xlim = range(theta.grid, est$parTracked[ , 1], na.rm = TRUE),
            ylim = range(sigma2.grid, est$parTracked[ , 2], na.rm = TRUE))
    abline(v = est$coef.kernel[1], h = est$coef.kernel[2], lty = "dotted")
    niter <- nrow(est$parTracked)
    points(est$parTracked[1:niter-1, ],
           col = "orangered", bg = "yellow", pch = 21, lwd = 2, type = "o")
    points(est$parTracked[niter, , drop = FALSE],
           col = "blue", bg = "blue", pch = 21, lwd = 2, type = "o", cex = 1.5)
    ann <- seq(from = 1, to = niter, by = 5)
    text(x = est$parTracked[ann, 1], y = est$parTracked[ann, 2],
         labels = ann - 1L, pos = 4, cex = 0.8, col = "orangered")
    points(x = myPar["theta"], y = myPar["sigma2"],
           bg = "Chartreuse3", col = "ForestGreen",
           pch = 22, lwd = 2, cex = 1.4)

    legend("topright", legend = c("optim", "optim (last)", "true"),
           pch = c(21, 21, 22), lwd = c(2, 2, 2), lty = c(1, 1, NA),
           col = c("orangered", "blue", "ForestGreen"),
           pt.bg = c("yellow", "blue", "Chartreuse3"))
 
}

Run the code above in your browser using DataLab