fit_model: Estimate mean and covariance parameters

Description

Given a response, a set of locations, an optional design matrix, and a specified covariance function, return the maximum Vecchia likelihood estimates, obtained with a Fisher scoring algorithm.

Usage

fit_model(
  y,
  locs,
  X = NULL,
  covfun_name = "matern_isotropic",
  NNarray = NULL,
  start_parms = NULL,
  reorder = TRUE,
  group = TRUE,
  m_seq = c(10, 30),
  max_iter = 40,
  fixed_parms = NULL,
  silent = FALSE,
  st_scale = NULL,
  convtol = 1e-04
)

Value

An object of class GpGp_fit, which is a list containing covariance parameter estimates, regression coefficients, covariance matrix for mean parameter estimates, as well as some other information relevant to the model fit.

Arguments

y: response vector
locs: matrix of locations. Each row is a single spatial or spatial-temporal location. If using one of the covariance functions for data on a sphere, the first column should be longitudes (-180,180) and the second column should be latitudes (-90,90). If using a spatial-temporal covariance function, the last column should contain the times.
X: design matrix. Each row contains covariates for the corresponding observation in y. If not specified, the function sets X to be a matrix with a single column of ones, that is, a constant mean function.
covfun_name: string name of a covariance function. See GpGp for information about supported covariance functions.
NNarray: Optionally specified array of nearest neighbor indices, usually from the output of find_ordered_nn. If NULL, fit_model will compute the nearest neighbors. We recommend that the user not specify this unless there is a good reason to (e.g. if doing a comparison study where one wants to control NNarray across different approximations).
start_parms: Optionally specified starting values for parameters. If NULL, fit_model will select default starting values.
reorder: TRUE/FALSE indicating whether maxmin ordering should be used (TRUE) or whether no reordering should be done before fitting (FALSE). If you want to use a customized reordering, then manually reorder y, locs, and X, and then set reorder to FALSE. A random reordering is used when nrow(locs) > 1e5.
group: TRUE/FALSE for whether to use the grouped version of the approximation (Guinness, 2018) or not. The grouped version is used by default and is always recommended.
m_seq: Sequence of values for the number of neighbors. By default, a 10-neighbor approximation is maximized, then a 30-neighbor approximation is maximized using the 10-neighbor estimates as starting values. However, one can specify any sequence of numbers of neighbors, e.g. m_seq = c(10,30,60,90).
max_iter: maximum number of Fisher scoring iterations
fixed_parms: Indices of covariance parameters you would like to fix at specific values. If you decide to fix any parameters, you must specify their values in start_parms, along with the starting values for all other parameters. For example, to fix the nugget at zero in exponential_isotropic, set fixed_parms to c(3), and set start_parms to c(4.7,3.1,0). The last element of start_parms (the nugget parameter) is set to zero, while the starting values for the other two parameters are 4.7 and 3.1.
silent: TRUE/FALSE for whether to suppress the information printed during fitting. With the default, FALSE, progress information is printed.
st_scale: Scaling for spatial and temporal ranges. Only applicable for spatial-temporal models, where it is used in distance calculations when selecting neighbors. st_scale must be specified when covfun_name is a spatial-temporal covariance. See the Argo vignette for an example.
convtol: Tolerance for exiting the optimization. Fisher scoring is stopped when the dot product between the step and the gradient is less than convtol.
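
The fixed_parms example described above can be written out as a short sketch. The simulated data, grid size, and seed are illustrative assumptions rather than part of the original documentation. The parameter order (variance, range, nugget) for exponential_isotropic follows the description of fixed_parms.

```r
library(GpGp)

set.seed(1)  # illustrative seed, chosen only for reproducibility

# Small regular grid on the unit square (assumed test data)
n1 <- 15
n2 <- 15
locs <- as.matrix(expand.grid((1:n1) / n1, (1:n2) / n2))

# Simulate with exponential_isotropic parameters (variance, range, nugget)
y <- 7 + fast_Gp_sim(c(2, 0.1, 0), "exponential_isotropic", locs)

# Fix the third parameter (the nugget) at the value given in start_parms.
# The first two entries of start_parms are ordinary starting values.
fit_fixed <- fit_model(
  y, locs,
  covfun_name = "exponential_isotropic",
  start_parms = c(4.7, 3.1, 0),
  fixed_parms = c(3),
  silent = TRUE
)

# The nugget entry should still be the fixed value; the others are estimates
fit_fixed$covparms
```

Because X is left as NULL here, the fit uses a constant mean, as described for the X argument.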

Details

fit_model is a user-friendly model fitting function that automatically performs many of the auxiliary tasks needed for using Vecchia's approximation, including reordering, computing nearest neighbors, grouping, and optimization. The likelihood includes a small penalty on small nuggets, large spatial variances, and small smoothness parameters.
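
To give a sense of the auxiliary tasks listed above, the following sketch performs the reordering and nearest-neighbor steps by hand with the exported helpers order_maxmin and find_ordered_nn, then evaluates an ungrouped, mean-zero Vecchia loglikelihood at a single set of parameters. It is not a reproduction of fit_model's internals: grouping, the penalty, and the Fisher scoring optimization are omitted, and the data and parameter values are illustrative assumptions.

```r
library(GpGp)

set.seed(2)  # illustrative seed

# Assumed test data: mean-zero field on a 15 x 15 grid
locs <- as.matrix(expand.grid((1:15) / 15, (1:15) / 15))
covparms <- c(2, 0.1, 1 / 2, 0.1)  # (variance, range, smoothness, nugget)
y <- fast_Gp_sim(covparms, "matern_isotropic", locs)

# Step 1: maxmin ordering of the locations
ord <- order_maxmin(locs)
locs_ord <- locs[ord, , drop = FALSE]
y_ord <- y[ord]

# Step 2: nearest neighbors among previously ordered points (m = 10)
NNarray <- find_ordered_nn(locs_ord, m = 10)

# Step 3: Vecchia loglikelihood at the simulation parameters
ll <- vecchia_meanzero_loglik(
  covparms, "matern_isotropic", y_ord, locs_ord, NNarray
)
ll$loglik
```

An NNarray built this way is the kind of input the NNarray argument accepts, provided y, locs, and X are reordered to match and reorder is set to FALSE.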

The Jason-3 windspeed vignette and the Argo temperature vignette show use cases of the fit_model function for data on a sphere. The example below is a very small one that uses a simulated dataset in 2d.

Examples

n1 <- 20
n2 <- 20
n <- n1*n2
locs <- as.matrix( expand.grid( (1:n1)/n1, (1:n2)/n2 ) )
covparms <- c(2,0.1,1/2,0)
y <- 7 + fast_Gp_sim(covparms, "matern_isotropic", locs)
X <- as.matrix( rep(1,n) )
## not run
# fit <- fit_model(y, locs, X, "matern_isotropic")
# fit
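
As a follow-up to the example above, here is a minimal self-contained sketch that runs the fit and inspects a few elements of the returned list. The element names covparms, betahat, and loglik are taken from the GpGp_fit object as I understand it, so check names(fit) if in doubt. The seed is an illustrative assumption.

```r
library(GpGp)

set.seed(3)  # illustrative seed

# Same setup as the example above: 20 x 20 grid, constant mean of 7
n1 <- 20
n2 <- 20
n <- n1 * n2
locs <- as.matrix(expand.grid((1:n1) / n1, (1:n2) / n2))
covparms <- c(2, 0.1, 1 / 2, 0)
y <- 7 + fast_Gp_sim(covparms, "matern_isotropic", locs)
X <- as.matrix(rep(1, n))

# Fit with the default m_seq = c(10, 30), suppressing progress output
fit <- fit_model(y, locs, X, "matern_isotropic", silent = TRUE)

fit$covparms  # estimated (variance, range, smoothness, nugget)
fit$betahat   # estimated regression coefficient (the constant mean)
fit$loglik    # loglikelihood at the estimates
```

With only 400 observations the estimates can differ noticeably from the simulation values, so this is best read as a check that the call runs rather than as a benchmark.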
