
DSWE (version 1.8.2)

tempGP: temporal Gaussian process

Description

A Gaussian process based power curve model which explicitly models the temporal aspect of the power curve. The model consists of two parts: f(x) and g(t).
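Read as an additive decomposition, the two parts can be written as below. This is a sketch of the model form only: the noise term \varepsilon is an assumption that this page does not name, and Prakash et al. (2022) give the exact formulation.

```latex
y(x, t) = f(x) + g(t) + \varepsilon
```

Here f(x) captures the dependence of the response on the input variables, and g(t) captures the temporal component that is fitted to the residuals y - f(x).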

Usage

tempGP(
  trainX,
  trainY,
  trainT = NULL,
  fast_computation = TRUE,
  limit_memory = 5000L,
  max_thinning_number = 20L,
  vecchia = TRUE,
  optim_control = list(batch_size = 100L, learn_rate = 0.05, max_iter = 5000L, tol =
    1e-06, beta1 = 0.9, beta2 = 0.999, epsilon = 1e-08, logfile = NULL)
)

Value

An object of class tempGP with the following attributes:

  • trainX - same as the input matrix trainX.

  • trainY - same as the input vector trainY.

  • thinningNumber - the thinning number computed by the algorithm.

  • modelF - A list containing the details of the model for predicting function f(x):

    • X - The input variable matrix for computing the cross-covariance for predictions, same as trainX unless the model is updated. See updateData.tempGP method for details on updating the model.

    • y - The response vector, again same as trainY unless the model is updated.

    • weightedY - The weighted response, that is, the response left multiplied by the inverse of the covariance matrix.

  • modelG - A list containing the details of the model for predicting function g(t):

    • residuals - The residuals after subtracting function f(x) from the response. Used to predict g(t). See updateData.tempGP method for updating the residuals.

    • time_index - The time indices of the residuals, same as trainT.

  • estimatedParams - Estimated hyperparameters for function f(x).

  • llval - log-likelihood value of the hyperparameter optimization for f(x).

  • gradval - gradient vector at the optimal log-likelihood value.
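The weightedY entry can be illustrated with a small base R sketch. It assumes a squared-exponential kernel and made-up hyperparameters (theta, sigma_f and sigma_n are hypothetical names and values), so it shows the idea of "the response left multiplied by the inverse of the covariance matrix" rather than the package's internal code.

```r
# Toy data: 5 points, 1 input variable (illustrative values only).
X <- matrix(c(0.1, 0.4, 0.5, 0.8, 1.0), ncol = 1)
y <- c(1.2, 1.9, 2.1, 2.8, 3.0)

# Hypothetical hyperparameters; tempGP estimates its own.
theta   <- 0.5   # length scale
sigma_f <- 1.0   # signal standard deviation
sigma_n <- 0.1   # noise standard deviation

# Squared-exponential covariance matrix plus noise variance on the diagonal.
sq_dist <- as.matrix(dist(X))^2
K <- sigma_f^2 * exp(-0.5 * sq_dist / theta^2) + diag(sigma_n^2, nrow(X))

# weightedY = K^{-1} y, solved through the Cholesky factor
# instead of forming an explicit inverse.
U <- chol(K)                                      # K = t(U) %*% U
weightedY <- backsolve(U, forwardsolve(t(U), y))
```

In a standard Gaussian process, a prediction at new inputs then only needs the cross-covariance between the new inputs and X, multiplied by weightedY.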

Arguments

trainX

A matrix with each column corresponding to one input variable.

trainY

A vector with each element corresponding to the output at the corresponding row of trainX.

trainT

A vector of time indices for the data points. If NULL (the default), the function assigns natural numbers starting from 1 as the time indices.

fast_computation

A Boolean that specifies whether to use a fast approximation (TRUE) or exact inference (FALSE). Default is TRUE.

limit_memory

An integer or NULL. The integer is the number of training points sampled during prediction to limit the total memory requirement. Setting the value to NULL results in no sampling, that is, the full training data is used for prediction. Default value is 5000.
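The effect of limit_memory can be pictured with the following base R sketch. It assumes uniform sampling without replacement, which this page does not confirm, and the variable names subX, subY and keep are hypothetical.

```r
set.seed(1)  # for a reproducible illustration

# Toy training set: 12 rows, with a hypothetical limit of 5 points.
trainX <- matrix(rnorm(24), ncol = 2)
trainY <- rnorm(12)
limit_memory <- 5L

if (!is.null(limit_memory) && nrow(trainX) > limit_memory) {
  # Draw row indices without replacement and keep only those rows.
  keep <- sort(sample.int(nrow(trainX), limit_memory))
  subX <- trainX[keep, , drop = FALSE]
  subY <- trainY[keep]
} else {
  # NULL (or a limit at least as large as the data) keeps everything.
  subX <- trainX
  subY <- trainY
}
```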

max_thinning_number

An integer specifying the maximum lag considered when computing the thinning number. If the partial autocorrelation function (PACF) does not become insignificant by lag max_thinning_number, then max_thinning_number is used for thinning.
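A PACF-based thinning number can be sketched in base R as follows. The significance bound of 1.96 / sqrt(n) and the use of a single simulated series are assumptions made for illustration. The package may use a different rule, and this sketch does not show how several input columns would be combined.

```r
set.seed(42)

# Simulated AR(1) series standing in for one column of trainX.
x <- as.numeric(arima.sim(model = list(ar = 0.8), n = 500))
max_thinning_number <- 20L

# Partial autocorrelations at lags 1..max_thinning_number.
pacf_vals <- as.numeric(
  stats::pacf(x, lag.max = max_thinning_number, plot = FALSE)$acf
)

# Approximate 95% significance bound for a white-noise series.
bound <- 1.96 / sqrt(length(x))

# First lag whose PACF falls inside the bound; fall back to the maximum lag.
insignificant <- which(abs(pacf_vals) <= bound)
thinning_number <- if (length(insignificant) > 0) {
  insignificant[1]
} else {
  max_thinning_number
}
```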

vecchia

A Boolean that specifies whether to use exact inference or the Vecchia approximation. Default is TRUE.

optim_control

A list of parameters passed to the Adam optimizer when fast_computation is set to TRUE. The default values have been tested rigorously and tend to strike a balance between accuracy and speed.

  • batch_size: Number of training points sampled at each iteration of Adam.

  • learn_rate: The step size for the Adam optimizer.

  • max_iter: The maximum number of iterations to be performed by Adam.

  • tol: Gradient tolerance.

  • beta1: Decay rate for the first moment of the gradient.

  • beta2: Decay rate for the second moment of the gradient.

  • epsilon: A small number to avoid division by zero.

  • logfile: A string specifying a file name in which to store the hyperparameter values at each iteration.
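How these control values enter an Adam update can be sketched in base R on a toy problem. This is a minimal illustration, not the package's optimizer. It minimizes a simple quadratic with the full gradient, whereas tempGP optimizes hyperparameters using batch_size sampled points per iteration. The names target, grad_fn and ctrl are hypothetical.

```r
# Toy objective: f(theta) = sum((theta - target)^2), minimized at target.
target <- c(1, -2)
grad_fn <- function(theta) 2 * (theta - target)

# Control values named as in optim_control
# (batch_size and logfile are not needed for this toy problem).
ctrl <- list(learn_rate = 0.05, max_iter = 5000L, tol = 1e-06,
             beta1 = 0.9, beta2 = 0.999, epsilon = 1e-08)

theta <- c(0, 0)       # starting point
m <- v <- numeric(2)   # first and second moment estimates

for (i in seq_len(ctrl$max_iter)) {
  g <- grad_fn(theta)
  if (max(abs(g)) < ctrl$tol) break              # gradient tolerance reached
  m <- ctrl$beta1 * m + (1 - ctrl$beta1) * g      # first-moment update
  v <- ctrl$beta2 * v + (1 - ctrl$beta2) * g^2    # second-moment update
  m_hat <- m / (1 - ctrl$beta1^i)                 # bias correction
  v_hat <- v / (1 - ctrl$beta2^i)
  theta <- theta - ctrl$learn_rate * m_hat / (sqrt(v_hat) + ctrl$epsilon)
}
```

A smaller learn_rate or a larger max_iter trades speed for accuracy, and tol decides how small the gradient must become before the loop stops early.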

References

Prakash, A., Tuo, R., & Ding, Y. (2022). "The temporal overfitting problem with applications in wind power curve modeling." Technometrics. doi:10.1080/00401706.2022.2069158.

Katzfuss, M., & Guinness, J. (2021). "A General Framework for Vecchia Approximations of Gaussian Processes." Statistical Science. doi:10.1214/19-STS755.

Guinness, J. (2018). "Permutation and Grouping Methods for Sharpening Gaussian Process Approximations." Technometrics. doi:10.1080/00401706.2018.1437476.

See Also

predict.tempGP for computing predictions and updateData.tempGP for updating data in a tempGP object.

Examples


    library(DSWE)  # provides tempGP() and the data1 dataset

    data <- DSWE::data1
    trainindex <- 1:50  # use the first 50 data points to train the model
    traindata <- data[trainindex, ]
    xCol <- 2  # input variable column(s)
    yCol <- 7  # response column
    trainX <- as.matrix(traindata[, xCol])
    trainY <- as.numeric(traindata[, yCol])
    tempGPObject <- tempGP(trainX, trainY)

