funGP: Function comparison using Gaussian Process and Hypothesis testing

Description

Fits a Gaussian process to each data set and uses hypothesis testing to compare the resulting functions at the given test points.

Usage

funGP(
  datalist,
  xCol,
  yCol,
  confLevel = 0.95,
  testset,
  limitMemory = TRUE,
  opt_method = "nlminb",
  sampleSize = list(optimSize = 500, bandSize = 5000),
  rngSeed = 1
)

Value

A list containing:

muDiff - A vector of pointwise differences between the predictions from the two data sets (mu2 - mu1)
mu1 - A vector of predictions at the test points for the first data set
mu2 - A vector of predictions at the test points for the second data set
band - A vector of the allowed statistical difference between the functions at the test points in testset
confLevel - A numeric value giving the confidence level used to construct the band
testset - A matrix of test points to compare the functions
estimatedParams - A list of estimated hyperparameters for GP
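
As a rough illustration of how the returned elements fit together, the sketch below flags the test points where the absolute difference exceeds the band. The `result` list is a hypothetical stand-in with made-up numbers, not output from funGP, and treating `abs(muDiff) > band` as the decision rule is an assumed reading of "allowed statistical difference".

```r
# Hypothetical stand-in for the list returned by funGP(); the numbers are
# made up for illustration and do not come from any real data set.
result <- list(
  muDiff  = c(0.02, -0.15, 0.40, 0.05),
  band    = c(0.10, 0.10, 0.25, 0.10),
  testset = matrix(c(4, 6, 8, 10), ncol = 1)
)

# Assumed reading: a difference is flagged when it falls outside the band.
outside <- abs(result$muDiff) > result$band

# Test points where the two fitted functions appear to differ.
flagged <- result$testset[outside, , drop = FALSE]
print(flagged)
```

With a real funGP result, the same two lines could be applied to `function_diff$muDiff`, `function_diff$band` and `function_diff$testset`.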

Arguments

datalist: A list of data sets; a function is estimated for each of them
xCol: A single number or a vector giving the column number(s) of the covariates
yCol: A numeric value giving the column number of the target
confLevel: A single value giving the confidence level used to construct the band
testset: Test points at which the functions will be compared
limitMemory: A logical (TRUE/FALSE) indicating whether to limit memory use. Default is TRUE. If TRUE, data points are randomly sampled from each data set under comparison for inference (5000 per data set with the default sampleSize).
opt_method: A string specifying the optimization method to be used for hyperparameter estimation. Current options are: 'L-BFGS-B', 'BFGS', and 'nlminb'. Default is set to 'nlminb'.
sampleSize: A named list of two integer items: optimSize and bandSize, denoting the sample size for each dataset for hyperparameter optimization and confidence band computation, respectively, when limitMemory = TRUE. Default value is list(optimSize = 500, bandSize = 5000).
rngSeed: Random seed for sampling data when limitMemory = TRUE. Default is 1.
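
The interplay of limitMemory, sampleSize and rngSeed can be pictured with a small sketch. This is not the package's implementation: `subsample_rows` and `toy_datalist` are hypothetical names, and keeping a data set unchanged when it has fewer rows than the requested size is an assumption.

```r
# Hypothetical helper (not part of the package): draw at most `size` rows
# from a data frame, without replacement.
subsample_rows <- function(data, size) {
  n <- nrow(data)
  if (n <= size) {
    return(data)  # assumed behaviour: small data sets are kept whole
  }
  data[sample.int(n, size), , drop = FALSE]
}

# Toy data sets standing in for the elements of datalist.
set.seed(42)
toy_datalist <- list(
  data.frame(x = runif(8000), y = rnorm(8000)),
  data.frame(x = runif(300), y = rnorm(300))
)

sampleSize <- list(optimSize = 500, bandSize = 5000)
rngSeed <- 1

# Fixing the seed makes the same rows come out on every run.
set.seed(rngSeed)
optim_data <- lapply(toy_datalist, subsample_rows, size = sampleSize$optimSize)
band_data  <- lapply(toy_datalist, subsample_rows, size = sampleSize$bandSize)

# Rows available for hyperparameter optimization and for the band.
print(sapply(optim_data, nrow))
print(sapply(band_data, nrow))
```

The smaller optimSize keeps hyperparameter estimation cheap, while the larger bandSize leaves more data for the band computation.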

References

Prakash, A., Tuo, R., & Ding, Y. (2022). "Gaussian process aided function comparison using noisy scattered data," Technometrics, Vol. 64, No. 1, pp. 92-102, doi:10.1080/00401706.2021.1905073.

Examples

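# Use the first 50 rows of each data set (data1 and data2 must already be loaded)
datalist = list(data1[1:50, ], data2[1:50, ])
xCol = 2    # column number of the covariate
yCol = 7    # column number of the target
confLevel = 0.95
testset = seq(4, 10, length.out = 10)    # 10 evenly spaced test points
function_diff = funGP(datalist, xCol, yCol, confLevel, testset)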
