
refund.wave (version 0.1)

wcr: Principal component regression and partial least squares in the wavelet domain

Description

Performs generalized linear scalar-on-function or scalar-on-image regression in the wavelet domain, by sparse principal component regression (PCR) and sparse partial least squares (PLS).

Usage

wcr(y, xfuncs, min.scale, nfeatures, ncomp, method = c("pcr", "pls"), 
    mean.signal.term = FALSE, covt = NULL, filter.number = 10, 
    wavelet.family = "DaubLeAsymm", family = "gaussian", cv1 = FALSE, nfold = 5, 
    nsplit = 1, store.cv = FALSE, store.glm = FALSE, seed = NULL)

Arguments

y
scalar outcome vector.
xfuncs
functional predictors. For 1D predictors, an $n \times d$ matrix of signals, where $n$ is the length of y and $d$ is the number of sites at which each signal is defined. For 2D predictors, an $n \times d \times d$ array comprising $n$ images.
min.scale
either a scalar, or a vector of values to be compared. Used to control the coarseness level of wavelet decomposition. Possible values are $0, 1, \dots, \log_2(d) - 1$.
nfeatures
number(s) of features, i.e. wavelet coefficients, to retain for prediction of y: either a scalar, or a vector of values to be compared.
ncomp
number(s) of principal components (if method="pcr") or PLS components (if method="pls"): either a scalar, or a vector of values to be compared.
method
either "pcr" (principal component regression, the default) or "pls" (partial least squares).
mean.signal.term
logical: should the mean of each subject's signal be included as a covariate? By default, FALSE.
covt
covariates, if any: an $n$-row matrix, or a vector of length $n$.
filter.number
argument passed to function wd, imwd, or wd3D in the wavethresh package. Used
wavelet.family
family of wavelets: passed to functions wd, imwd, or wd3D.
family
generalized linear model family. Current version supports "gaussian" (the default) and "binomial".
cv1
logical: should cross-validation be performed (to estimate prediction error) even if a single value is provided for each of min.scale, nfeatures and ncomp? By default, FALSE. Note that whenever multiple
nfold
the number of validation sets ("folds") into which the data are divided.
nsplit
number of splits into nfold validation sets; CV is computed by averaging over these splits.
store.cv
logical: should the output include a CV result table?
store.glm
logical: should the output include the fitted glm?
seed
the seed for random data division. If seed = NULL, a random seed is used.
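
As an illustration of how the tuning arguments fit together, the following base-R sketch builds the grid of candidate (min.scale, nfeatures, ncomp) combinations and draws nsplit random divisions into nfold validation sets. The sizes, the candidate values, and the helper make_folds are hypothetical and chosen only for illustration; this is not necessarily how wcr assigns folds internally.

```r
# Hypothetical sizes, chosen only for illustration
n <- 40   # number of subjects (length of y)
d <- 64   # sites per signal; assumed here to be a power of 2

# Candidate values for the three tuning parameters
min.scale <- 0:(log2(d) - 1)   # all admissible values: 0, 1, ..., log2(d) - 1
nfeatures <- c(8, 16, 32)
ncomp     <- c(2, 4)

# Every combination that cross-validation would have to compare
grid <- expand.grid(min.scale = min.scale, nfeatures = nfeatures, ncomp = ncomp)

# Hypothetical helper: one random division of n subjects into nfold folds
make_folds <- function(n, nfold) sample(rep(seq_len(nfold), length.out = n))

nfold  <- 5
nsplit <- 3
seed   <- 1

set.seed(seed)
# One column of fold labels per split; CV results would be averaged over columns
splits <- replicate(nsplit, make_folds(n, nfold))
```

With vectors supplied for several arguments, the number of model fits grows as nrow(grid) times nfold times nsplit, which is worth keeping in mind when choosing the candidate values.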

Value

An object of class "wcr". This is a list that, if store.glm = TRUE, includes all components of the fitted glm object. The following components are included even if store.glm = FALSE:

  • fitted.values: the fitted values.
  • param.coef: coefficients for covariates with decorrelation. The model is fitted after decorrelating the functional predictors from any scalar covariates; but for CV, one needs the "undecorrelated" coefficients from the training-set models.
  • undecor.coef: coefficients for covariates without decorrelation. See param.coef.
  • fhat: coefficient function estimate.
  • Rsq: coefficient of determination.
  • tuning.params: if CV is performed, a $2 \times 4$ table giving the indices and values of min.scale, nfeatures and ncomp chosen by CV.
  • cv.table: a table giving the CV criterion for each combination of min.scale, nfeatures and ncomp, if store.cv = TRUE; otherwise, the CV criterion only for the optimized combination of these parameters. Set to NULL if CV is not performed.
  • se.cv: if store.cv = TRUE, the standard error of the CV estimate for each combination of min.scale, nfeatures and ncomp.
  • family: generalized linear model family.

Details

Briefly, the algorithm works by (1) applying the discrete wavelet transform (DWT) to the functional/image predictors; (2) retaining only the nfeatures wavelet coefficients having the highest variance (for PCR; cf. Johnstone and Lu, 2009) or highest covariance with y (for PLS); (3) regressing y on the leading ncomp PCs or PLS components, along with any scalar covariates; and (4) applying the inverse DWT to the result to obtain the coefficient function estimate fhat.
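
The four steps can be sketched in base R for the 1D Gaussian PCR case. This is a minimal illustration, not the package's implementation: it substitutes a hand-written orthonormal Haar transform for the wavethresh wavelets that wcr uses, omits covariates and cross-validation, and runs on simulated data. The helpers haar_dwt and haar_idwt and all data are hypothetical.

```r
# Orthonormal Haar DWT of a vector whose length is a power of 2
# (a stand-in for wavethresh::wd, used here to keep the sketch self-contained).
haar_dwt <- function(x) {
  out <- numeric(0)
  while (length(x) > 1) {
    odd  <- x[seq(1, length(x), by = 2)]
    even <- x[seq(2, length(x), by = 2)]
    out  <- c((odd - even) / sqrt(2), out)  # detail coefficients at this level
    x    <- (odd + even) / sqrt(2)          # smooth coefficients passed on
  }
  c(x, out)                                 # coarsest coefficient first
}

# Inverse of haar_dwt
haar_idwt <- function(w) {
  x <- w[1]
  pos <- 2
  while (pos <= length(w)) {
    m   <- length(x)
    det <- w[pos:(pos + m - 1)]
    nxt <- numeric(2 * m)
    nxt[seq(1, 2 * m, by = 2)] <- (x + det) / sqrt(2)
    nxt[seq(2, 2 * m, by = 2)] <- (x - det) / sqrt(2)
    x   <- nxt
    pos <- pos + m
  }
  x
}

# Simulated data (hypothetical)
set.seed(1)
n <- 60
d <- 64
s <- seq(0, 1, length.out = d)
X <- t(replicate(n, cumsum(rnorm(d)) / sqrt(d)))  # n x d matrix of rough signals
beta <- sin(2 * pi * s)                           # "true" coefficient function
y <- drop(X %*% beta) / d + rnorm(n, sd = 0.05)

nfeatures <- 16
ncomp <- 3

# (1) DWT of each signal (one signal per row)
W <- t(apply(X, 1, haar_dwt))

# (2) retain the nfeatures wavelet coefficients with the highest variance
keep <- order(apply(W, 2, var), decreasing = TRUE)[seq_len(nfeatures)]
Wk <- W[, keep, drop = FALSE]

# (3) regress y on the leading ncomp principal components
pc <- prcomp(Wk, center = TRUE, scale. = FALSE)
scores <- pc$x[, seq_len(ncomp), drop = FALSE]
fit <- lm(y ~ scores)

# (4) map the PC coefficients back to wavelet space, then apply the inverse DWT
coef_w <- numeric(d)
coef_w[keep] <- drop(pc$rotation[, seq_len(ncomp), drop = FALSE] %*% coef(fit)[-1])
fhat <- haar_idwt(coef_w)
```

Because the transform is orthonormal, the inner product of a centered signal with fhat reproduces the centered fitted value from step (3), so fhat can be read as a coefficient function on the original sites. In this sketch the inner product is a plain sum over sites, with no quadrature weights.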

This function supports only the standard DWT (see argument type in wd) with periodic boundary handling (see argument bc in wd).
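
For reference, the corresponding settings in wavethresh can be written out explicitly. The sketch below, which assumes the wavethresh package is installed and is skipped otherwise, applies the standard periodic DWT to a test signal of dyadic length and checks that wr inverts it. The signal is arbitrary, and the call mirrors the defaults of wd rather than anything confirmed about how wcr calls it.

```r
# An arbitrary test signal whose length is a power of 2
x <- sin(2 * pi * seq(0, 1, length.out = 64))

if (requireNamespace("wavethresh", quietly = TRUE)) {
  # Standard DWT (type = "wavelet") with periodic boundary handling
  w <- wavethresh::wd(x, filter.number = 10, family = "DaubLeAsymm",
                      type = "wavelet", bc = "periodic")
  x_back  <- wavethresh::wr(w)        # inverse DWT
  max_err <- max(abs(x - x_back))     # reconstruction error, expected to be tiny
} else {
  max_err <- NA_real_                 # wavethresh not available; nothing to check
}
```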

For 2D predictors, setting min.scale=1 will lead to an error, due to a technical detail regarding imwd. Please contact the author if a workaround is needed.

See the Details for fpcr in refund for a note regarding decorrelation.
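
The distinction between param.coef and undecor.coef can be illustrated with ordinary least squares. The sketch below assumes that "decorrelating" means replacing a predictor with its residuals from a regression on the scalar covariates; that reading is an assumption, and the authoritative description is the fpcr documentation referred to above. A single simulated variable x stands in for a functional-predictor feature.

```r
set.seed(2)
n    <- 100
covt <- rnorm(n)                      # scalar covariate
x    <- 0.6 * covt + rnorm(n)         # stand-in for one functional-predictor feature
y    <- 1 + 2 * covt + 0.5 * x + rnorm(n)

# Decorrelate x from the covariate (assumed meaning: residualize on covt)
x_decor <- resid(lm(x ~ covt))

fit_undecor   <- lm(y ~ covt + x)         # covariate coefficient without decorrelation
fit_decor     <- lm(y ~ covt + x_decor)   # covariate coefficient with decorrelation
fit_covt_only <- lm(y ~ covt)

coef(fit_undecor)
coef(fit_decor)
```

The coefficient on the predictor is the same in both fits, but the covariate coefficient is not: after decorrelation it equals the coefficient from regressing y on covt alone. This is why predicting for held-out subjects, whose predictors have not been decorrelated, requires the undecorrelated version.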

References

Johnstone, I. M., and Lu, Y. (2009). On consistency and sparsity for principal components analysis in high dimensions. Journal of the American Statistical Association, 104, 682-693.

Reiss, P. T., Huo, L., Zhao, Y., Kelly, C., and Ogden, R. T. (2014). Wavelet-domain regression and predictive inference in psychiatric neuroimaging. Available at http://works.bepress.com/phil_reiss/29/

See Also

wnet

Examples

# See example for wnet
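
A minimal call on simulated data might look like the following. The data are hypothetical, a single value is given for each tuning parameter so no cross-validation is triggered, and the call is skipped if refund.wave is not installed. The argument names follow the Usage section above; the example has not been checked against the package's own examples.

```r
set.seed(3)
n <- 50
d <- 64                                   # number of sites, a power of 2
xfuncs <- matrix(rnorm(n * d), n, d)      # n x d matrix of simulated 1D signals
y <- drop(xfuncs[, 1:8] %*% rep(0.5, 8)) + rnorm(n)

fit <- NULL
if (requireNamespace("refund.wave", quietly = TRUE)) {
  library(refund.wave)
  # Sparse PCR in the wavelet domain with fixed tuning parameters
  fit <- wcr(y, xfuncs, min.scale = 0, nfeatures = 20, ncomp = 5,
             method = "pcr")
  str(fit$fhat)   # coefficient function estimate
  fit$Rsq         # coefficient of determination
}
```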
