
refund (version 0.1-37)

fpca.face: Functional principal component analysis with fast covariance estimation

Description

A fast implementation of the sandwich smoother (Xiao et al., 2013) for covariance matrix smoothing. Pooled generalized cross validation at the data level is used for selecting the smoothing parameter.
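
For orientation, a minimal sketch of a typical call on toy data (the matrix Y and grid t below are illustrative placeholders only; see the Examples section for a full data-generating example):

library(refund)
set.seed(1)
t <- seq(0, 1, length.out = 101)         # common observation grid
Y <- matrix(rnorm(20 * 101), 20, 101)    # toy 20 x 101 data matrix (noise only)
fit <- fpca.face(Y, argvals = t, pve = 0.99)
fit$npc                                  # number of principal components retained
dim(fit$efunctions)                      # 101 x npc matrix of estimated eigenvectors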

Usage

fpca.face(
  Y = NULL,
  ydata = NULL,
  Y.pred = NULL,
  argvals = NULL,
  pve = 0.99,
  npc = NULL,
  var = FALSE,
  simul = FALSE,
  sim.alpha = 0.95,
  center = TRUE,
  knots = 35,
  p = 3,
  m = 2,
  lambda = NULL,
  alpha = 1,
  search.grid = TRUE,
  search.length = 100,
  method = "L-BFGS-B",
  lower = -20,
  upper = 20,
  control = NULL,
  periodicity = FALSE
)

Value

A list with components

  1. Yhat - if Y.pred is specified, the smooth version of Y.pred; otherwise, the smooth version of Y.

  2. scores - matrix of estimated principal component scores, one row per subject

  3. mu - mean function

  4. npc - number of principal components

  5. efunctions - matrix of estimated eigenvectors, one column per principal component

  6. evalues - vector of eigenvalues

  7. pve - The percent variance explained by the returned number of PCs

If var == TRUE, the following additional components are also returned:

  1. sigma2 - estimate of the error variance

  2. VarMats - list of covariance function estimates, one for each subject

  3. diag.var - matrix containing the diagonals of each matrix in VarMats

  4. crit.val - list of estimated quantiles; only returned if simul == TRUE
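
As a sketch of how these components might be accessed after a fit with var = TRUE and simul = TRUE (the toy data below are illustrative only):

set.seed(1)
t <- seq(0, 1, length.out = 101)
Y <- matrix(rnorm(20 * 101), 20, 101)  # toy noise-only data, 20 curves
fit <- fpca.face(Y, argvals = t, var = TRUE, simul = TRUE, sim.alpha = 0.95)
fit$evalues / sum(fit$evalues) # share of the retained variance per component
dim(fit$scores)                # one row of scores per subject
sqrt(fit$sigma2)               # estimated error standard deviation (var = TRUE)
fit$crit.val                   # estimated quantiles (simul = TRUE)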

Arguments

Y, ydata

the user must supply either Y, a matrix of functions observed on a regular grid, or a data frame ydata representing irregularly observed functions. See Details.

Y.pred

if desired, a matrix of functions to be approximated using the FPC decomposition.

argvals

numeric; vector of argument values (e.g., time points) at which the functions are observed.

pve

proportion of variance explained: used to choose the number of principal components.

npc

prespecified number of smooth principal components to extract; if NULL (the default), the number of components is chosen so that the proportion of variance explained reaches pve.

var

logical; should an estimate of standard error be returned?

simul

logical; if TRUE, curves will be simulated using Monte Carlo to obtain an estimate of the sim.alpha quantile at each argument value; ignored if var == FALSE

sim.alpha

numeric; if simul == TRUE, the quantile to estimate at each argument value; ignored if var == FALSE

center

logical; center Y so that its column-means are 0? Defaults to TRUE

knots

number of knots to use, or a vector of knot locations; defaults to 35

p

integer; the degree of the B-spline basis functions to use

m

integer; the order of the difference penalty to use

lambda

smoothing parameter; if not specified, the smoothing parameter is chosen using optim or a grid search

alpha

numeric; tuning parameter for GCV; see parameter gamma in gam

search.grid

logical; should a grid search be used to find lambda? Otherwise, optim is used

search.length

integer; length of grid to use for grid search for lambda; ignored if search.grid is FALSE

method

method to use; see optim

lower

see optim

upper

see optim

control

see optim

periodicity

logical; use a periodic spline basis? Defaults to FALSE.
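
To illustrate how the smoothing-parameter arguments interact, a brief sketch (the toy data and the object names fit_grid, fit_optim, fit_fixed are illustrative only):

set.seed(1)
t <- seq(0, 1, length.out = 101)
Y <- matrix(rnorm(20 * 101), 20, 101)  # toy noise-only data

# default: lambda chosen by a grid search over search.length candidate values
fit_grid <- fpca.face(Y, argvals = t, search.grid = TRUE, search.length = 200)

# alternatively, choose lambda by numerical optimization with optim()
fit_optim <- fpca.face(Y, argvals = t, search.grid = FALSE,
                       method = "L-BFGS-B", lower = -20, upper = 20)

# or fix the smoothing parameter directly
fit_fixed <- fpca.face(Y, argvals = t, lambda = 1)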

Author

Luo Xiao

References

Xiao, L., Li, Y., and Ruppert, D. (2013). Fast bivariate P-splines: the sandwich smoother, Journal of the Royal Statistical Society: Series B, 75(3), 577-599.

Xiao, L., Ruppert, D., Zipunnikov, V., and Crainiceanu, C. (2016). Fast covariance estimation for high-dimensional functional data. Statistics and Computing, 26, 409-421. DOI: 10.1007/s11222-014-9485-x.

See Also

fpca.sc for another covariance-estimate based smoothing of Y; fpca2s and fpca.ssvd for two SVD-based smoothings.

Examples

#### settings
I <- 50 # number of subjects
J <- 3000 # dimension of the data
t <- (1:J)/J # a regular grid on [0,1]
N <- 4 # number of eigenfunctions
sigma <- 2 # standard deviation of the random noise
lambdaTrue <- c(1,0.5,0.5^2,0.5^3) # True eigenvalues

case <- 1
### True Eigenfunctions

if(case==1) phi <- sqrt(2)*cbind(sin(2*pi*t),cos(2*pi*t),
                                sin(4*pi*t),cos(4*pi*t))
if(case==2) phi <- cbind(rep(1,J),sqrt(3)*(2*t-1),
                          sqrt(5)*(6*t^2-6*t+1),
                         sqrt(7)*(20*t^3-30*t^2+12*t-1))

###################################################
########     Generate Data            #############
###################################################
xi <- matrix(rnorm(I*N),I,N)
xi <- xi %*% diag(sqrt(lambdaTrue))
X <- xi %*% t(phi) # true curves, of size I by J
Y <- X + sigma*matrix(rnorm(I*J),I,J) # noisy observations

results <- fpca.face(Y,center = TRUE, argvals=t,knots=100,pve=0.99)

# proportion of the retained variance explained by each PC
evalues <- results$evalues
pve_vec <- evalues / sum(evalues)

###################################################
####               FACE                ########
###################################################
Phi <- results$efunctions
eigenvalues <- results$evalues

# flip signs so each estimated eigenfunction aligns with the corresponding truth
for(k in 1:N){
  if(Phi[,k] %*% phi[,k] < 0)
    Phi[,k] <- -Phi[,k]
}

### plot estimated versus true eigenfunctions
### (estimated eigenvectors are rescaled by sqrt(J) to the eigenfunction scale)
par(mfrow=c(N/2,2))
seq <- (1:(J/10))*10 # thin the grid for plotting
for(k in 1:N){
  plot(t[seq],Phi[seq,k]*sqrt(J),type="l",lwd = 3,
       ylim = c(-2,2),col = "red",
       ylab = paste("Eigenfunction ",k,sep=""),
       xlab="t",main="FACE")

  lines(t[seq],phi[seq,k],lwd = 2, col = "black")
}
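
### As a follow-up sketch (not part of the original example): compare the
### FACE fit in results$Yhat with the true curve for one subject
i <- 1
par(mfrow = c(1,1))
plot(t[seq], Y[i,seq], pch = ".", col = "grey",
     xlab = "t", ylab = "", main = paste("Subject", i))
lines(t[seq], X[i,seq], lwd = 2, col = "black")          # true curve
lines(t[seq], results$Yhat[i,seq], lwd = 2, col = "red") # FACE smooth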
