EDR (version 0.6-7)

edrcv: Risk assessment by Cross-Validation

Description

In addition to estimating the effective dimension reduction (EDR) space (see also function edr), this function uses cross-validation to estimate the Mean Squared Error of Prediction (MSEP) and the Mean Absolute Error of Prediction (MAEP) obtained when predicting with the estimated EDR space. Estimates of the regression function are produced using function sm.regression from package sm.

Usage

edrcv(x, y, m = 2, rho0 = 1, h0 = NULL, ch = exp(0.5/max(4, (dim(x)[2]))), crhomin = 1, 
      cm = 4, method = "Penalized", fit = "sm", basis = "Quadratic", cw = NULL, 
      graph = FALSE, show = 1, trace = FALSE, seed = 1, cvsize = 1, m0 = min(m, 2), 
      hsm = NULL)
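
The default for ch in the signature above depends only on the number of columns d of x. The following sketch evaluates that default expression for a few values of d. It then assumes, purely for illustration, that the bandwidth is multiplied by ch once per iteration; the starting value h0 = 1 and the number of steps are hypothetical and not taken from the package.

```r
# Default value of ch as written in the usage: exp(0.5 / max(4, d)),
# where d = dim(x)[2] is the number of covariates.
default_ch <- function(d) exp(0.5 / max(4, d))

d_values <- c(2, 4, 10, 50)
ch_values <- sapply(d_values, default_ch)
print(data.frame(d = d_values, ch = round(ch_values, 4)))

# Assumption for illustration: the bandwidth grows by the factor ch in
# every iteration, i.e. h_k = h0 * ch^k. h0 = 1 and 10 steps are hypothetical.
h0 <- 1
k <- 0:10
h_k <- h0 * default_ch(10)^k
print(round(h_k, 3))
```

For d <= 4 the default factor is constant at exp(0.125); for larger d it moves towards 1, so under the assumed schedule the bandwidth grows more slowly in higher dimensions.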

Arguments

x

x specifies the design matrix, dimension (n,d)

y

y specifies the response, length n.

m

Rank of matrix M in case of method="Penalized", not used for the other methods.

rho0

Initial value for the regularization parameter \(\rho\).

h0

Initial bandwidth.

ch

Factor for increasing \(h\) with iterations.

crhomin

Factor to in(de)crease the default value of rhomin. This is just added to explore properties of the algorithms. Defaults to 1.

cm

Factor in the definition of \(\Pi_k=C_m*\rho_k^2 I_L + \hat{M}_{k-1}\). Only used if method="Penalized".

method

Specifies the algorithm to use. The default method="Penalized" corresponds to the algorithm proposed in ... (2006). method="HJPS" corresponds to the original algorithm from Hristache et al. (2001), while method="HJPS2" specifies a modification (correction) of this algorithm.

fit

Specifies the method for estimating and predicting values of the link function. This can either be fit="sm" specifying use of the sm package or fit="direct" specifying the use of a local linear smoother. In case of m0>2 fit="direct" is used due to restrictions in the sm package.

basis

Specifies the set of basis functions. Options are basis="Quadratic" (default) and basis="Linear".

cw

Another regularization parameter that secures identifiability of a minimum number of local gradient directions. Defaults to 1/d. Has to be positive or NULL.

graph

If graph==TRUE intermediate results are plotted.

show

If graph==TRUE the parameter show determines the dimension of the EDR that is to be used when plotting intermediate results. If trace=TRUE and !is.null(R) it determines the dimension of the EDR when computing the risk values.

trace

If trace=TRUE, additional diagnostics are provided for each iteration. These include the current values, at iteration \(k\), of the regularization parameter \(\rho_k\) and the bandwidth \(h_k\), the normalized cumulative sums of eigenvalues of \(\hat{B}\) and, if !is.null(R), two distances between the true EDR space (specified in \(R\)) and the estimated one.

seed

Seed for generating random groups for CV

cvsize

Group size k in leave-k-out CV.

m0

Dimension of the dimension reduction space to use when fitting the data. Should be either 1 or 2.

hsm

If is.null(hsm), the bandwidth used by sm.regression for smoothing within the EDR space is chosen by cross-validation within sm.regression when needed. Alternatively, a grid of bandwidths may be specified. In that case the bandwidth for sm.regression is chosen from the grid as the one that minimizes the estimated mean absolute error of prediction.
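
The grid-based choice described for hsm can be illustrated without the EDR or sm packages. The sketch below is not the package's implementation. It uses a hand-written Nadaraya-Watson smoother with a Gaussian kernel, a simulated one-dimensional covariate z standing in for a projection of x, leave-one-out cross-validation, and a hypothetical grid hsm_grid. It picks the grid value with the smallest estimated mean absolute error of prediction.

```r
set.seed(1)
n <- 80
z <- sort(runif(n, -2, 2))            # stands in for a 1-d projection of x
y <- sin(2 * z) + rnorm(n, sd = 0.3)

# Nadaraya-Watson estimate at z0 with Gaussian kernel and bandwidth h
nw_predict <- function(z0, z, y, h) {
  w <- dnorm((z - z0) / h)
  sum(w * y) / sum(w)
}

# Leave-one-out estimate of the mean absolute error of prediction for one h
cv_mae <- function(h) {
  res <- sapply(seq_len(n), function(i) {
    y[i] - nw_predict(z[i], z[-i], y[-i], h)
  })
  mean(abs(res))
}

hsm_grid <- c(0.05, 0.1, 0.2, 0.4, 0.8, 1.6)   # hypothetical grid
mae_of_h <- sapply(hsm_grid, cv_mae)
hsm_opt <- hsm_grid[which.min(mae_of_h)]

print(round(mae_of_h, 3))
print(hsm_opt)
```

In this sketch mae_of_h and hsm_opt play roles analogous to the cvmaeofh and hsmopt components listed under Value.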

Value

Object of class "edr" with components:

x

The design matrix.

y

The values of the response.

bhat

Matrix \(\hat{B}\) characterizing the effective dimension reduction space. For a specified dimension m, \(\hat{B}_m = \hat{B} O_m\), with \(\hat{B}^T \hat{B}= O \Lambda O^T\) being the eigenvalue decomposition of \(\hat{B}^T \hat{B}\), specifies the projection to the m-dimensional subspace that provides the best approximation.

fhat

A highly oversmoothed estimate of the values of the regression function at the design points. This is provided only as a backup in case package sm is not installed.

cumlam

Cumulative amount of information explained by the first components of \(\hat{B}\).

nmean

Mean numbers of observations used in each iteration.

h

Final bandwidth

rho

Final value of \(\rho\)

h0

Initial bandwidth

rho0

Initial value of \(\rho\)

cm

The factor cm

call

Arguments of the call to edrcv

cvres

Residuals from cross-validation.

cvmseofh

Estimates of MSEP for bandwidths hsm

cvmaeofh

Estimates of MAEP for bandwidths hsm

cvmse

Estimate of MSEP

cvmae

Estimate of MAEP

hsm

Set of bandwidths specified for use with sm.regression

hsmopt

Bandwidth selected for use with sm.regression if hsm was specified.
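
The relation between bhat, the projection \(\hat{B}_m\) and cumlam can be sketched in base R. The matrix B below is random and purely illustrative; in practice it would be the bhat component of the returned object. Two further assumptions are made: \(O_m\) is taken to be the first m columns of \(O\), and cumlam is taken to be the cumulative sum of eigenvalues divided by their total. The package may define or scale these differently.

```r
set.seed(1)
B <- matrix(rnorm(6 * 4), nrow = 6, ncol = 4)   # hypothetical stand-in for bhat

# Eigenvalue decomposition B^T B = O Lambda O^T
ed <- eigen(crossprod(B), symmetric = TRUE)
O <- ed$vectors
lambda <- ed$values                              # sorted in decreasing order

# Normalized cumulative sums of the eigenvalues (assumed form of cumlam)
cumlam <- cumsum(lambda) / sum(lambda)

# Projection for a chosen dimension m: B_m = B O_m,
# with O_m assumed to be the first m columns of O
m <- 2
B_m <- B %*% O[, seq_len(m), drop = FALSE]

print(round(cumlam, 3))
print(dim(B_m))
```

Because the columns of O are orthonormal eigenvectors, the columns of B_m are mutually orthogonal and their squared lengths equal the m largest eigenvalues.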

Details

This function performs a leave-k-out cross-validation to estimate the risk in terms of Mean Squared Error of Prediction (MSEP) and Mean Absolute Error of Prediction (MAEP) when using function edr to estimate an effective dimension reduction space of dimension m0 and using this estimated space to predict values of the response. Smoothing within the dimension reduction space is performed using the function sm.regression from package sm. The bandwidth for sm.regression is chosen by Cross-Validation.
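
The leave-k-out scheme described above can be sketched in base R. This is not the code used by edrcv: the linear model fitted with lm stands in for the combination of edr and sm.regression, the data are simulated, and the way groups are formed (a random permutation split into consecutive blocks of size cvsize) is an assumption made for illustration.

```r
set.seed(1)                       # plays the role of the seed argument
n <- 60
d <- 3
x <- matrix(rnorm(n * d), n, d)
y <- as.vector(x %*% c(1, -1, 0.5)) + rnorm(n, sd = 0.5)
dat <- data.frame(y = y, x)       # columns y, X1, X2, X3

cvsize <- 5                       # group size k in leave-k-out CV
groups <- split(sample(n), ceiling(seq_len(n) / cvsize))

cvres <- numeric(n)
for (idx in groups) {
  # Fit on all observations except the held-out group
  fit <- lm(y ~ ., data = dat[-idx, ])          # stand-in for edr + smoothing
  # Residuals of the predictions for the held-out group
  cvres[idx] <- dat$y[idx] - predict(fit, newdata = dat[idx, ])
}

cvmse <- mean(cvres^2)            # estimate of MSEP
cvmae <- mean(abs(cvres))         # estimate of MAEP
print(c(cvmse = cvmse, cvmae = cvmae))
```

With cvsize = 1 the same loop reduces to leave-one-out cross-validation, at the cost of one model fit per observation.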

References

M. Hristache, A. Juditsky, J. Polzehl and V. Spokoiny (2001). Structure adaptive approach for dimension reduction, The Annals of Statistics. Vol.29, pp. 1537-1566.

J. Polzehl, S. Sperlich (2008). A Note on Structural Adaptive Dimension Reduction, Journal of Statistical Computation and Simulation, DOI: 10.1080/00949650801959699

See Also

edr, plot.edr, summary.edr, print.edr, edr.R, predict.edr

Examples

# NOT RUN {
require(EDR)
# }
# NOT RUN {
demo(edr_ex4)
# }
