rcc(X,
Y,
ncomp = 2,
method = "ridge", #choose between c("ridge", "shrinkage")
lambda1 = 0,
lambda2 = 0)
X: NAs are allowed.
Y: NAs are allowed.
method: if "ridge", lambda1 and lambda2 need to be supplied (see also our function tune.rcc); if "shrinkage", parameters are directly estimated with Strimmer's formula, see below and reference.
lambda1, lambda2: default lambda1=lambda2=0. Only used if method="ridge".
rcc returns an object of class "rcc", a list that contains the following components:
The cancor function performs the core of computations, but additional tools are required to deal with highly correlated (nearly collinear) data sets or, for example, data sets with more variables than units.
The rcc function, the regularized version of CCA, is one way to deal with this problem by including a regularization step in the computations of CCA. Such a regularization in this context was first proposed by Vinod (1976), then developed by Leurgans et al. (1993).
It consists in regularizing the empirical covariance matrices of $X$ and $Y$ by adding a multiple of the identity matrix, that is, $\mathrm{Cov}(X) + \lambda_1 I$ and $\mathrm{Cov}(Y) + \lambda_2 I$.
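The effect of adding a multiple of the identity can be sketched in base R. This is only an illustration on simulated data with more variables than units; the sizes and the value of `lambda1` are arbitrary assumptions, and the code does not reproduce the internals of rcc.

```r
# Illustration only: simulated data with more variables (p) than units (n)
set.seed(1)
n <- 10
p <- 25
X <- matrix(rnorm(n * p), nrow = n, ncol = p)

# The empirical covariance matrix is p x p but has rank at most n - 1,
# so it is singular and cannot be inverted
Cxx <- cov(X)
rank_raw <- qr(Cxx)$rank

# Ridge regularization: add lambda1 times the identity matrix
lambda1 <- 0.1  # arbitrary value chosen for the illustration
Cxx_reg <- Cxx + lambda1 * diag(p)
rank_reg <- qr(Cxx_reg)$rank

# The regularized matrix has full rank and can be inverted
Cxx_reg_inv <- solve(Cxx_reg)
```

Every eigenvalue of the regularized matrix is shifted up by `lambda1`, which is why the inversion succeeds even though the raw covariance matrix is rank deficient.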
When lambda1=0 and lambda2=0, rcc performs a classical CCA, if possible (i.e. when $n > p+q$, where $n$ is the number of units and $p$ and $q$ are the numbers of variables in $X$ and $Y$).
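The link between the regularized and the classical analysis can be checked with a small base R sketch. The helper `cca_sketch` below is hypothetical (it is not part of mixOmics and is not claimed to be how rcc is implemented): it computes canonical correlations as the singular values of the whitened cross-covariance matrix, using the regularized covariance matrices. With both parameters at zero it should agree with `stats::cancor` on data where $n > p+q$.

```r
# Hypothetical helper, for illustration only (not the mixOmics implementation)
cca_sketch <- function(X, Y, lambda1 = 0, lambda2 = 0) {
  X <- as.matrix(X)
  Y <- as.matrix(Y)
  # Regularized covariance matrices and cross-covariance
  Cxx <- cov(X) + lambda1 * diag(ncol(X))
  Cyy <- cov(Y) + lambda2 * diag(ncol(Y))
  Cxy <- cov(X, Y)
  # Cholesky factors: Cxx = t(Rx) %*% Rx, Cyy = t(Ry) %*% Ry
  Rx <- chol(Cxx)
  Ry <- chol(Cyy)
  # Whitened cross-covariance: t(Rx)^{-1} Cxy Ry^{-1}
  K <- backsolve(Rx, Cxy, transpose = TRUE) %*% solve(Ry)
  # Its singular values are the (regularized) canonical correlations
  svd(K)$d
}

# Simulated data with n much larger than p + q
set.seed(42)
n <- 50
X <- matrix(rnorm(n * 3), n, 3)
Y <- X %*% matrix(rnorm(6), 3, 2) + matrix(rnorm(n * 2), n, 2)

classic <- cca_sketch(X, Y)                       # lambda1 = lambda2 = 0
ridge <- cca_sketch(X, Y, lambda1 = 0.5, lambda2 = 0.5)
reference <- cancor(X, Y)$cor                     # classical CCA from stats
```

In this sketch, increasing either parameter enlarges the denominators of the correlation ratio, so the leading regularized correlation cannot exceed the classical one.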
The shrinkage estimates (method = "shrinkage") can be used to bypass tune.rcc when choosing the shrinkage parameters, since tuning can be long and costly to compute with very large data sets. Note that tune.rcc (which uses cross-validation) and the shrinkage parameters (which use the formula from Schäfer and Strimmer) may output different results.
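The two routes can be compared side by side. The sketch below assumes that the mixOmics package is installed and that tune.rcc accepts `grid1`, `grid2`, `validation` and `plot` arguments and returns `opt.lambda1` and `opt.lambda2`, as in the package versions we are aware of; the grid values are arbitrary and kept deliberately small.

```r
library(mixOmics)

data(nutrimouse)
X <- nutrimouse$lipid
Y <- nutrimouse$gene

# Route 1: cross-validation over a small, arbitrary grid (leave-one-out)
grid <- seq(0.01, 0.2, length.out = 3)
tuned <- tune.rcc(X, Y, grid1 = grid, grid2 = grid,
                  validation = "loo", plot = FALSE)
cv_lambda <- c(tuned$opt.lambda1, tuned$opt.lambda2)

# Route 2: analytical shrinkage estimates, no grid search needed
shrunk <- rcc(X, Y, ncomp = 3, method = "shrinkage")
shrink_lambda <- shrunk$lambda

# The two sets of parameters are not expected to coincide
cv_lambda
shrink_lambda
```

The cross-validated values can only take values from the supplied grid, whereas the shrinkage estimates are computed directly from the data, which is one reason the two may differ.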
Note: when method = "shrinkage", the input data are centered and scaled for the estimation of the shrinkage parameters and the calculation of the regularised variance-covariance matrices in rcc.
Missing values can be estimated by reconstituting the data matrix with the nipals function. Otherwise, missing values are handled by casewise deletion in the rcc function.
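What casewise deletion means for paired data sets can be shown with a few lines of base R. The toy matrices below are made up for the illustration, and the code only mimics the idea (dropping every unit that has a missing value in either data set); it is not taken from the rcc source.

```r
# Toy data: 5 units, one NA in X (row 3) and one NA in Y (row 2)
X <- matrix(c(1, 2, NA, 4, 5,
              2, 1, 4, 3, 6), ncol = 2)
Y <- matrix(c(3, NA, 1, 2, 5), ncol = 1)

# Keep only the units that are fully observed in both data sets
keep <- complete.cases(X) & complete.cases(Y)
X_cc <- X[keep, , drop = FALSE]
Y_cc <- Y[keep, , drop = FALSE]

# Rows 2 and 3 are dropped from both matrices, so the pairing is preserved
nrow(X_cc)
```

Because whole units are removed, a few scattered missing values can discard a large share of the data, which is the motivation for estimating them beforehand instead.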
Gonzalez, I., Dejean, S., Martin, P., Goncalves, O., Besse, P., and Baccini, A. (2009). Highlighting relationships between heterogeneous biological data through graphical displays based on regularized canonical correlation analysis. Journal of Biological Systems, 17(02), 173-199.
Leurgans, S. E., Moyeed, R. A. and Silverman, B. W. (1993). Canonical correlation analysis when the data are curves. Journal of the Royal Statistical Society. Series B 55, 725-740.
Vinod, H. D. (1976). Canonical ridge and econometrics of joint production. Journal of Econometrics 6, 129-137.
Opgen-Rhein, R. and Strimmer, K. (2007). Accurate ranking of differentially expressed genes by a distribution-free shrinkage approach. Statist. Appl. Genet. Mol. Biol. 6:9. (http://www.bepress.com/sagmb/vol6/iss1/art9/)
Schäfer, J. and Strimmer, K. (2005). A shrinkage approach to large-scale covariance estimation and implications for functional genomics. Statist. Appl. Genet. Mol. Biol. 4:32. (http://www.bepress.com/sagmb/vol4/iss1/art32/)
summary, tune.rcc, plot.rcc, plotIndiv, plotVar, cim, network and http://www.mixOmics.org for more details.
library(mixOmics) # provides rcc and the example data sets
## Classic CCA
data(linnerud)
X <- linnerud$exercise
Y <- linnerud$physiological
linn.res <- rcc(X, Y)
## Regularized CCA
data(nutrimouse)
X <- nutrimouse$lipid
Y <- nutrimouse$gene
nutri.res1 <- rcc(X, Y, ncomp = 3, lambda1 = 0.064, lambda2 = 0.008)
## using shrinkage parameters
nutri.res2 <- rcc(X, Y, ncomp = 3, method = 'shrinkage')
nutri.res2$lambda # the shrinkage parameters