Usage
CCA.permute(x, z, typex=c("standard", "ordered"), typez=c("standard", "ordered"),
            penaltyxs=NULL, penaltyzs=NULL, niter=3, v=NULL, trace=TRUE, nperms=25,
            standardize=TRUE, chromx=NULL, chromz=NULL, upos=FALSE, uneg=FALSE,
            vpos=FALSE, vneg=FALSE, outcome=NULL, y=NULL, cens=NULL)
Arguments
x
Data matrix; samples are rows and columns are features.
z
Data matrix; samples are rows and columns are
features. Note that x and z must have the same number of rows, but
may (and generally will) have different numbers of columns.
typex
Are the columns of x unordered (typex="standard") or
ordered (typex="ordered")? If "standard", then a lasso penalty is
applied to u, to enforce sparsity. If "ordered" (generally used
for CGH data), then a fused lasso penalty is applied, to enf
typez
Are the columns of z unordered (typez="standard") or
ordered (typez="ordered")? If "standard", then a lasso penalty is
applied to v, to enforce sparsity. If "ordered" (generally used
for CGH data), then a fused lasso penalty is applied, to
penaltyxs
The set of x penalties to be considered. If
typex="standard", then the L1 bound on u is penaltyxs*sqrt(ncol(x)). If "ordered",
then it's the lambda for the fused lasso penalty. The user
can specify a single value or a vector of values.
penaltyzs
The set of z penalties to be considered. If
typez="standard", then the L1 bound on v is penaltyzs*sqrt(ncol(z)). If "ordered",
then it's the lambda for the fused lasso penalty. The user
can specify a single value or a vector of values.
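
The relationship between the penalty values and the L1 bounds for the "standard" type can be sketched in base R as follows. The matrices, their dimensions, and the penalty grid are made up for illustration; only the formula penalty*sqrt(number of columns) comes from the descriptions above.

```r
# Toy data (hypothetical): 20 samples, 50 features in x and 200 in z.
set.seed(1)
x <- matrix(rnorm(20 * 50), nrow = 20)
z <- matrix(rnorm(20 * 200), nrow = 20)

# Candidate penalties, as they might be passed to penaltyxs and penaltyzs.
penaltyxs <- c(0.1, 0.3, 0.5)
penaltyzs <- c(0.1, 0.3, 0.5)

# Implied L1 bounds on u and v when typex and typez are "standard".
l1_bound_u <- penaltyxs * sqrt(ncol(x))
l1_bound_v <- penaltyzs * sqrt(ncol(z))

print(data.frame(penaltyxs, l1_bound_u, penaltyzs, l1_bound_v))
```

Because the bound scales with the square root of the number of columns, the same penalty value implies a looser absolute bound for the wider matrix.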
niter
How many iterations should be performed each time CCA is
called? Default is 3, since an approximate estimate of u and v is
acceptable in this case, and otherwise this function can be quite time-consuming.
v
The first K columns of the v matrix of the SVD of X'Z. If
NULL, then the SVD of X'Z will be computed inside this function. However, if
you plan to run this function multiple times, then save a copy of
this argument so that it does not need to
nperms
How many times should the data be permuted? Default is
25. A large value of nperms is very important here, since the
formula for computing the z-statistics requires a standard deviation
estimate for the correlations obtained via permutation, w
standardize
Should the columns of X and Z be centered (to have mean zero)
and scaled (to have standard deviation 1)? Default is TRUE.
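
As an illustration of what centering and scaling mean here, the following base R sketch standardizes the columns of a toy matrix with scale(). Whether CCA.permute uses scale() internally is not stated in this page, so treat this as an assumption about an equivalent operation, not as the function's implementation.

```r
# Toy matrix (hypothetical): 20 samples, 5 features with mean 3 and sd 2.
set.seed(1)
x <- matrix(rnorm(20 * 5, mean = 3, sd = 2), nrow = 20)

# Center each column to mean zero and scale it to standard deviation 1.
x_std <- scale(x, center = TRUE, scale = TRUE)

# Check the result column by column.
col_means <- colMeans(x_std)
col_sds <- apply(x_std, 2, sd)
print(round(col_means, 10))
print(round(col_sds, 10))
```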
chromx
Used only if typex="ordered"; a vector of length ncol(x)
that allows you to specify which chromosome each CGH spot is on. If
NULL, then it is assumed that all CGH spots are on the same chromosome.
chromz
Used only if typez="ordered"; a vector of length ncol(z)
that allows you to specify which chromosome each CGH spot is on. If
NULL, then it is assumed that all CGH spots are on the same chromosome.
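
A chromosome vector has one entry per column of the corresponding matrix. The sketch below builds such vectors in base R for made-up data; the spot counts and chromosome labels are hypothetical.

```r
# Toy CGH-like data (hypothetical): 8 samples, 10 spots in x and 7 in z.
set.seed(1)
x <- matrix(rnorm(8 * 10), nrow = 8)
z <- matrix(rnorm(8 * 7), nrow = 8)

# First 6 spots of x on chromosome 1, remaining 4 on chromosome 2.
chromx <- rep(c(1, 2), times = c(6, 4))

# All spots of z on a single chromosome.
chromz <- rep(1, ncol(z))

# Each vector must have one entry per column of its matrix.
stopifnot(length(chromx) == ncol(x), length(chromz) == ncol(z))
```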
upos
If TRUE, then require all elements of u to be positive in
sign. Default is FALSE. Can only be used if typex is "standard".
uneg
If TRUE, then require all elements of u to be negative in
sign. Default is FALSE. Can only be used if typex is "standard".
vpos
If TRUE, then require all
elements of v to be positive in sign. Default is FALSE. Can only be used if typez is "standard".
vneg
If TRUE, then require all
elements of v to be negative in sign. Default is FALSE. Can only be used if typez is "standard".
outcome
If you would like to incorporate a phenotype into CCA
analysis - that is, you wish to find features that are correlated
across the two data sets and also correlated
with a phenotype - then use one of "survival", "multiclass", or
"quantitat
y
If outcome is not NULL, then this is a vector of phenotypes -
one for each row of x and z. If outcome is "survival" then these are
survival times; must be non-negative. If outcome is "multiclass"
then these are class labels. Default NULL.
cens
If outcome is "survival" then these are censoring statuses
for each observation: 1 means complete (event observed), 0 means censored. Default NULL.
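
A minimal call using the arguments documented above might look like the following. The data are simulated, the penalty grid is arbitrary, and the sketch assumes the function is provided by the PMA package, which this page does not state; the call is skipped if that package is not installed.

```r
# Simulated data (hypothetical): 30 samples, 40 features in x and 60 in z.
set.seed(1)
n <- 30
x <- matrix(rnorm(n * 40), nrow = n)
z <- matrix(rnorm(n * 60), nrow = n)

# x and z must have the same number of rows.
stopifnot(nrow(x) == nrow(z))

# An arbitrary grid of candidate penalties, used for both data sets.
penalties <- c(0.2, 0.4, 0.6)

# Assumption: CCA.permute lives in the PMA package.
if (requireNamespace("PMA", quietly = TRUE)) {
  perm_out <- PMA::CCA.permute(x, z,
                               typex = "standard", typez = "standard",
                               penaltyxs = penalties, penaltyzs = penalties,
                               nperms = 25, trace = FALSE)
  print(perm_out)
}
```

Here nperms is left at its default of 25 only to keep the run short; the description of nperms above recommends a large value in practice.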