mixOmics (version 5.0-4)

wrapper.sgcca: mixOmics wrapper for Sparse Generalised Canonical Correlation Analysis (sgcca)

Description

Wrapper function to perform Sparse Generalised Canonical Correlation Analysis (sGCCA), a generalised approach for the integration of multiple datasets. For more details, see help(sgcca) in the RGCCA package.

Usage

wrapper.sgcca(data, design = 1 - diag(length(data)),
  penalty = rep(1, length(data)),
  ncomp = rep(1, length(data)), 
  scheme = "centroid", 
  scale = TRUE, 
  init = "svd", 
  bias = TRUE, 
  tol = .Machine$double.eps, 
  verbose = FALSE)

Arguments

data
a list of data sets (called 'blocks') measured on the same samples. Each block should be arranged as samples x variables (samples in rows). NAs are not allowed.
design
numeric matrix of size (number of blocks) x (number of blocks) with only 0 or 1 values. A value of 1 (0) indicates a relationship (no relationship) between the blocks to be modelled using sGCCA.
penalty
numeric vector with one entry per block in data. Each penalty parameter is applied to the corresponding block and takes a value between 0 (no variable selected) and 1 (all variables included).
ncomp
numeric vector with one entry per block in data, giving the number of components to include in the model for each block (the value need not be the same for every block).
scheme
Either "horst", "factorial" or "centroid" (Default: "centroid").
scale
If scale = TRUE, each block is standardized to zero mean and unit variance (default: TRUE).
init
Mode of initialization used in the sGCCA algorithm, either Singular Value Decomposition ("svd") or random ("random") (default: "svd").
bias
A logical value selecting the biased or unbiased estimator of the var/cov (defaults to TRUE).
tol
Convergence stopping value.
verbose
If set to TRUE, reports progress during computation.
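
The constraints on data, design, penalty and ncomp described above can be checked before calling the wrapper. The following is a minimal base R sketch under stated assumptions: the blocks are made-up toy matrices, and is_valid_design is a hypothetical helper written for illustration, not a mixOmics function.

```r
# Toy blocks: 3 data sets measured on the same 10 samples (values are made up)
set.seed(1)
blocks <- list(
  X1 = matrix(rnorm(10 * 5), nrow = 10),
  X2 = matrix(rnorm(10 * 4), nrow = 10),
  X3 = matrix(rnorm(10 * 2), nrow = 10)
)

# All blocks must share the same samples (rows) and contain no NAs
same_rows  <- length(unique(sapply(blocks, nrow))) == 1
no_missing <- !any(sapply(blocks, anyNA))

# Default design: every block connected to every other block
design_full <- 1 - diag(length(blocks))

# Custom design: blocks 1 and 2 are each connected to block 3 only
design_star <- matrix(c(0, 0, 1,
                        0, 0, 1,
                        1, 1, 0), nrow = 3, byrow = TRUE)

# Hypothetical helper (not part of mixOmics): square, blocks x blocks, 0/1 only
is_valid_design <- function(d, n_blocks) {
  is.matrix(d) && all(dim(d) == n_blocks) && all(d %in% c(0, 1))
}

is_valid_design(design_full, length(blocks))
is_valid_design(design_star, length(blocks))

# One penalty and one ncomp entry per block
penalty <- c(0.3, 0.5, 1)
ncomp   <- c(2, 2, 1)
stopifnot(length(penalty) == length(blocks),
          length(ncomp) == length(blocks))
```

Running such checks first tends to give clearer error messages than a failure inside the fitting routine.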

Value

wrapper.sgcca returns an object of class "sgcca", a list that contains the following components:

  • data: the input data set (as a list).
  • design: the input design.
  • variates: the sgcca components.
  • loadings: the loadings for each block data set (outer weight vector).
  • loadings.star: the loadings, standardised.
  • penalty: the input penalty parameter.
  • scheme: the input scheme.
  • ncomp: the number of components on each block.
  • crit: the convergence criterion.
  • AVE: indicators of model quality based on the Average Variance Explained (AVE): AVE (for one block), AVE (outer model), AVE (inner model).
  • names: list containing the names to be used for individuals and variables.

More details can be found in the references.

Details

This wrapper function performs sGCCA (see RGCCA) with 1, ..., ncomp components on each block data set. A supervised or unsupervised model can be run. For a supervised model, the outcome factor should be converted with the unmap function and the result supplied as one of the blocks in data. More details can be found in the documentation of the RGCCA package.
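
To illustrate the supervised case, the sketch below builds the kind of 0/1 indicator matrix that a factor is turned into before being used as a block. It assumes that unmap produces a comparable samples x classes matrix. The code uses base R's model.matrix as a stand-in rather than unmap itself, and the diet labels are made up.

```r
# Toy outcome factor (made-up labels, 6 samples in 3 classes)
diet <- factor(c("coc", "fish", "lin", "coc", "fish", "lin"))

# One row per sample, one column per class; a 1 marks class membership.
# "- 1" drops the intercept so that every level gets its own column.
Y <- model.matrix(~ diet - 1)
colnames(Y) <- levels(diet)

Y
rowSums(Y)  # each sample belongs to exactly one class
```

A matrix of this shape has the same number of rows as the other blocks, which is what allows it to be placed in the data list alongside them.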

References

Tenenhaus A. and Tenenhaus M. (2011), Regularized Generalized Canonical Correlation Analysis, Psychometrika, Vol. 76, Nr 2, pp 257-284.

Tenenhaus A., Philippe C., Guillemot V., Le Cao K-A., Grill J., Frouin V. (2013), Variable Selection For Generalized Canonical Correlation Analysis. (in revision)

See Also

sgcca, plotIndiv, plotVar, wrapper.rgcca, rgcca and http://www.mixOmics.org for more details.

Examples

library(mixOmics)  # provides wrapper.sgcca, unmap and the nutrimouse data
data(nutrimouse)
# need to unmap the Y factor diet
Y = unmap(nutrimouse$diet)
data = list(nutrimouse$gene, nutrimouse$lipid, Y)
# with this design, gene expression and lipids are connected to the diet factor
# design = matrix(c(0,0,1,
#                   0,0,1,
#                   1,1,0), ncol = 3, nrow = 3, byrow = TRUE)

# with this design, gene expression and lipids are connected to the diet factor
# and gene expression and lipids are also connected
design = matrix(c(0,1,1,
                  1,0,1,
                  1,1,0), ncol = 3, nrow = 3, byrow = TRUE)

#note: the penalty parameters will need to be tuned
wrap.result.sgcca = wrapper.sgcca(data = data, design = design, penalty = c(.3,.5, 1), 
                                  ncomp = c(2, 2, 1),
                                  scheme = "centroid", verbose = FALSE)
wrap.result.sgcca
#did the algo converge?
wrap.result.sgcca$crit  # yes
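
Because the penalty controls how many variables are retained, a natural follow-up is to count the nonzero loadings per block and component. The sketch below does this on a mock loadings list with made-up values. It assumes the loadings component is a list with one matrix per block (variables in rows, components in columns); that layout is an assumption to verify on a real fit, for example with str(wrap.result.sgcca$loadings).

```r
# Mock loadings standing in for wrap.result.sgcca$loadings (values are made up).
# Assumed layout: one matrix per block, variables in rows, components in columns.
loadings <- list(
  gene  = matrix(c(0.8, 0, 0, -0.6,
                   0, 0.7, 0.7, 0), ncol = 2),
  lipid = matrix(c(0, 1, 0,
                   0.6, 0, 0.8), ncol = 2)
)

# Number of variables with a nonzero weight, per block and per component
n_selected <- lapply(loadings, function(a) colSums(as.matrix(a) != 0))
n_selected
```

lapply is used rather than sapply so the result stays a list even when blocks have different numbers of components.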
