RaceID (version 0.3.9)

CCcorrect: Dimensional Reduction by PCA or ICA

Description

This function performs dimensional reduction by PCA or ICA and removes components enriched for particular gene sets, e.g. cell cycle related genes or genes associated with technical batch effects.

Usage

CCcorrect(
  object,
  vset = NULL,
  CGenes = NULL,
  ccor = 0.4,
  pvalue = 0.01,
  quant = 0.01,
  nComp = NULL,
  dimR = FALSE,
  mode = "pca",
  logscale = FALSE,
  FSelect = TRUE
)

Value

The function returns an updated SCseq object with the principal or independent component matrix written to the slot dimRed$x of the SCseq object. Additional information on the PCA or ICA is stored in slot dimRed.

Arguments

object

SCseq class object.

vset

List of vectors with gene sets. The loadings of each component are tested for enrichment in any of these gene sets. If the lower quant or upper 1 - quant fraction of genes, ordered by loading, is enriched at a p-value < pvalue, the component is discarded. Default is NULL.

CGenes

Vector of gene names. If this argument is given, the gene sets to be tested for enrichment in PCA or ICA components are defined, for each gene in CGenes, as all genes with a Pearson's correlation > ccor to that gene. The loadings of each component are tested for enrichment in any of these gene sets. If the lower quant or upper 1 - quant fraction of genes, ordered by loading, is enriched at a p-value < pvalue, the component is discarded. Default is NULL.

ccor

Positive number between 0 and 1. Correlation threshold used to determine correlating gene sets for all genes in CGenes. Default is 0.4.

pvalue

Positive number between 0 and 1. P-value cutoff for determining enriched components. See vset or CGenes. Default is 0.01.

quant

Positive number between 0 and 1. Upper and lower fraction of gene loadings used for determining enriched components. See vset or CGenes. Default is 0.01.

nComp

Number of PCA or ICA components to use. Default is NULL, in which case the maximal number of components is computed.

dimR

logical. If TRUE, then the number of principal components to use for downstream analysis is derived from a saturation criterion. See function plotdimsat. Default is FALSE and all nComp components are used.

mode

"pca" or "ica" to perform either principal component analysis or independent component analysis. Default is "pca".

logscale

logical. If TRUE, data are log-transformed prior to PCA or ICA. Default is FALSE.

FSelect

logical. If TRUE, then PCA or ICA is performed on the filtered expression matrix using only the features stored in slot cluster$features as computed in the function filterdata. See FSelect for function filterdata. Default is TRUE.
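
The interplay of CGenes, ccor, quant and pvalue described above can be illustrated with a small base-R sketch on simulated data. It is not RaceID's implementation: the simulated matrix, the helper tail_enrichment_p, and the choice of a hypergeometric test are assumptions made for illustration, since this page does not state which enrichment test the package uses.

```r
set.seed(1)

# Simulated expression matrix (assumption): 1000 genes (rows) x 60 cells (columns).
n_genes <- 1000
n_cells <- 60
expr <- matrix(rnorm(n_genes * n_cells), nrow = n_genes,
               dimnames = list(paste0("gene", seq_len(n_genes)),
                               paste0("cell", seq_len(n_cells))))

# Add a shared nuisance signal (e.g. a cell cycle-like effect) to the first 50 genes.
signal <- rnorm(n_cells)
expr[1:50, ] <- expr[1:50, ] +
  3 * matrix(signal, nrow = 50, ncol = n_cells, byrow = TRUE)

# Step 1: build a gene set from seed genes, analogous to CGenes and ccor.
seed_genes <- "gene1"
ccor <- 0.4
cors <- cor(t(expr), t(expr[seed_genes, , drop = FALSE]))  # genes x seed genes
gene_set <- rownames(cors)[apply(cors, 1, max) > ccor]

# Step 2: PCA with cells as observations; rotation holds the gene loadings.
n_comp <- 5
pca <- prcomp(t(expr), center = TRUE, scale. = FALSE)
loadings <- pca$rotation[, seq_len(n_comp), drop = FALSE]

# Step 3: hypothetical helper. Tests whether the gene set is over-represented
# among the genes with the lowest and the highest loadings of one component.
# The hypergeometric test is an assumption of this sketch.
tail_enrichment_p <- function(loading, gene_set, quant) {
  genes <- names(loading)
  n_tail <- max(1, floor(quant * length(genes)))
  ordered <- genes[order(loading)]
  tails <- list(lower = head(ordered, n_tail), upper = tail(ordered, n_tail))
  m <- sum(genes %in% gene_set)  # gene set members in the universe
  vapply(tails, function(tg) {
    k <- sum(tg %in% gene_set)   # gene set members in this tail
    phyper(k - 1, m, length(genes) - m, n_tail, lower.tail = FALSE)
  }, numeric(1))
}

# Step 4: flag components whose lower or upper tail is enriched, then drop them.
quant <- 0.01
pvalue <- 0.01
p_min <- vapply(seq_len(n_comp), function(j) {
  min(tail_enrichment_p(loadings[, j], gene_set, quant))
}, numeric(1))
flagged <- which(p_min < pvalue)
keep <- setdiff(seq_len(n_comp), flagged)
reduced <- pca$x[, keep, drop = FALSE]  # cells x retained components
```

In this simulation the nuisance signal dominates the first component, so that component is flagged and excluded from reduced. With real data, CCcorrect performs the corresponding steps internally and stores the result in slot dimRed.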

Examples

sc <- SCseq(intestinalDataSmall)
sc <- filterdata(sc)
sc <- CCcorrect(sc, dimR = TRUE, nComp = 3)