This function performs dimensional reduction by PCA or ICA and removes components enriched for particular gene sets, e.g. cell cycle related genes or genes associated with technical batch effects.
CCcorrect(
object,
vset = NULL,
CGenes = NULL,
ccor = 0.4,
pvalue = 0.01,
quant = 0.01,
nComp = NULL,
dimR = FALSE,
mode = "pca",
logscale = FALSE,
FSelect = TRUE
)
The function returns an updated SCseq object with the principal or independent component matrix written to the slot dimRed$x of the SCseq object. Additional information on the PCA or ICA is stored in slot dimRed.
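To illustrate what a component matrix with some components removed looks like, here is a minimal base R sketch. It uses prcomp on a toy matrix and drops one component that is simply assumed to have been flagged; the toy data, the variable names and the use of prcomp are illustrative assumptions and are not taken from the CCcorrect implementation.

```r
set.seed(1)

# Toy data: 200 hypothetical cells (rows) x 30 hypothetical genes (columns)
x <- matrix(rnorm(200 * 30), nrow = 200,
            dimnames = list(paste0("cell", 1:200), paste0("gene", 1:30)))

# Principal component analysis on the centered data
pca <- prcomp(x, center = TRUE, scale. = FALSE)

# Assume component 2 was flagged as enriched for an unwanted gene set
flagged <- 2

# Keep the scores of all remaining components for downstream analysis
scores <- pca$x[, -flagged, drop = FALSE]
dim(scores)
```

The retained score matrix plays the role that dimRed$x plays in the returned SCseq object: one row per cell and one column per component that was kept.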
SCseq class object.
List of vectors with gene sets. The loadings of each component are tested for enrichment in any of these gene sets, and if the lower quant or upper 1 - quant fraction of genes ordered by loading is enriched at a p-value < pvalue, the component is discarded. Default is NULL.
Vector of gene names. If this argument is given, the gene sets to be tested for enrichment in PCA or ICA components are defined by all genes with a Pearson correlation > ccor to a gene in CGenes. The loadings of each component are tested for enrichment in any of these gene sets, and if the lower quant or upper 1 - quant fraction of genes ordered by loading is enriched at a p-value < pvalue, the component is discarded. Default is NULL.
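The enrichment rule described above can be sketched in base R as follows. This is only an illustration under stated assumptions: the loadings and gene names are simulated, and a one-sided hypergeometric test is assumed for the enrichment p-value, which may differ from the test CCcorrect applies internally.

```r
set.seed(1)

# Toy loadings of one component for 1000 hypothetical genes
genes <- paste0("gene", 1:1000)
loadings <- setNames(rnorm(1000), genes)

# Hypothetical gene set, shifted so that it sits in the upper tail
gene_set <- genes[1:20]
loadings[gene_set] <- loadings[gene_set] + 4

quant <- 0.01
pvalue <- 0.01

# Genes in the lower quant and upper 1 - quant fraction, ordered by loading
q <- quantile(loadings, c(quant, 1 - quant))
tails <- list(lower = names(loadings)[loadings < q[1]],
              upper = names(loadings)[loadings > q[2]])

# One-sided hypergeometric test for over-representation of the gene set
enrich_p <- function(tail_genes, gene_set, universe) {
  k <- length(intersect(tail_genes, gene_set))
  phyper(k - 1, length(gene_set), length(universe) - length(gene_set),
         length(tail_genes), lower.tail = FALSE)
}
p <- sapply(tails, enrich_p, gene_set = gene_set, universe = genes)

# The component would be discarded if either tail is enriched
discard <- any(p < pvalue)
```

With several gene sets, the same check would be repeated for each set, and a component would be discarded as soon as one set is enriched in either tail.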
Positive number between 0 and 1. Correlation threshold used to determine correlating gene sets for all genes in CGenes. Default is 0.4.
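As a rough sketch of how gene sets could be derived from CGenes with this threshold, the base R example below builds one set per seed gene from Pearson correlations. The expression matrix and gene names are hypothetical, and details such as whether the seed gene itself is part of its set are assumptions rather than documented behavior.

```r
set.seed(1)

# Toy expression matrix: genes in rows, cells in columns (hypothetical names)
n_cells <- 50
driver <- rnorm(n_cells)
expr <- rbind(
  geneA = driver,
  geneB = driver + rnorm(n_cells, sd = 0.3),  # closely tracks geneA
  geneC = rnorm(n_cells),
  geneD = rnorm(n_cells)
)

CGenes <- "geneA"
ccor <- 0.4

# Pearson correlation of every gene with each gene in CGenes
r <- cor(t(expr), t(expr[CGenes, , drop = FALSE]))

# One gene set per gene in CGenes: all genes with correlation > ccor
vset <- lapply(CGenes, function(g) rownames(expr)[r[, g] > ccor])
names(vset) <- CGenes
vset
```

A list built this way has the same shape as the vset argument: a list of character vectors, one per gene set.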
Positive number between 0 and 1. P-value cutoff for determining enriched components. See vset or CGenes. Default is 0.01.
Positive number between 0 and 1. Upper and lower fraction of gene loadings used for determining enriched components. See vset or CGenes. Default is 0.01.
Number of PCA or ICA components to use. Default is NULL, in which case the maximal number of components is computed.
logical. If TRUE, then the number of principal components to use for downstream analysis is derived from a saturation criterion. See function plotdimsat. Default is FALSE and all nComp components are used.
"pca" or "ica" to perform either principal component analysis or independent component analysis. Default is "pca".
logical. If TRUE, data are log-transformed prior to PCA or ICA. Default is FALSE.
logical. If TRUE, then PCA or ICA is performed on the filtered expression matrix using only the features stored in slot cluster$features as computed in the function filterdata. See argument FSelect of function filterdata. Default is TRUE.
sc <- SCseq(intestinalDataSmall)
sc <- filterdata(sc)
sc <- CCcorrect(sc,dimR=TRUE,nComp=3)