scater (version 1.0.4)

findImportantPCs: Find most important principal components for a given variable

Description

Find most important principal components for a given variable

Usage

findImportantPCs(object, variable = "total_features", plot_type = "pcs-vs-vars", exprs_values = "exprs", ntop = 500, feature_set = NULL, scale_features = TRUE, theme_size = 10)

Arguments

object
an SCESet object containing expression values and experimental information. Must have been appropriately prepared.
variable
character scalar providing a variable name (column from pData(object)) for which to determine the most important PCs.
plot_type
character string indicating which type of plot to produce. A value of "pairs-pcs" produces a pairs plot for the top 5 PCs based on their R-squared with the variable of interest. A value of "pcs-vs-vars" produces plots of the top PCs against the variable of interest. The usage signature above sets "pcs-vs-vars" as the default.
exprs_values
which slot of the assayData in the object should be used to define expression? Valid options are "counts", "tpm", "fpkm" and "exprs", or anything else in the object added manually by the user. The usage signature above sets "exprs" as the default.
ntop
numeric scalar indicating the number of most variable features to use for the PCA. Default is 500, but any ntop argument is overridden if the feature_set argument is non-NULL.
feature_set
character, numeric or logical vector indicating a set of features to use for the PCA. If character, entries must all be in featureNames(object). If numeric, values are taken to be indices for features. If logical, vector is used to index features and should have length equal to nrow(object).
scale_features
logical, should the expression values be standardised so that each feature has unit variance? Default is TRUE.
theme_size
numeric scalar providing base font size for ggplot theme.
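
The three accepted forms of feature_set (character, numeric, logical) correspond to the usual ways of indexing rows in R. The following is a minimal base R sketch on a plain matrix, not on an SCESet; the matrix expr and the gene and cell names are made up for illustration.

```r
# Hypothetical 5-feature x 4-cell expression matrix (illustrative names only)
set.seed(42)
expr <- matrix(rnorm(20), nrow = 5,
               dimnames = list(paste0("gene", 1:5), paste0("cell", 1:4)))

# Character: entries must all be row (feature) names
by_name <- expr[c("gene1", "gene3"), , drop = FALSE]

# Numeric: values are taken as row indices
by_index <- expr[c(1, 3), , drop = FALSE]

# Logical: one entry per row, so the length must equal nrow(expr)
keep <- c(TRUE, FALSE, TRUE, FALSE, FALSE)
by_logical <- expr[keep, , drop = FALSE]

# All three select the same two features
same_selection <- identical(by_name, by_index) && identical(by_index, by_logical)
```

Here all three vectors pick out the same features, so same_selection is TRUE.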

Value

a ggplot plot object

Details

Plot the top 5 or 6 most important PCs (depending on the plot_type argument) for a given variable. Importance here is defined as the R-squared value from a linear model regressing each PC onto the variable of interest.
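
The importance measure described above can be sketched with base R alone. This is an illustration of the idea on simulated data, not scater's actual implementation; the names expr, covariate, n_pcs and the simulated effect size are assumptions made for the example.

```r
set.seed(1)
n_cells <- 40
n_features <- 200

# Hypothetical cell-level variable (stands in for a column of pData(object))
covariate <- rnorm(n_cells)

# Simulated features x cells matrix; the first 50 features depend on the covariate
expr <- matrix(rnorm(n_features * n_cells), nrow = n_features)
expr[1:50, ] <- expr[1:50, ] + outer(rep(2, 50), covariate)

# Keep the ntop most variable features, mirroring the ntop argument
ntop <- 100
feature_vars <- apply(expr, 1, var)
top <- order(feature_vars, decreasing = TRUE)[seq_len(ntop)]

# PCA on cells (rows) with features scaled to unit variance,
# mirroring scale_features = TRUE
pca <- prcomp(t(expr[top, ]), scale. = TRUE)

# R-squared from a linear model of each PC on the variable of interest
n_pcs <- 10
r_squared <- vapply(seq_len(n_pcs), function(i) {
  summary(lm(pca$x[, i] ~ covariate))$r.squared
}, numeric(1))
names(r_squared) <- colnames(pca$x)[seq_len(n_pcs)]

# The PCs with the highest R-squared are the "most important" ones
top_pcs <- head(sort(r_squared, decreasing = TRUE), 5)
print(top_pcs)
```

Only the first 10 PCs are screened here to keep the sketch short; how many PCs the package itself considers is not stated on this page.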

Examples

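# Load the example count matrix and cell annotation data
data("sc_example_counts")
data("sc_example_cell_info")
# Wrap the cell annotation as phenoData, with cells as row names
pd <- new("AnnotatedDataFrame", data = sc_example_cell_info)
rownames(pd) <- pd$Cell
example_sceset <- newSCESet(countData = sc_example_counts, phenoData = pd)
# Drop features with zero variance across cells (they cannot be scaled to unit variance)
drop_genes <- apply(exprs(example_sceset), 1, function(x) {var(x) == 0})
example_sceset <- example_sceset[!drop_genes, ]
# Compute QC metrics before using total_features as the variable of interest
example_sceset <- calculateQCMetrics(example_sceset)
findImportantPCs(example_sceset, variable="total_features")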