summariseExprsAcrossFeatures: Summarise expression values across features

Description

Create a new SCESet with counts summarised at a different feature level. A typical use would be to summarise transcript-level counts at gene level.

Usage

summariseExprsAcrossFeatures(object, exprs_values = "tpm", summarise_by = "feature_id")

Arguments

object

an SCESet object.

exprs_values

character string indicating which slot of the assayData from the SCESet object should be used as expression values. Valid options are 'exprs' (the expression slot), 'tpm' (the transcripts-per-million slot) or 'fpkm' (the FPKM slot). Default is 'tpm'.

summarise_by

character string giving the column of fData(object) that will be used as the features for which summarised expression levels are to be produced. Default is 'feature_id'.

Value

an SCESet object

Details

Only transcripts-per-million (TPM) and fragments per kilobase of exon per million reads mapped (FPKM) expression values should be aggregated across features. Because counts are not scaled by the length of the feature, expression values in count units are not comparable between features within a sample unless they are adjusted for feature length. Thus, we cannot sum counts over a set of features to get the expression of that set (for example, we cannot sum counts over transcripts to get accurate expression estimates for a gene). For a discussion of RNA-seq expression units, see this post by Harold Pimentel: https://haroldpimentel.wordpress.com/2014/05/08/what-the-fpkm-a-review-rna-seq-expression-units/.
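
The point above can be illustrated with a small base R sketch. It is not the function's internal implementation; the toy matrix, the transcript lengths and the names counts, tx_length and gene_of_tx are all made up for illustration. It converts counts to TPM and then sums TPM within each gene using rowsum().

```r
# Toy data: 4 transcripts x 2 cells (all values are hypothetical)
counts <- matrix(c(100, 300, 50, 50,
                   200, 100, 80, 20),
                 nrow = 4,
                 dimnames = list(paste0("tx", 1:4), c("cell1", "cell2")))
tx_length  <- c(1000, 3000, 500, 2000)   # assumed transcript lengths in bases
gene_of_tx <- c("geneA", "geneA", "geneB", "geneB")

# Length-normalise: row i is divided by tx_length[i] (column-wise recycling)
rate <- counts / tx_length

# Scale each cell (column) so that its values sum to one million
tpm <- t(t(rate) / colSums(rate)) * 1e6

# Sum transcript-level TPM within each gene
gene_tpm <- rowsum(tpm, group = gene_of_tx)
gene_tpm
```

In cell1, tx1 and tx2 have different counts (100 and 300) but the same length-normalised rate, so they receive the same TPM. Because TPM values are already scaled by feature length, summing them within a gene keeps each cell's total at one million.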

Examples

# Load the example data and build an SCESet
data("sc_example_counts")
data("sc_example_cell_info")
pd <- new("AnnotatedDataFrame", data = sc_example_cell_info)
example_sceset <- newSCESet(countData = sc_example_counts, phenoData = pd)

# Attach feature metadata: each run of four consecutive features
# shares one feature_id (500 groups in total)
fd <- new("AnnotatedDataFrame", data =
    data.frame(gene_id = featureNames(example_sceset),
               feature_id = paste("feature", rep(1:500, each = 4), sep = "_")))
fData(example_sceset) <- fd

# Summarise the count values by feature_id (the default summarise_by)
example_sceset_summarised <-
    summariseExprsAcrossFeatures(example_sceset, exprs_values = "counts")

# Summarise the 'exprs' values by feature_id
example_sceset_summarised <-
    summariseExprsAcrossFeatures(example_sceset, exprs_values = "exprs")
