EDASeq (version 2.6.2)

withinLaneNormalization-methods: Methods for Function withinLaneNormalization in Package EDASeq

Description

Within-lane normalization for GC-content (or other lane-specific) bias.

Usage

withinLaneNormalization(x, y, which=c("loess","median","upper","full"), offset=FALSE, num.bins=10, round=TRUE)

Arguments

x
A numeric matrix representing the counts or a SeqExpressionSet object.
y
A numeric vector representing the covariate to normalize for (if x is a matrix) or a character vector with the name of the covariate (if x is a SeqExpressionSet object). Usually it is the GC-content.
which
Method used to normalize. See the Details section and the reference below.
offset
Should the normalized value be returned as an offset leaving the original counts unchanged?
num.bins
The number of bins used to stratify the covariate for the median, upper and full methods. Ignored if which="loess". See the reference for a discussion of the number of bins.
round
If TRUE the normalization returns rounded values (pseudo-counts). Ignored if offset=TRUE.

Methods

signature(x = "matrix", y = "numeric")
It returns a matrix with the normalized counts if offset=FALSE or with the offset if offset=TRUE.
signature(x = "SeqExpressionSet", y = "character")
It returns a SeqExpressionSet with the normalized counts in the normalizedCounts slot and with the offset in the offset slot (if offset=TRUE).

Details

This method implements four normalizations described in Risso et al. (2011).

The loess normalization transforms the data by regressing the counts on y and subtracting the loess fit from the counts to remove the dependence.
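
As a rough illustration of the loess idea described above, the following base R sketch removes a covariate trend from simulated counts for a single lane. It does not call EDASeq and is not the package's implementation: the simulated data, the helper name loess_sketch, the log scale with a pseudo-count, and the re-centering on the median are all assumptions made for the example.

```r
# Simulated data for one lane (illustrative only, not from yeastRNASeq)
set.seed(1)
n <- 500
gc <- runif(n, 0.3, 0.7)                               # simulated GC-content
counts <- rpois(n, lambda = exp(4 + 3 * (gc - 0.5)))   # GC-dependent mean

# Hypothetical helper: regress log-counts on the covariate with loess and
# subtract the fitted trend. The pseudo-count and the median re-centering
# are assumptions, not EDASeq behaviour.
loess_sketch <- function(counts, covariate, pseudo = 1) {
  log_counts <- log(counts + pseudo)
  fit <- loess(log_counts ~ covariate)
  # Remove the trend, then add back the median so the overall level is kept
  adjusted <- log_counts - fitted(fit) + median(log_counts)
  # Back-transform and round to non-negative pseudo-counts
  pmax(round(exp(adjusted) - pseudo), 0)
}

normalized <- loess_sketch(counts, gc)

# Correlation with the covariate before and after the adjustment
cor_before <- cor(log(counts + 1), gc)
cor_after  <- cor(log(normalized + 1), gc)
```

Comparing cor_before and cor_after gives a quick check that the dependence on the covariate has been reduced in this simulated setting.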

The median, upper and full normalizations stratify the genes by y. Once the genes are divided into num.bins strata, the methods work as follows.

median:
scales the data to have the same median in each bin.

upper:
scales the data to have the same upper quartile in each bin.

full:
forces the distribution of each stratum to be the same using a nonlinear full-quantile normalization, in the spirit of the one used for microarrays.
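
The three stratified methods above can be sketched in base R on simulated counts for one lane. This is only an illustration under stated assumptions and not the EDASeq code: the equal-size bins built from covariate quantiles, the choice of the overall statistic as the scaling target, the 101-point reference distribution, and the helper names scale_by_bin, upper_quartile and full_quantile_sketch are all hypothetical.

```r
# Simulated data for one lane (illustrative only)
set.seed(2)
n <- 600
gc <- runif(n, 0.3, 0.7)
counts <- rpois(n, lambda = exp(4 + 3 * (gc - 0.5)))
num_bins <- 10

# Stratify genes into equal-size bins of the covariate (an assumption)
breaks <- quantile(gc, probs = seq(0, 1, length.out = num_bins + 1),
                   names = FALSE)
bins <- cut(gc, breaks = breaks, include.lowest = TRUE)

# Hypothetical helper for "median" and "upper": rescale each bin so that
# its summary statistic matches the statistic of the whole lane
scale_by_bin <- function(counts, bins, stat = median) {
  bin_stat <- tapply(counts, bins, stat)
  target <- stat(counts)
  counts * target / as.numeric(bin_stat[as.integer(bins)])
}
upper_quartile <- function(v) quantile(v, probs = 0.75, names = FALSE)

median_scaled <- scale_by_bin(counts, bins, median)
upper_scaled  <- scale_by_bin(counts, bins, upper_quartile)

# Hypothetical helper for "full": map each bin onto a common reference
# distribution (the average of the per-bin quantile functions)
full_quantile_sketch <- function(counts, bins) {
  probs <- seq(0, 1, length.out = 101)
  per_bin_q <- sapply(split(counts, bins), quantile,
                      probs = probs, names = FALSE)
  reference <- rowMeans(per_bin_q)
  out <- numeric(length(counts))
  for (b in levels(bins)) {
    idx <- which(bins == b)
    # Within-bin rank converted to a probability in (0, 1)
    p <- (rank(counts[idx], ties.method = "average") - 0.5) / length(idx)
    out[idx] <- approx(probs, reference, xout = p)$y
  }
  out
}

full_scaled <- full_quantile_sketch(counts, bins)
```

The first two variants only rescale each stratum, so the shape of its distribution is unchanged. The full-quantile variant also changes the shape, which is why it is the nonlinear one of the three.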

References

D. Risso, K. Schwartz, G. Sherlock and S. Dudoit (2011). GC-Content Normalization for RNA-Seq Data. Manuscript in Preparation.

Examples

library(EDASeq)
library(yeastRNASeq)
data(geneLevelData)
data(yeastGC)

# Keep only the genes that have both counts and a GC-content value
sub <- intersect(rownames(geneLevelData), names(yeastGC))

mat <- as.matrix(geneLevelData[sub, ])

# Build a SeqExpressionSet with the sample conditions and per-gene GC-content
data <- newSeqExpressionSet(mat,
                            phenoData=AnnotatedDataFrame(
                                      data.frame(conditions=factor(c("mut", "mut", "wt", "wt")),
                                                 row.names=colnames(geneLevelData))),
                            featureData=AnnotatedDataFrame(data.frame(gc=yeastGC[sub])))

# Full-quantile within-lane normalization for GC-content
norm <- withinLaneNormalization(data, "gc", which="full", offset=FALSE)