QuasiSeq (version 1.0-11-0)

mockRNASeqData: A Simulated RNA-Seq Data Set

Description

This is a simulated RNA-Seq data set with 10000 genes and 8 experimental units, generated from a negative binomial model under a balanced two-treatment comparison design.

Usage

mockRNASeqData

Format

This is a list with the following components:

counts

This is a numeric data matrix with 10000 rows and 8 columns, containing counts for each gene (row) and each experimental unit (column).

treatment

This is a factor with 2 levels, indicating the treatment group of each column of counts.

design.matrix

This is an example design matrix corresponding to treatment.
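The relationship between a two-level treatment factor and a design matrix can be sketched with base R. This is only an illustration: the group labels "A" and "B" are hypothetical, and the coding used for the stored design.matrix is not stated here, so the default treatment-contrast coding from model.matrix is an assumption.

```r
# Hypothetical balanced layout: 8 experimental units, 4 per treatment group.
treatment <- factor(rep(c("A", "B"), each = 4))

# Intercept plus an indicator column for the second level (assumed coding).
design <- model.matrix(~ treatment)

dim(design)       # one row per experimental unit, one column per coefficient
colnames(design)  # "(Intercept)" and "treatmentB"
```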

true.normalization

This is a numeric vector of normalizing factors actually used to simulate the data matrix.

estimated.normalization

This is a numeric vector of normalizing factors estimated from the data matrix, using the so-called "TMM" method.

true.nbdisp

This is a numeric vector of negative binomial over-dispersion parameters actually used to simulate the data. It uses the parameterization in which true.nbdisp = 1/size, where size is the parameter of rnbinom.
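As a rough illustration of this parameterization, the sketch below uses base R only. The values of mu and nbdisp are arbitrary assumptions chosen for the example, not values taken from the data set. Under this convention the negative binomial variance is mu + nbdisp * mu^2.

```r
set.seed(1)

nbdisp <- 0.25        # assumed over-dispersion for illustration
size   <- 1 / nbdisp  # corresponding rnbinom 'size' parameter
mu     <- 50          # assumed mean count

# Draw counts using the (mu, size) form of rnbinom.
y <- rnbinom(1e5, size = size, mu = mu)

# Variance implied by the parameterization versus the sample variance.
theoretical_var <- mu + nbdisp * mu^2
empirical_var   <- var(y)
c(theoretical = theoretical_var, empirical = empirical_var)
```

A larger nbdisp corresponds to a smaller size and therefore to more variability around the same mean.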

estimated.nbdisp

This is a numeric vector of estimated negative binomial over-dispersion parameters, using the "TrendedDisp" method from the edgeR package.

ngenes

Integer scalar 10000, the number of rows of counts.

nsamples

Integer scalar 8, the number of columns of counts.

true.DEgenes

An integer vector of length 3500, indicating the correct row indices of differentially expressed genes, i.e., rows whose means differ across the two treatments.

true.foldChanges

A numeric vector of length 3500, indicating the true ratio of means for each differentially expressed gene.

simulation.expression

This is an R expression that was used to simulate the mockRNASeqData data set itself. eval(mockRNASeqData$simulation.expression) should generate an identical data set, except for the simulation.expression component itself.
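The idea of storing a simulation as an expression and re-evaluating it can be sketched with a toy stand-in. The expression below is hypothetical and far smaller than the real one, whose contents are not shown here. It assumes that reproducibility comes from setting the random seed inside the expression.

```r
# Toy stand-in for a stored simulation expression (hypothetical contents).
sim_expr <- quote({
  set.seed(42)
  counts <- matrix(rnbinom(5 * 4, size = 2, mu = 10), nrow = 5, ncol = 4)
  list(counts = counts)
})

# Evaluate twice in fresh environments; the seed makes the results match.
first  <- eval(sim_expr, envir = new.env())
second <- eval(sim_expr, envir = new.env())

identical(first, second)
```

With the package loaded, the analogous check would compare the components of eval(mockRNASeqData$simulation.expression) with those of mockRNASeqData, leaving out simulation.expression.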