This is a simulated RNA-Seq data set using a negative binomial model with 10000 genes and 8 experimental units, under a balanced two-treatment comparison design.
mockRNASeqData
This is a list with the following components:
This is a numeric data matrix with 10000 rows and 8 columns, containing counts for each gene (row) and each experimental unit (column).
This is a factor with 2 levels, indicating the treatment group of each column of counts.
This is an example of a design matrix corresponding to treatment.
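As an illustration of how a design matrix can correspond to a two-level treatment factor, here is a minimal sketch in base R. The labels "A" and "B" and the ordering of the 8 units are assumptions made for the example; the arrangement actually stored in the data set may differ.

```r
# Hypothetical treatment labels: 8 experimental units, 4 per group (balanced).
# The label names and the column ordering are assumptions for this sketch.
treatment <- factor(rep(c("A", "B"), each = 4))

# Intercept column plus an indicator column for the second treatment level.
design <- model.matrix(~ treatment)

dim(design)        # number of units by number of model coefficients
colnames(design)   # names of the model coefficients
```

With this coding, the second coefficient describes the difference between the two treatments on the scale of the linear predictor.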
This is a numeric vector of normalizing factors actually used to simulate the data matrix.
This is a numeric vector of normalizing factors estimated from the data matrix, using the so-called "TMM" method.
This is a numeric vector of negative binomial over-dispersion parameters actually used to simulate the data, parameterized such that true.nbdisp = 1/size, where size is the parameter used in rnbinom.
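The parameterization above can be checked numerically. The following sketch uses hypothetical values for the mean and the over-dispersion (neither is taken from the data set) and relies on the mean/size form of rnbinom, under which the variance is mu + mu^2/size, that is, mu + disp * mu^2 when disp = 1/size.

```r
set.seed(1)     # arbitrary seed so the sketch is reproducible
mu   <- 50      # hypothetical mean count
disp <- 0.25    # hypothetical over-dispersion, on the 1/size scale

# Draw counts using size = 1/disp, matching true.nbdisp = 1/size.
y <- rnbinom(1e5, size = 1 / disp, mu = mu)

# Variance implied by the parameterization: mu + disp * mu^2.
theoretical_var <- mu + disp * mu^2

c(mean = mean(y), var = var(y), theoretical_var = theoretical_var)
```

The sample variance should land close to the theoretical value, and a larger over-dispersion value means more extra-Poisson variation.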
This is a numeric vector of estimated negative binomial over-dispersion parameters, using the "TrendedDisp" method from the edgeR package.
Integer scalar 10000, the number of rows of counts.
Integer scalar 8, the number of columns of counts.
An integer vector of length 3500, indicating the correct row indices of differentially expressed genes, i.e., rows whose means differ across the two treatments.
A numeric vector of length 3500, indicating the true ratio of means for each differentially expressed gene.
This is an R expression that was used to simulate the mockRNASeqData data set itself. eval(mockRNASeqData$simulation.expression) should generate an identical data set, except for the simulation.expression component itself.
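The mechanism of regenerating a data set by evaluating a stored expression can be sketched with a toy example. The expression below is a hypothetical stand-in, not the expression stored in the data set. It assumes that reproducibility comes from fixing the random seed inside the expression, which may or may not be how the original was written.

```r
# Toy stand-in for a stored simulation expression (an assumption, not the
# expression shipped with the data set).
sim_expr <- quote({
  set.seed(42)   # fixed seed inside the expression
  counts <- matrix(rnbinom(20 * 8, size = 4, mu = 30), nrow = 20, ncol = 8)
  list(counts = counts, ngenes = nrow(counts))
})

# Evaluate twice, each time in a fresh environment so that the intermediate
# variable `counts` does not leak into the workspace.
first  <- eval(sim_expr, envir = new.env())
second <- eval(sim_expr, envir = new.env())

# Both evaluations should produce the same simulated list.
identical(first, second)
```

Note that evaluating such an expression resets the state of the random number generator, which affects any random draws made afterwards in the same session.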