TCC (version 1.12.1)

hypoData: A simulation dataset for comparing two-group tag count data, focusing on RNA-seq

Description

A simulation dataset, consisting of 1,000 rows (or genes) and 6 columns (or independent biological samples).

Usage

data(hypoData)

Format

hypoData is a matrix with 1,000 rows and 6 columns.

Details

This package typically starts a differential expression analysis with a count matrix such as hypoData, where each row corresponds to a gene (or transcript), each column corresponds to a sample (or library), and each cell holds the number of counts assigned to that gene in that sample. The first three columns are biological replicates of, for example, Group 1 and the remaining three columns are from Group 2; i.e., G1_rep1, G1_rep2, G1_rep3 vs. G2_rep1, G2_rep2, G2_rep3. The data were generated by the simulateReadCounts function with default parameter settings. The first 200 genes are differentially expressed between the two groups. Of these, the first 180 genes are expressed at a higher level in Group 1 (G1) and the remaining 20 genes are expressed at a higher level in Group 2 (G2). Accordingly, genes 201 to 1000 are not differentially expressed (non-DEGs). The level of differential expression (DE) is four-fold in both groups.
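The layout described above can be written down in a few lines of base R. This is only an illustrative sketch: it does not load TCC or the data itself, and the variable names (n_genes, group, sample_names, truth) and the 0/1/2 coding of the truth vector are assumptions chosen for this example, not objects provided by the package. The parameter values are the ones shown in the Examples section.

```r
# Sketch of the hypoData layout using base R only (no TCC required).
# All variable names here are hypothetical, chosen for illustration.
n_genes <- 1000            # Ngene
pdeg    <- 0.2             # PDEG: proportion of DEGs
assign  <- c(0.9, 0.1)     # DEG.assign: share of DEGs higher in G1, G2

n_deg   <- round(n_genes * pdeg)        # number of DEGs
n_up_g1 <- round(n_deg * assign[1])     # DEGs higher in G1
n_up_g2 <- n_deg - n_up_g1              # DEGs higher in G2

# Group label for each of the 6 columns and the matching sample names.
group        <- c(1, 1, 1, 2, 2, 2)
sample_names <- paste0("G", group, "_rep", rep(1:3, times = 2))

# Assumed coding: 1 = higher in G1, 2 = higher in G2, 0 = non-DEG.
truth <- c(rep(1, n_up_g1), rep(2, n_up_g2), rep(0, n_genes - n_deg))

table(truth)
```

Assuming TCC is installed, a vector like group is what one would typically pair with the columns of hypoData after data(hypoData), and truth could serve as the reference labeling when checking which genes a method reports as differentially expressed.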

Examples

# 'hypoData' was generated with the following commands.
# The simulation is random, so rerunning them yields different counts.
library(TCC)
tcc <- simulateReadCounts(Ngene = 1000, PDEG = 0.2,
                          DEG.assign = c(0.9, 0.1),
                          DEG.foldchange = c(4, 4),
                          replicates = c(3, 3))
hypoData <- tcc$count