grpregOverlap (version 2.2-0)

pathway.dat: Gene expression and pathway information of p53 cancer cell lines

Description

The data set contains gene expression data and pathway (group) information for p53 cancer cell lines. The mutational status of the p53 gene is recorded for 50 cell lines, of which 17 are classified as normal and 33 as carrying mutations. Pathway information for the genes is taken from the C2 catalog of the Initial Catalog of Human Gene Sets, or MSigDB 1.0 (Subramanian et al., 2005).

Usage

data(pathway.dat)

Format

The raw gene expression and pathway files are available via the link in the Source section below. The raw data were preprocessed so that only the 308 pathways with a size between 15 and 500 are retained, and only the 4301 genes belonging to those pathways are kept. pathway.dat is a list with three components:
  • expression: a 50-by-4301 matrix of gene expression data. Used as the design matrix.
  • mutation: a 1-by-50 binary vector recording the mutational status (1 = normal; 0 = mutation). Used as the response vector.
  • pathways: a list of 308 vectors, each containing the names of the genes in that pathway. Used as the group information.
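
Because a gene can appear in more than one pathway, the groups may overlap. The sketch below shows one way to inspect group sizes and overlap in base R. It uses a small hypothetical stand-in object, toy, that mimics the three-component structure described above. The gene names, pathway names, and values are invented for illustration, and the sketch assumes the columns of the expression matrix are named by gene.

```r
# Hypothetical toy object mirroring the structure of pathway.dat
# (the real object has 50 samples, 4301 genes, and 308 pathways).
set.seed(1)
genes <- paste0("gene", 1:6)
toy <- list(
  expression = matrix(rnorm(4 * 6), nrow = 4, dimnames = list(NULL, genes)),
  mutation   = c(1, 0, 0, 1),
  pathways   = list(pwA = c("gene1", "gene2", "gene3"),
                    pwB = c("gene3", "gene4"),
                    pwC = c("gene4", "gene5", "gene6"))
)

# Number of genes in each pathway (group sizes)
pathway_sizes <- lengths(toy$pathways)

# Number of pathways each gene belongs to; counts above 1 indicate overlap
membership <- table(unlist(toy$pathways))
overlapping_genes <- names(membership)[membership > 1]

# Check that every pathway gene is a named column of the design matrix
all_covered <- all(unlist(toy$pathways) %in% colnames(toy$expression))

print(pathway_sizes)
print(overlapping_genes)
print(all_covered)
```

The same three steps should carry over to the real data by replacing toy with pathway.dat after data(pathway.dat), provided the column-name assumption holds.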

Source

The raw data files can be downloaded via http://www.broadinstitute.org/gsea/datasets.jsp.

References

  • Subramanian, A., et al. (2005). Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proceedings of the National Academy of Sciences of the United States of America, 102(43), 15545-15550. http://www.pnas.org/content/102/43/15545.short

Examples

data(pathway.dat)
pathway.dat$expression[1:10, 1:10]
pathway.dat$mutation
head(pathway.dat$pathways)