data(khan)
data.frame of 306 rows and 64 columns. The training dataset of 64 arrays and 306 gene expression values.
data.frame of 306 rows and 25 columns. The test dataset of 25 arrays and 306 gene expression values.
vector of 306 IMAGE clone identifiers corresponding to the rownames of \$train and \$test.
factor with 4 levels "EWS", "BL-NHL", "NB" and "RMS", which correspond to the four groups in the \$train dataset.
factor with 5 levels "EWS", "BL-NHL", "NB", "RMS" and "Norm", which correspond to the five groups in the \$test dataset.
data.frame of 306 rows and 8 columns. This table contains further gene annotation retrieved from SOURCE (http://SOURCE.stanford.edu) in May 2004. For each of the 306 genes, it contains:
khan is derived from a filtered dataset of 2308 gene expression profiles,
as published and provided by Khan et al. (2001) on the supplementary
web site to their publication
http://research.nhgri.nih.gov/microarray/Supplement/. Khan et al. (2001) used cDNA microarrays containing 6567 clones, of which 3789 were known genes and 2778 were ESTs, to study gene expression in four types of small round blue cell tumours of childhood (SRBCT). These were neuroblastoma (NB), rhabdomyosarcoma (RMS), Burkitt lymphoma, a subset of non-Hodgkin lymphoma (BL), and the Ewing family of tumours (EWS). Gene expression profiles from both tumour biopsy and cell line samples were obtained and are contained in this dataset. The dataset downloaded from the website contained the filtered dataset of 2308 gene expression profiles as described by Khan et al. (2001). This dataset is available from http://bioinf.ucd.ie/people/aedin/R/.
In order to reduce the size of the MADE4 package and to produce small example datasets, the top 50 genes from the
ends of 3 axes following bga
were selected. This produced a reduced dataset of 306 genes.
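The structure described above can be checked after loading the data. The following is a minimal sketch, not part of the original documentation: summarise_components is a hypothetical helper introduced here for illustration, and the code assumes that the dataset ships with an installed made4 package and that its components are accessible as khan\$train and khan\$test, as the text suggests.

```r
# Hypothetical helper (not part of made4): report the class and the
# dimensions (or length) of each component of a list such as khan.
summarise_components <- function(x) {
  vapply(x, function(el) {
    size <- if (is.null(dim(el))) length(el) else paste(dim(el), collapse = " x ")
    sprintf("%s [%s]", class(el)[1], size)
  }, character(1))
}

# Assumption: made4 is installed and provides the khan dataset.
if (requireNamespace("made4", quietly = TRUE)) {
  data(khan, package = "made4")

  # One line per component: class and size
  print(summarise_components(khan))

  # Training and test sets should share the same 306 gene identifiers
  print(dim(khan$train))
  print(dim(khan$test))
  print(identical(rownames(khan$train), rownames(khan$test)))
}
```

The helper only inspects whatever list it is given, so it makes no assumption about the names of the remaining components (class factors, clone identifiers, annotation table).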