dprep (version 3.0.2)

srbct: Khan et al.'s small round blue cell tumors dataset

Description

The srbct dataset contains information on 63 samples and 2308 genes. The samples are distributed in four classes as follows: 8 Burkitt Lymphoma (BL), 23 Ewing Sarcoma (EWS), 12 neuroblastoma (NB), and 20 rhabdomyosarcoma (RMS). The last column contains the class labels.

Usage

data(srbct)

Format

A data frame containing 63 observations with 2308 attributes each. The last column of the data frame contains the class labels for each observation.
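Because the class label sits in the last column, predictors and labels can be separated by position. The following is a minimal sketch on a synthetic stand-in, not the real srbct data: the data frame `toy`, the number of feature columns, and the integer coding of the four classes are all assumptions made for illustration. Only the 63 rows and the class sizes are taken from the description above.

```r
# Synthetic stand-in with the layout described above (NOT the real srbct data):
# 63 rows, a few numeric "gene" columns, and the class label in the last column.
set.seed(1)
n_genes <- 5                                   # hypothetical; the real data has far more columns
class_sizes <- c(BL = 8, EWS = 23, NB = 12, RMS = 20)
toy <- as.data.frame(matrix(rnorm(63 * n_genes), nrow = 63))
toy$class <- rep(seq_along(class_sizes), times = class_sizes)  # integer coding is an assumption

# Split predictors from labels by position, since the label is the last column.
label_col <- ncol(toy)
x <- toy[, -label_col]   # feature columns only
y <- toy[[label_col]]    # class labels as a plain vector

dim(x)     # number of samples and number of features
table(y)   # number of samples per class
```

The same two lines that build `x` and `y` should apply to any data frame that follows this convention, whatever the number of feature columns.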

Source

The data set was obtained as a binary R file from Marcel Dettling's web site:

References

Javed Khan, Jun S. Wei, Markus Ringner, Lao H. Saal, Marc Ladanyi, Frank Westermann, Frank Berthold, Manfred Schwab, Cristina R. Antonescu, Carsten Peterson, and Paul S. Meltzer (2001). Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks. Nature Medicine, Volume 7, Number 6, June.

Examples

library(dprep)
#---z-score normalization
data(srbct)
srbct.rnorm <- rangenorm(srbct, "znorm")
#---feature selection using the RELIEF feature selection algorithm-----
#relief(srbct,63,0.12)
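
For readers unfamiliar with z-score normalization, the idea can be sketched in base R without dprep. This is an illustration of the general technique only; it is not taken from the `rangenorm` source, and whether `rangenorm` handles the class column in the same way is an assumption. The data frame `toy` and its column names are hypothetical.

```r
# Small synthetic data frame: two numeric features plus a class label in the last column.
set.seed(42)
toy <- data.frame(g1 = rnorm(20, mean = 5, sd = 2),
                  g2 = rnorm(20, mean = -1, sd = 10),
                  class = rep(1:2, each = 10))

# Standardize every column except the last: subtract the column mean
# and divide by the column standard deviation.
feature_cols <- seq_len(ncol(toy) - 1)
toy_z <- toy
toy_z[feature_cols] <- lapply(toy[feature_cols],
                              function(v) (v - mean(v)) / sd(v))

colMeans(toy_z[feature_cols])      # approximately 0 for each feature
sapply(toy_z[feature_cols], sd)    # approximately 1 for each feature
```

Leaving the label column untouched keeps the class codes usable for any later supervised step.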