Golub et al. (1999) data on gene expression profiles of 38 patients suffering from acute leukemia and a validation sample of 34 patients.
The expression data are available in the data frame Golub, with 5327 observations on the following 73 variables:
a character vector with gene identifiers
gene expression data for sample 1
gene expression data for sample 2
gene expression data for sample 3
gene expression data for sample 4
gene expression data for sample 5
gene expression data for sample 6
gene expression data for sample 7
gene expression data for sample 8
gene expression data for sample 9
gene expression data for sample 10
gene expression data for sample 11
gene expression data for sample 12
gene expression data for sample 13
gene expression data for sample 14
gene expression data for sample 15
gene expression data for sample 16
gene expression data for sample 17
gene expression data for sample 18
gene expression data for sample 19
gene expression data for sample 20
gene expression data for sample 21
gene expression data for sample 22
gene expression data for sample 23
gene expression data for sample 24
gene expression data for sample 25
gene expression data for sample 26
gene expression data for sample 27
gene expression data for sample 34
gene expression data for sample 35
gene expression data for sample 36
gene expression data for sample 37
gene expression data for sample 38
gene expression data for sample 28
gene expression data for sample 29
gene expression data for sample 30
gene expression data for sample 31
gene expression data for sample 32
gene expression data for sample 33
gene expression data for sample 39
gene expression data for sample 40
gene expression data for sample 42
gene expression data for sample 47
gene expression data for sample 48
gene expression data for sample 49
gene expression data for sample 41
gene expression data for sample 43
gene expression data for sample 44
gene expression data for sample 45
gene expression data for sample 46
gene expression data for sample 70
gene expression data for sample 71
gene expression data for sample 72
gene expression data for sample 68
gene expression data for sample 69
gene expression data for sample 67
gene expression data for sample 55
gene expression data for sample 56
gene expression data for sample 59
gene expression data for sample 52
gene expression data for sample 53
gene expression data for sample 51
gene expression data for sample 50
gene expression data for sample 54
gene expression data for sample 57
gene expression data for sample 58
gene expression data for sample 60
gene expression data for sample 61
gene expression data for sample 65
gene expression data for sample 66
gene expression data for sample 63
gene expression data for sample 64
gene expression data for sample 62
The classes are in a separate numeric vector Golub.grp, with values 1 for the 38 ALL B-Cell samples, 2 for the 9 ALL T-Cell samples, and 3 for the 25 AML samples.
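As a rough illustration of how the numeric class codes might be turned into readable labels, the sketch below builds a toy stand-in for Golub.grp. The names grp, grp_factor and class_counts are hypothetical, and the ordering of the toy vector is an assumption made only for the example; the real vector follows the column order of the data frame.

```r
# Toy stand-in for Golub.grp (hypothetical): 38 + 9 + 25 = 72 class codes.
# The real vector ships with the data set; its ordering is NOT assumed here.
grp <- rep(c(1, 2, 3), times = c(38, 9, 25))

# Attach readable labels to the numeric codes described above.
grp_factor <- factor(grp, levels = 1:3,
                     labels = c("ALL B-Cell", "ALL T-Cell", "AML"))

# Count the samples in each class.
class_counts <- table(grp_factor)
print(class_counts)
```

With the real data loaded, the same factor() call could be applied to Golub.grp directly, for instance to colour or group samples in a plot.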
The original data of Golub et al. (1999) were preprocessed as follows: genes that were called 'absent' in all samples were removed from the data sets, since these measurements are considered unreliable by the manufacturer of the technology. Negative measurements in the data were set to 1.
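The two preprocessing steps described above can be sketched on toy data. This is only an illustration under stated assumptions, not the code used to produce the data set: the matrices expr and calls, the gene and sample names, and the "A"/"P" encoding of the detection calls are all hypothetical.

```r
# Toy expression values (rows = genes, columns = samples); hypothetical data.
expr <- matrix(c(120, -15, 300,
                  -8,  -3,  12,
                  50,  75,  -1),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("geneA", "geneB", "geneC"),
                               c("s1", "s2", "s3")))

# Toy detection calls: "A" = absent, "P" = present (assumed encoding).
calls <- matrix(c("P", "A", "P",
                  "A", "A", "A",
                  "P", "P", "A"),
                nrow = 3, byrow = TRUE,
                dimnames = dimnames(expr))

# Step 1: drop genes that were called absent in every sample.
keep <- rowSums(calls != "A") > 0
filtered <- expr[keep, , drop = FALSE]

# Step 2: set the remaining negative measurements to 1.
filtered[filtered < 0] <- 1

print(filtered)
```

In this toy example geneB is removed because all of its calls are absent, and the negative values left in geneA and geneC are replaced by 1.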
The resulting data frame contains 5327 genes of the 6817 originally reported by Golub et al. (1999).
Luc Wouters et al. (2003). Graphical Exploration of Gene Expression Data: A Comparative Study of Three Multivariate Methods. Biometrics, 59, 1131-1139.