mpm (version 1.0-23)

Golub: Golub (1999) Data

Description

Golub et al. (1999) data on gene expression profiles of 38 patients suffering from acute leukemia and a validation sample of 34 patients.

Format

The expression data are available in the data frame Golub, with 5327 observations (genes) on the following 73 variables, listed in column order.

Gene: a character vector with gene identifiers
1: gene expression data for sample 1
2: gene expression data for sample 2
3: gene expression data for sample 3
4: gene expression data for sample 4
5: gene expression data for sample 5
6: gene expression data for sample 6
7: gene expression data for sample 7
8: gene expression data for sample 8
9: gene expression data for sample 9
10: gene expression data for sample 10
11: gene expression data for sample 11
12: gene expression data for sample 12
13: gene expression data for sample 13
14: gene expression data for sample 14
15: gene expression data for sample 15
16: gene expression data for sample 16
17: gene expression data for sample 17
18: gene expression data for sample 18
19: gene expression data for sample 19
20: gene expression data for sample 20
21: gene expression data for sample 21
22: gene expression data for sample 22
23: gene expression data for sample 23
24: gene expression data for sample 24
25: gene expression data for sample 25
26: gene expression data for sample 26
27: gene expression data for sample 27
34: gene expression data for sample 34
35: gene expression data for sample 35
36: gene expression data for sample 36
37: gene expression data for sample 37
38: gene expression data for sample 38
28: gene expression data for sample 28
29: gene expression data for sample 29
30: gene expression data for sample 30
31: gene expression data for sample 31
32: gene expression data for sample 32
33: gene expression data for sample 33
39: gene expression data for sample 39
40: gene expression data for sample 40
42: gene expression data for sample 42
47: gene expression data for sample 47
48: gene expression data for sample 48
49: gene expression data for sample 49
41: gene expression data for sample 41
43: gene expression data for sample 43
44: gene expression data for sample 44
45: gene expression data for sample 45
46: gene expression data for sample 46
70: gene expression data for sample 70
71: gene expression data for sample 71
72: gene expression data for sample 72
68: gene expression data for sample 68
69: gene expression data for sample 69
67: gene expression data for sample 67
55: gene expression data for sample 55
56: gene expression data for sample 56
59: gene expression data for sample 59
52: gene expression data for sample 52
53: gene expression data for sample 53
51: gene expression data for sample 51
50: gene expression data for sample 50
54: gene expression data for sample 54
57: gene expression data for sample 57
58: gene expression data for sample 58
60: gene expression data for sample 60
61: gene expression data for sample 61
65: gene expression data for sample 65
66: gene expression data for sample 66
63: gene expression data for sample 63
64: gene expression data for sample 64
62: gene expression data for sample 62

The classes are in a separate numeric vector Golub.grp with values 1 for the 38 ALL B-Cell samples, 2 for the 9 ALL T-Cell samples and 3 for the 25 AML samples.
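
The layout described above (one identifier column followed by one column per sample, plus a separate class vector) can be worked with as in the following minimal sketch. It uses a small made-up data frame rather than the real data; the names toy and toy.grp and all values are hypothetical, and the sketch assumes the class vector is aligned with the sample columns in column order.

```r
# Toy stand-in for the documented layout: a Gene column followed by
# sample columns named by sample number. Values are invented.
toy <- data.frame(
  Gene = c("g1", "g2", "g3"),
  `1` = c(120, 15, 300),
  `2` = c(98, 22, 280),
  `3` = c(410, 9, 35),
  `4` = c(390, 12, 40),
  check.names = FALSE,       # keep numeric column names such as "1"
  stringsAsFactors = FALSE
)

# Hypothetical class labels, one per sample column (assumed to be in column order).
toy.grp <- c(1, 1, 3, 3)

# Separate the identifiers from the numeric expression values.
expr <- as.matrix(toy[, -1])
rownames(expr) <- toy$Gene

# Number of samples per class.
class.sizes <- table(toy.grp)

# Mean expression of each gene within each class.
class.means <- sapply(
  split(seq_along(toy.grp), toy.grp),
  function(idx) rowMeans(expr[, idx, drop = FALSE])
)
print(class.sizes)
print(class.means)
```

With the mpm package installed, the real objects can presumably be loaded with data(Golub, package = "mpm") and used in place of toy and toy.grp; whether Golub.grp needs its own data() call is an assumption to check against the installed package.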

Details

The original data of Golub et al. (1999) were preprocessed as follows: genes that were called 'absent' in all samples were removed from the data sets, since these measurements are considered unreliable by the manufacturer of the technology. Negative measurements in the data were set to 1.

The resulting data frame contains 5327 genes of the 6817 originally reported by Golub et al. (1999).
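
The two preprocessing steps described above can be sketched as follows. This is an illustration on invented data, not the code used to build the data set: the matrices raw and calls, the "P"/"A" coding of the detection calls, and the sample names are all assumptions.

```r
# Hypothetical raw intensities (genes in rows, samples in columns).
raw <- matrix(
  c(250, -12,  80,
     -5, -30,  14,
    600, 410,  -7),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("g1", "g2", "g3"), c("s1", "s2", "s3"))
)

# Hypothetical detection calls: "P" = present, "A" = absent.
calls <- matrix(
  c("P", "A", "P",
    "A", "A", "A",
    "P", "P", "A"),
  nrow = 3, byrow = TRUE,
  dimnames = dimnames(raw)
)

# Step 1: drop genes that are called absent in every sample.
keep <- rowSums(calls != "A") > 0
filtered <- raw[keep, , drop = FALSE]

# Step 2: set negative measurements to 1.
filtered[filtered < 0] <- 1

print(filtered)
```

In this toy case g2 is removed because it is absent in all three samples, and the remaining negative values are replaced by 1.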

References

Wouters, L. et al. (2003). Graphical Exploration of Gene Expression Data: A Comparative Study of Three Multivariate Methods. Biometrics, 59, 1131-1139.