sam(data, cl, method = d.stat, control = samControl(), gene.names = dimnames(data)[[1]], ...)
data: Each row of data (or exprs(data), respectively) must correspond to a variable (e.g., a gene), and each column to a sample (i.e., an observation). data can also be a list (if method = chisq.stat or method = trend.stat). For details on how to specify data in this case, see chisq.stat.
cl: a vector of length ncol(data)
containing the class labels of the samples. In the two-class paired case, cl can also be a matrix with ncol(data) rows and 2 columns. If data is an ExpressionSet object, cl can also be a character string naming the column of pData(data) that contains the class labels of the samples. If data is a list, cl need not be specified.
In the one-class case, cl should be a vector of 1's.
In the two-class unpaired case, cl should be a vector containing 0's (specifying the samples of, e.g., the control group) and 1's (specifying, e.g., the case group).
In the two-class paired case, cl can be either a numeric vector or a numeric matrix. If it is a vector, then cl has to consist of the integers between -1 and $-n/2$ (e.g., before treatment group) and between 1 and $n/2$ (e.g., after treatment group), where $n$ is the length of cl and $k$ is paired with $-k$, $k=1,\dots,n/2$. If cl is a matrix, one column should contain -1's and 1's specifying, e.g., the before and the after treatment samples, respectively, and the other column should contain integers between 1 and $n/2$ specifying the $n/2$ pairs of observations.
In the multiclass case, and also if method = chisq.stat, cl should be a vector containing integers between 1 and $g$, where $g$ is the number of groups. (In the case of chisq.stat, cl need not be specified if data is a list of groupwise matrices.)
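
The encodings of cl described above can be sketched in base R as follows. This is only an illustration for a hypothetical data set with 8 samples (and, in the multiclass case, 3 groups); all variable names are assumptions made for the example and are not part of siggenes.

```r
# Hypothetical number of samples (columns of data); even, so that pairs exist.
n <- 8

# One-class case: a vector of 1's.
cl.one <- rep(1, n)

# Two-class unpaired case: 0's (e.g., controls) and 1's (e.g., cases).
cl.unpaired <- rep(c(0, 1), each = n / 2)

# Two-class paired case, vector form: sample -k (e.g., before treatment)
# is paired with sample k (e.g., after treatment), k = 1, ..., n/2.
cl.paired <- c(-(1:(n / 2)), 1:(n / 2))

# Two-class paired case, matrix form: n rows and 2 columns. One column holds
# -1/1 (before/after), the other the pair number 1, ..., n/2.
cl.paired.mat <- cbind(rep(c(-1, 1), each = n / 2),
                       rep(1:(n / 2), times = 2))

# Multiclass case: integers between 1 and g, here g = 3 hypothetical groups.
cl.multi <- c(1, 1, 1, 2, 2, 2, 3, 3)
```

In the vector form of the paired case, the order of the entries must match the order of the columns of data, so that the column labelled -k and the column labelled k belong to the same pair.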
For examples of how cl can be specified, see the manual of siggenes.
method: the method used to compute the expression scores. If method = d.stat, a modified t-statistic or F-statistic, respectively, will be computed as proposed by Tusher et al. (2001).
If method = wilc.stat, a Wilcoxon rank sum statistic or Wilcoxon signed rank statistic will be used as expression score.
For an analysis of categorical data such as SNP data, method can be set to chisq.stat. In this case Pearson's chi-square statistic is computed for each row.
If the variables are ordinal and a trend test should be applied (e.g., in the two-class case, the Cochran-Armitage trend test), method = trend.stat can be employed.
It is also possible to use a user-written function to compute the expression scores. For details, see Details.
control: further arguments for controlling the SAM analysis; see samControl.
gene.names: a vector of length nrow(data) containing the names of the genes. By default the row names of data are used.
...: further arguments passed to the chosen method. If method = d.stat, see the help of d.stat. If method = wilc.stat, see the help of wilc.stat. If method = chisq.stat, see the help of chisq.stat.
sam provides SAM procedures for several types of analysis (one and two class analyses with either a modified t-statistic or a Wilcoxon rank statistic, a multiclass analysis with a modified F statistic, and an analysis of categorical data). It is, however, also possible to write your own function for another type of analysis. The required arguments of this function must be data and cl. This function can also have other arguments. The output of this function must be a list containing the following objects:
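d: a numeric vector containing the expression scores of the genes.
d.bar: a numeric vector with the same length as na.exclude(d) specifying the expected expression scores under the null hypothesis.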
p.value: a numeric vector of the same length as d containing the raw, unadjusted p-values of the genes.
vec.false: a numeric vector of the same length as d consisting of the one-sided numbers of falsely called genes, i.e. if $d > 0$ the numbers of genes expected to be larger than $d$ under the null hypothesis, and if $d < 0$, the numbers of genes expected to be smaller than $d$ under the null hypothesis.
s: a numeric vector of the same length as d containing the standard deviations of the genes. If no standard deviation can be calculated, set s = numeric(0).
s0: a numeric value specifying the fudge factor. If no fudge factor is computed, set s0 = numeric(0).
mat.samp: a matrix with B rows and ncol(data) columns, where B is the number of permutations, containing the permutations used in the computation of the permuted d-values. If such a matrix is not computed, set mat.samp = matrix(numeric(0)).
msg: a character string or vector. msg is printed when the function print or summary, respectively, is called. If no such message should be printed, set msg = "".
fold: a numeric vector of the same length as d consisting of the fold changes of the genes. If no fold change has been computed, set fold = numeric(0).
If this function is, e.g., called foo, it can be used by setting method = foo in sam. More detailed information and an example will be contained in the siggenes manual.
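
A minimal sketch of such a user-written function is given below, assuming two-class unpaired data with cl coded as 0's and 1's, no missing values, non-constant rows, and more than one row. The ordinary t-statistic, the permutation scheme, the extra arguments B and seed, and the way d.bar, p.value, and vec.false are computed are illustrative assumptions based on the descriptions above; they are not the computations that siggenes performs in d.stat.

```r
# Hypothetical score function: ordinary two-sample t-statistic (no fudge
# factor) with B random permutations of the class labels.
foo <- function(data, cl, B = 100, seed = 123) {
  data <- as.matrix(data)

  # Row-wise t-statistic and its denominator for a given 0/1 labelling.
  t.stat <- function(x, lab) {
    n0 <- sum(lab == 0)
    n1 <- sum(lab == 1)
    x0 <- x[, lab == 0, drop = FALSE]
    x1 <- x[, lab == 1, drop = FALSE]
    m0 <- rowMeans(x0)
    m1 <- rowMeans(x1)
    ss <- rowSums((x0 - m0)^2) + rowSums((x1 - m1)^2)
    s <- sqrt(ss / (n0 + n1 - 2) * (1 / n0 + 1 / n1))
    list(d = (m1 - m0) / s, s = s)
  }

  obs <- t.stat(data, cl)
  d <- obs$d

  set.seed(seed)
  # B x ncol(data) matrix; each row is one permutation of the class labels.
  mat.samp <- t(replicate(B, sample(cl)))
  # nrow(data) x B matrix of permuted scores.
  d.perm <- apply(mat.samp, 1, function(lab) t.stat(data, lab)$d)

  # Expected scores: sorted permuted scores averaged over the permutations.
  d.bar <- rowMeans(apply(d.perm, 2, sort))
  # Two-sided permutation p-values, pooled over all rows and permutations.
  p.value <- sapply(d, function(z) mean(abs(d.perm) >= abs(z)))
  # One-sided numbers of falsely called genes, averaged over permutations.
  vec.false <- sapply(d, function(z) {
    if (z > 0) sum(d.perm > z) / B else sum(d.perm < z) / B
  })

  list(d = d, d.bar = d.bar, p.value = p.value, vec.false = vec.false,
       s = obs$s, s0 = numeric(0), mat.samp = mat.samp,
       msg = "Hypothetical example: ordinary t-statistic\n",
       fold = numeric(0))
}

# Direct call on simulated data (50 rows, 8 samples) to inspect the output.
set.seed(1)
x <- matrix(rnorm(50 * 8), nrow = 50)
cl <- rep(c(0, 1), each = 4)
out <- foo(x, cl, B = 50)
str(out)
```

Calling foo directly, as done here, is a convenient way to check that the returned list has the required components and lengths before passing the function to sam via method = foo.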
Schwender, H. (2004). Modifying Microarray Analysis Methods for Categorical Data -- SAM and PAM for SNPs. To appear in: Proceedings of the 28th Annual Conference of the GfKl.
Tusher, V.G., Tibshirani, R., and Chu, G. (2001). Significance analysis of microarrays applied to the ionizing radiation response. PNAS, 98, 5116-5121.
SAM-class, d.stat, wilc.stat, chisq.stat, samControl