wilc.ebam: EBAM Analysis Using Wilcoxon Rank Statistics

Description

Generates the statistics required for an Empirical Bayes Analysis of Microarrays (EBAM) using standardized Wilcoxon rank statistics. Should not be called directly, but via ebam(..., method = wilc.ebam).

Usage

wilc.ebam(data, cl, approx50 = TRUE, ties.method = c("min", "random",  "max"), use.offset = TRUE, df.glm = 5, use.row = FALSE, rand = NA)
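As the description notes, the function is meant to be passed to ebam rather than called on its own. The following is a minimal sketch of such a call, assuming the siggenes package (which provides ebam and wilc.ebam) is installed. The simulated matrix, the gene names, the 0/1 class coding, and the seed value are illustrative assumptions and do not come from this help page.

```r
# Illustrative only: simulated data, not a real microarray experiment.
set.seed(1)
n.genes <- 200
n.samples <- 20
x <- matrix(rnorm(n.genes * n.samples), nrow = n.genes)  # rows = genes, columns = samples
rownames(x) <- paste0("gene", seq_len(n.genes))          # hypothetical gene names
cl <- rep(c(0, 1), each = n.samples / 2)                 # assumed two-class unpaired coding
x[1:10, cl == 1] <- x[1:10, cl == 1] + 2                 # shift the first 10 genes in class 1

# Only attempt the analysis if siggenes is available.
if (requireNamespace("siggenes", quietly = TRUE)) {
  library(siggenes)
  # wilc.ebam is passed as the method; rand is forwarded to it through "..."
  out <- ebam(x, cl, method = wilc.ebam, rand = 123)
  print(out)
}
```

With only 10 samples per group, the exact null distribution would be used regardless of approx50.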

Arguments

data

a matrix or a data frame. Each row of data must correspond to a variable (e.g., a gene), and each column to a sample (i.e., an observation).

cl

a numeric vector of length ncol(data) containing the class labels of the samples. In the two class paired case, cl can also be a matrix with ncol(data) rows and 2 columns. For details on how cl should be specified, see ebam.

approx50

if TRUE, the null distribution will be approximated by the standard normal distribution. Otherwise, the exact null distribution is computed. This argument will automatically be set to FALSE if there are fewer than 50 samples in each of the groups.
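To give a feel for the difference between the two options, the following base R sketch compares the exact null distribution of a rank sum with its normal approximation. It is not the code used by wilc.ebam; the group sizes m and n are arbitrary, and no ties are assumed.

```r
# Exact null distribution of the rank sum W versus its normal approximation.
m <- 8
n <- 10
u <- 0:(m * n)                           # attainable values of the Mann-Whitney count U
w <- u + m * (m + 1) / 2                 # corresponding rank sums W of the group of size m
exact <- dwilcox(u, m, n)                # exact null probabilities P(W = w)

mu <- m * (m + n + 1) / 2                # null mean of W
sigma <- sqrt(m * n * (m + n + 1) / 12)  # null standard deviation of W
z <- (w - mu) / sigma                    # standardized rank sums
normal <- dnorm(z) / sigma               # normal approximation to P(W = w)

# Largest pointwise difference between the two null distributions.
max(abs(exact - normal))
```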

ties.method

either "min" (default), "random", or "max". If "random", the ranks of ties are randomly assigned. If "min" or "max", the ranks of ties are set to the minimum or maximum rank, respectively. For details, see the help page of rank. If use.row = TRUE, then ties.method = "max" is used. For the handling of zeros, see Details.
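The effect of the three options can be seen directly with the base R function rank. The vector v below is an arbitrary example with two groups of ties.

```r
v <- c(3.1, 2.0, 3.1, 5.4, 2.0, 2.0)

# Tied values all receive the smallest rank of their group.
r.min <- rank(v, ties.method = "min")     # 4 1 4 6 1 1

# Tied values all receive the largest rank of their group.
r.max <- rank(v, ties.method = "max")     # 5 3 5 6 3 3

# Tied values receive the ranks of their group in random order,
# so the result depends on the state of the random number generator.
set.seed(123)
r.random <- rank(v, ties.method = "random")
```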

use.offset

should an offset be used in the Poisson regression employed to estimate the density of the observed Wilcoxon rank sums? If TRUE, the log-transformed values of the null density are used as offset.

df.glm

integer specifying the degrees of freedom of the natural cubic spline employed in the Poisson regression.
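The roles of use.offset and df.glm can be illustrated with a generic base R sketch of a Poisson regression on the counts of observed rank sums. This is an assumption about the general technique, not the code used by wilc.ebam; the simulated data, the group sizes, and all variable names are hypothetical.

```r
library(splines)  # ns(): natural cubic spline basis, shipped with R

set.seed(42)
m <- 10
n <- 10
x <- matrix(rnorm(2000 * (m + n)), nrow = 2000)  # 2000 simulated genes
x[1:200, 1:m] <- x[1:200, 1:m] + 1.5             # 200 genes shifted in group 1

# Rank sum of the first m samples for each gene (continuous data, so no ties).
W <- apply(x, 1, function(g) sum(rank(g)[1:m]))

# Count how often each attainable rank sum was observed.
w.min <- m * (m + 1) / 2
w.vals <- w.min:(w.min + m * n)
counts <- as.vector(table(factor(W, levels = w.vals)))

# Log of the exact null density, used as offset (analogue of use.offset = TRUE).
log.f0 <- log(dwilcox(w.vals - w.min, m, n))

# Poisson regression on a natural cubic spline with 5 degrees of freedom
# (analogue of df.glm = 5).
fit <- glm(counts ~ ns(w.vals, df = 5) + offset(log.f0), family = poisson)

# Fitted counts rescaled to an estimate of the density of the observed rank sums.
f.hat <- fitted(fit) / length(W)
```

With the offset, the spline models the log ratio of the observed density to the null density rather than the log density itself.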

use.row

if TRUE, rowWilcoxon is used to compute the Wilcoxon rank statistics.

rand

numeric value. If specified, i.e., not NA, the random number generator will be set to a reproducible state.

Value

A list of statistics required by ebam.

Details

Standardized versions of the Wilcoxon rank statistics are computed. This means that $W^* = (W - \mathrm{mean}(W)) / \mathrm{sd}(W)$ is used as expression score $z$, where $W$ is the usual Wilcoxon rank sum statistic or the Wilcoxon signed rank statistic, respectively. In the computation of these statistics, the ranks of ties are by default set to the minimum rank. In the computation of the Wilcoxon signed rank statistic, zeros are randomly set to either a very small positive or a very small negative value. If there are fewer than 50 observations in each of the groups, the exact null distribution will be used. If there are more than 50 observations in at least one group, the null distribution will by default be approximated by the standard normal distribution. It is, however, still possible to compute the exact null distribution by setting approx50 to FALSE.
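The standardization can be sketched for a single variable in base R. The sketch assumes that mean(W) and sd(W) denote the mean and standard deviation of the statistic under the null hypothesis without ties. The simulated values and group sizes are illustrative, and this is not the code used by wilc.ebam.

```r
set.seed(7)

# Two-class unpaired case: rank sum of the first group.
m <- 6
n <- 8
g <- c(rnorm(m, mean = 1), rnorm(n))           # one gene: m samples in group 1, n in group 2
W <- sum(rank(g, ties.method = "min")[1:m])    # Wilcoxon rank sum statistic
mu.W <- m * (m + n + 1) / 2                    # null mean of W
sd.W <- sqrt(m * n * (m + n + 1) / 12)         # null standard deviation of W
z.W <- (W - mu.W) / sd.W                       # standardized score

# Two-class paired case: signed rank statistic of the pairwise differences.
k <- 9
d <- rnorm(k, mean = 0.5)                      # differences within k pairs (no zeros here)
V <- sum(rank(abs(d))[d > 0])                  # Wilcoxon signed rank statistic
mu.V <- k * (k + 1) / 4                        # null mean of V
sd.V <- sqrt(k * (k + 1) * (2 * k + 1) / 24)   # null standard deviation of V
z.V <- (V - mu.V) / sd.V                       # standardized score
```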

References

Efron, B., Storey, J.D., Tibshirani, R. (2001). Microarrays, empirical Bayes methods, and the false discovery rate. Technical Report, Department of Statistics, Stanford University.