st (version 1.2.7)

regularizedt: Various (Regularized) t Statistics

Description

These functions provide a simple interface to a variety of (regularized) t statistics that are commonly used in the analysis of high-dimensional case-control studies.

Usage

efront.stat(X, L, verbose=TRUE)
efront.fun(L, verbose=TRUE)
sam.stat(X, L)
sam.fun(L)
samL1.stat(X, L, method=c("lowess", "cor"), plot=FALSE, verbose=TRUE)
samL1.fun(L, method=c("lowess", "cor"), plot=FALSE, verbose=TRUE)
modt.stat(X, L)
modt.fun(L)

Arguments

X

data matrix. Note that the columns correspond to variables ("genes") and the rows to samples.

L

factor containing class labels for the two groups.

method

determines how the smoothing parameter is estimated (applies only to the improved SAM statistic, samL1).

plot

output a diagnostic plot (applies only to the improved SAM statistic, samL1).

verbose

print out some (more or less useful) information during computation.
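The orientation of X is easy to get wrong, since expression data are often stored with genes in rows. A minimal sketch in base R, with made-up names and sizes (nothing here comes from the package), of preparing X and L in the expected shape:

```r
# Simulated data stored genes-in-rows; names and sizes are invented
# purely for illustration.
set.seed(1)
genes_by_samples <- matrix(rnorm(5 * 6), nrow = 5, ncol = 6,
                           dimnames = list(paste0("gene", 1:5),
                                           paste0("s", 1:6)))

# The functions on this page expect samples in rows, variables in columns.
X <- t(genes_by_samples)

# One class label per row of X, two groups in total.
L <- factor(rep(c("control", "treated"), each = 3))

dim(X)  # 6 5
stopifnot(nrow(X) == length(L), nlevels(L) == 2)
```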

Value

The *.stat functions directly return the respective statistic for each variable.

The corresponding *.fun functions return a function that produces the respective statistics when applied to a data matrix (this is very useful for simulations).
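The pattern behind the *.fun variants can be sketched in base R. The factory below, welcht_fun, is hypothetical and not part of the package; it uses an ordinary Welch t statistic only to show how fixing L once yields a function of X that can be reused across simulated data sets.

```r
# Hypothetical factory in the style of the *.fun functions: it fixes the
# labels L and returns a function that takes only the data matrix X.
# The statistic is a plain Welch t, computed for every column of X.
welcht_fun <- function(L) {
  L <- factor(L)
  stopifnot(nlevels(L) == 2)
  function(X) {
    g1 <- X[L == levels(L)[1], , drop = FALSE]
    g2 <- X[L == levels(L)[2], , drop = FALSE]
    se <- sqrt(apply(g1, 2, var) / nrow(g1) + apply(g2, 2, var) / nrow(g2))
    (colMeans(g1) - colMeans(g2)) / se
  }
}

L <- rep(c("group 1", "group 2"), each = 3)
stat <- welcht_fun(L)  # build the scoring function once

# Reuse it on 100 simulated data sets sharing the same design
# (6 samples, 20 variables each).
set.seed(42)
scores <- replicate(100, stat(matrix(rnorm(6 * 20), nrow = 6)))
dim(scores)  # 20 100
```

With the package installed, the same idea would presumably apply to, for example, efront.fun(L) in place of welcht_fun(L).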

Details

efront.* computes the t statistic using the 90% rule of Efron et al. (2001).
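As a rough illustration, the sketch below assumes that the 90% rule means adding the 90th percentile of the per-variable standard errors to every denominator. That reading, the pooled-variance standard error, and the function name efron_sketch are all assumptions made here; this is not the package's code.

```r
# Hypothetical regularized t statistic: difference in means divided by
# (standard error + a0), where a0 is ASSUMED to be the 90th percentile
# of the standard errors across all variables.
efron_sketch <- function(X, L) {
  L <- factor(L)
  stopifnot(nlevels(L) == 2)
  g1 <- X[L == levels(L)[1], , drop = FALSE]
  g2 <- X[L == levels(L)[2], , drop = FALSE]
  n1 <- nrow(g1)
  n2 <- nrow(g2)
  d <- colMeans(g1) - colMeans(g2)
  # Pooled variance per variable (an assumption of this sketch).
  v <- ((n1 - 1) * apply(g1, 2, var) + (n2 - 1) * apply(g2, 2, var)) /
    (n1 + n2 - 2)
  se <- sqrt(v * (1 / n1 + 1 / n2))
  a0 <- quantile(se, probs = 0.9, names = FALSE)
  d / (se + a0)
}

set.seed(7)
X <- matrix(rnorm(6 * 50), nrow = 6)
L <- rep(c("a", "b"), each = 3)
z <- efron_sketch(X, L)
length(z)  # 50
```

Because a0 is positive, each regularized score is smaller in magnitude than the corresponding ordinary t statistic, and the shrinkage is strongest for variables with small standard errors.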

sam.* computes the SAM t statistic of Tusher et al. (2001). Note that this requires the additional installation of the "samr" package.

samL1.* computes the improved SAM t statistic of Wu (2005). Note that part of the code in this function is based on the R code provided by B. Wu.

modt.* computes the moderated t statistic of Smyth (2004). Note that this requires the additional installation of the "limma" package.

All of the above statistics are compared with each other and with the shrinkage t statistic in Opgen-Rhein and Strimmer (2007).
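For an informal look at how the rankings relate, the top-10 variable indices reported in the Examples section of this page can be compared directly. The snippet below only copies those printed indices; it says nothing about which ranking is more accurate.

```r
# Top-10 indices copied from the output comments in the Examples section.
efron  <- c(4790, 10979, 11068, 1022, 50, 724, 5762, 43, 10936, 9939)
modt   <- c(4790, 10979, 1022, 5762, 35, 50, 11068, 970, 10905, 43)
shrink <- c(10979, 11068, 50, 1022, 724, 5762, 43, 4790, 10936, 9939)

# Same ten variables in both lists, only their order differs.
length(intersect(efron, shrink))  # 10

# Seven variables are shared with the moderated t top 10.
length(intersect(efron, modt))    # 7

# Variables in the moderated t top 10 that the Efron top 10 lacks.
setdiff(modt, efron)              # 35 970 10905
```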

References

Opgen-Rhein, R., and K. Strimmer. 2007. Accurate ranking of differentially expressed genes by a distribution-free shrinkage approach. Statist. Appl. Genet. Mol. Biol. 6:9. <DOI:10.2202/1544-6115.1252>

See Also

diffmean.stat, studentt.stat, shrinkt.stat, shrinkcat.stat.

Examples

# NOT RUN {
# load st package
library("st")

# load Choe et al. (2005) data
data(choedata)
X <- choe2.mat
dim(X) # 6 11475  
L <- choe2.L
L

# L may also be given as descriptive class labels
L = c("group 1", "group 1", "group 1", "group 2", "group 2", "group 2")


# Efron t statistic (90 % rule)
score = efront.stat(X, L)
order(score^2, decreasing=TRUE)[1:10]
# [1]  4790 10979 11068  1022    50   724  5762    43 10936  9939

# sam statistic
# (requires "samr" package)
#score = sam.stat(X, L)
#order(score^2, decreasing=TRUE)[1:10]
#[1]  4790 10979  1022  5762    35   970    50 11068 10905  2693

# improved sam statistic
#score = samL1.stat(X, L)
#order(score^2, decreasing=TRUE)[1:10]
#[1]  1  2  3  4  5  6  7  8  9 10
# here all scores are zero!

# moderated t statistic
# (requires "limma" package)
#score = modt.stat(X, L)
#order(score^2, decreasing=TRUE)[1:10]
# [1]  4790 10979  1022  5762    35    50 11068   970 10905    43

# shrinkage t statistic
score = shrinkt.stat(X, L)
order(score^2, decreasing=TRUE)[1:10]
#[1] 10979 11068    50  1022   724  5762    43  4790 10936  9939
# }