Learn R Programming

limma (version 3.28.14)

squeezeVar: Squeeze Sample Variances

Description

Squeeze a set of sample variances together by computing empirical Bayes posterior means.

Usage

squeezeVar(var, df, covariate=NULL, robust=FALSE, winsor.tail.p=c(0.05,0.1))

Arguments

var
numeric vector of independent sample variances.
df
numeric vector of degrees of freedom for the sample variances.
covariate
if non-NULL, var.prior will depend on this numeric covariate. Otherwise, var.prior is constant.
robust
logical, should the estimation of df.prior and var.prior be robustified against outlier sample variances?
winsor.tail.p
numeric vector of length 1 or 2, giving left and right tail proportions of x to Winsorize. Used only when robust=TRUE.

Value

A list with components
var.post
numeric vector of posterior variances.
var.prior
location of prior distribution. A vector if covariate is non-NULL, otherwise a scalar.
df.prior
degrees of freedom of prior distribution. A vector if robust=TRUE, otherwise a scalar.

Details

This function implements an empirical Bayes algorithm proposed by Smyth (2004).

A conjugate Bayesian hierarchical model is assumed for a set of sample variances. The hyperparameters are estimated by fitting a scaled F-distribution to the sample variances. The function returns the posterior variances and the estimated hyperparameters.

Specifically, the sample variances var are assumed to follow scaled chi-squared distributions, conditional on the true variances, and an scaled inverse chi-squared prior is assumed for the true variances. The scale and degrees of freedom of this prior distribution are estimated from the values of var.

The effect of this function is to squeeze the variances towards a common value, or to a global trend if a covariate is provided. The squeezed variances have a smaller expected mean square error to the true variances than do the sample variances themselves.

If covariate is non-null, then the scale parameter of the prior distribution is assumed to depend on the covariate. If the covariate is average log-expression, then the effect is an intensity-dependent trend similar to that in Sartor et al (2006).

robust=TRUE implements the robust empirical Bayes procedure of Phipson et al (2016) which allows some of the var values to be outliers.

References

Phipson, B, Lee, S, Majewski, IJ, Alexander, WS, and Smyth, GK (2016). Robust hyperparameter estimation protects against hypervariable genes and improves power to detect differential expression. Annals of Applied Statistics 10. http://arxiv.org/abs/1602.08678

Sartor MA, Tomlinson CR, Wesselkamper SC, Sivaganesan S, Leikauf GD, Medvedovic M (2006). Intensity-based hierarchical Bayes method improves testing for differentially expressed genes in microarray experiments. BMC bioinformatics 7, 538.

Smyth, G. K. (2004). Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. Statistical Applications in Genetics and Molecular Biology, 3, No. 1, Article 3. http://www.statsci.org/smyth/pubs/ebayes.pdf

See Also

This function is called by ebayes.

This function calls fitFDist.

An overview of linear model functions in limma is given by 06.LinearModels.

Examples

Run this code
s2 <- rchisq(20,df=5)/5
squeezeVar(s2, df=5)

Run the code above in your browser using DataLab