The method implemented in this package accounts for the impact of dependence on multiple testing procedures for high-throughput data, as proposed by Friguet et al. (2009). The common information shared by all the variables is modeled by a factor analysis structure. The number of factors in the model is chosen to reduce the variance of the number of false discoveries. The model parameters are estimated using an EM algorithm. Factor-adjusted test statistics are derived, along with the associated p-values. The proportion of true null hypotheses (an important parameter when controlling the false discovery rate) is also estimated from the FAMT model. Diagnostic plots are provided to interpret and describe the factors.
| Field | Value |
|---|---|
| Package | FAMT |
| Type | Package |
| Version | 1.0 |
| Date | 2010-05-03 |
| License | GPL |
| LazyLoad | yes |
The as.FAMTdata function creates a single R object containing the data stored:
- in one mandatory data frame: the 'expression' dataset, with m rows (one per test) and n columns (n is the sample size), containing the observations of the responses.
- and in two optional data frames: the 'covariates' dataset, with n rows and at least 2 columns, one identifying the samples so that 'covariates' can be matched to 'expression' and the other(s) containing the observations of at least one covariate; and the 'annotations' dataset, with m rows and at least one column identifying the variables (ID), which can be provided to help interpret the factors.
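As a rough illustration of the shapes described above, the sketch below builds toy versions of the three data frames in base R. The object names (expr, covar, annot), the column names ArrayID and Group, and the simulated values are all hypothetical. The final call assumes that as.FAMTdata accepts covariates, annotations and idcovar arguments; the function's help page is the authority on its exact signature.

```r
set.seed(1)
m <- 50   # number of tests (one row of 'expression' per test)
n <- 12   # sample size (one column of 'expression' per sample)

# 'expression': m rows and n columns of simulated responses (hypothetical data)
expr <- as.data.frame(matrix(rnorm(m * n), nrow = m, ncol = n))
colnames(expr) <- paste0("S", seq_len(n))
rownames(expr) <- paste0("V", seq_len(m))

# 'covariates': n rows; the first column matches the column names of 'expression',
# the second holds one covariate
covar <- data.frame(
  ArrayID = colnames(expr),
  Group   = factor(rep(c("A", "B"), each = n / 2))
)

# 'annotations': m rows with an ID column identifying the variables
annot <- data.frame(ID = rownames(expr))

# Check the dimensions stated in the text before handing the data to FAMT
stopifnot(nrow(covar) == ncol(expr), nrow(annot) == nrow(expr))

# Only attempted when FAMT is installed; argument names are assumptions
if (requireNamespace("FAMT", quietly = TRUE)) {
  toy <- FAMT::as.FAMTdata(expr, covariates = covar,
                           annotations = annot, idcovar = 1)
}
```

Here idcovar is assumed to give the position of the column in 'covariates' that identifies the samples.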
The whole multiple testing procedure is provided in a single function, modelFAMT, but you can also choose to apply the procedure step by step, using the functions:
- nbfactors (estimation of the optimal number of factors)
- emfa (EM fitting of the factor analysis model).
The modelFAMT function also provides the individual test statistics and corresponding p-values, like the raw.pvalues function.
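The two routes described above, a single call or a step-by-step fit, might look roughly as follows. This is a sketch under several assumptions: that the package ships example datasets named expression, covariates and annotations; that the fitting functions take x (columns of 'covariates' used as explanatory variables) and test (the column under test); and that nbfactors returns its estimate in an element named optimalnbfactors. The specific column indices are recalled from the package's examples and should be checked against the help pages.

```r
# Guard so the sketch does nothing when FAMT is not installed
famt_available <- requireNamespace("FAMT", quietly = TRUE)

if (famt_available) {
  library(FAMT)

  # Example datasets assumed to be bundled with the package
  data(expression)
  data(covariates)
  data(annotations)
  chicken <- as.FAMTdata(expression, covariates, annotations, idcovar = 2)

  # Route 1: the whole procedure in one call
  model <- modelFAMT(chicken, x = c(3, 6), test = 6)

  # Route 2: step by step
  nbf <- nbfactors(chicken, x = c(3, 6), test = 6)   # number of factors
  fa  <- emfa(chicken, nbf = nbf$optimalnbfactors,   # EM fit of the factor model
              x = c(3, 6), test = 6)

  # Unadjusted tests, for comparison with the factor-adjusted ones
  raw <- raw.pvalues(chicken, x = c(3, 6), test = 6)
}
```

Estimating the number of factors can take a while on large datasets; if modelFAMT accepts an nbf argument, as assumed here for emfa, passing it would skip that step.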
The summaryFAMT function provides some key elements of classical summaries for either a 'FAMTdata' or a 'FAMTmodel' object.
The estimation of the proportion of true null hypotheses from a 'FAMTmodel' is done by the pi0FAMT function.
The defacto function provides diagnostic plots to interpret and describe the factors.
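Once a model has been fitted, the three post-processing functions above could be chained as in the sketch below. It relies on the same assumptions as any use of the package's example data: the dataset names, the x, test, nbf and idcovar arguments, and the column indices are recalled from the package's examples rather than stated on this page, and each function is called with its defaults.

```r
# Guard so the sketch does nothing when FAMT is not installed
famt_available <- requireNamespace("FAMT", quietly = TRUE)

if (famt_available) {
  library(FAMT)

  # Example datasets assumed to be bundled with the package
  data(expression)
  data(covariates)
  data(annotations)
  chicken <- as.FAMTdata(expression, covariates, annotations, idcovar = 2)
  model   <- modelFAMT(chicken, x = c(3, 6), test = 6, nbf = 3)

  summaryFAMT(chicken)     # summary of a 'FAMTdata' object
  summaryFAMT(model)       # summary of a 'FAMTmodel' object

  pi0 <- pi0FAMT(model)    # proportion of true null hypotheses
  fac <- defacto(model)    # diagnostic plots describing the factors
}
```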
Causeur D., Friguet C., Houee-Bigot M., Kloareg M. (2011). Factor Analysis for Multiple Testing (FAMT): An R Package for Large-Scale Significance Testing Under Dependence. Journal of Statistical Software, 40(14), 1-19. https://www.jstatsoft.org/v40/i14
Friguet C., Kloareg M., Causeur D. (2009). A factor model approach to multiple testing under dependence. Journal of the American Statistical Association, 104(488), 1406-1415.