Usage
"ihw"(pvalues, covariates, alpha, covariate_type = "ordinal", nbins = "auto", m_groups = NULL, quiet = TRUE, nfolds = 5L, nfolds_internal = 5L, nsplits_internal = 1L, lambdas = "auto", seed = 1L, distrib_estimator = "grenander", lp_solver = "lpsymphony", adjustment_type = "BH", return_internal = FALSE, ...)
"ihw"(formula, data = parent.frame(), ...)
Arguments
pvalues
Numeric vector of unadjusted p-values.
covariates
Vector containing the one-dimensional covariate for each test. The covariate must be independent
of the p-value under the null hypothesis. Can be numeric or a factor. (If numeric, it will be
converted into a factor by binning.)
alpha
Numeric, sets the nominal level for FDR control.
covariate_type
"ordinal" or "nominal" (i.e. whether covariates can be sorted in increasing order or not)
nbins
Integer, number of groups into which the p-values will be split based on the covariate. Use "auto" for
automatic selection of the number of bins. Only applicable when covariates is not a factor.
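
The binning of a numeric covariate described above can be pictured with a short base R sketch. It assumes equal-frequency bins built from ranks; bin_covariate() is a hypothetical helper written for illustration and is not a function of the package, whose internal binning may differ in detail.

```r
# Illustrative sketch only: equal-frequency binning of a numeric covariate.
# bin_covariate() is a hypothetical helper, not part of the package.
bin_covariate <- function(covariates, nbins) {
  # Rank first so that the bins end up with (nearly) equal sizes.
  ranks <- rank(covariates, ties.method = "first")
  # Map ranks 1..m onto bin indices 1..nbins.
  bin_index <- ceiling(ranks * nbins / length(covariates))
  factor(bin_index, levels = seq_len(nbins))
}

set.seed(1)
covariates <- runif(1000, min = 0, max = 10)
groups <- bin_covariate(covariates, nbins = 4)

# Number of hypotheses that fall into each bin.
bin_sizes <- table(groups)
print(bin_sizes)
```

With an ordinal covariate the bin order is meaningful, which is why the bins here follow the ranks of the covariate.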
m_groups
Integer vector of length equal to the number of levels of the covariates (only to be specified
when the latter is a factor/categorical). Each entry corresponds to the number of hypotheses to be tested in
each group (stratum). This argument is required when the complete vector of p-values is not
available and only the p-values below a given threshold were kept, for example because of memory constraints.
See the vignette for additional details and an example of how this principle can be applied with
numerical covariates.
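
As a rough illustration of the m_groups workflow, the sketch below counts the hypotheses per stratum before any p-values are discarded and then filters. The simulated data, the group labels, and the threshold of 0.1 are assumptions made for the example; the final call to ihw is shown as a comment only and is not run.

```r
# Illustrative sketch with simulated data; labels and threshold are arbitrary.
set.seed(1)
m <- 10000
group <- factor(sample(c("A", "B", "C"), m, replace = TRUE))
pvalue <- runif(m)

# Count hypotheses per stratum BEFORE discarding anything.
m_groups <- as.integer(table(group))

# Keep only the small p-values, e.g. to save memory.
keep <- pvalue <= 0.1
pvalue_filtered <- pvalue[keep]
# Subsetting a factor keeps all of its levels, so the order of the levels
# still matches the order of the entries in m_groups.
group_filtered <- group[keep]

# The filtered vectors would then be passed on together with m_groups
# (requires the package to be installed; not run here):
# ihw(pvalue_filtered, group_filtered, alpha = 0.1, m_groups = m_groups)
```

The counts must come from the unfiltered data: after filtering, the per-stratum totals can no longer be recovered from the p-values that remain.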
quiet
Logical; if FALSE, many progress messages are printed during the fitting stages.
nfolds
Integer, number of folds into which the p-values will be split for the pre-validation procedure.
nfolds_internal
Within each fold, a second (nested) layer of cross-validation can be conducted to choose a good
regularization parameter. This parameter controls the number of nested folds.
nsplits_internal
Integer, how many times to repeat the nfolds_internal splitting. Can lead to better regularization
parameter selection but makes ihw a lot slower.
lambdas
Numeric vector which defines the grid of possible regularization parameters.
Use "auto" for automatic selection.
seed
Integer or NULL. The hypotheses are split into folds at random, so a seed is set to make the output
of the function reproducible. Use NULL if you do not want a seed to be set.
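
The interplay of nfolds and seed can be sketched in base R as follows. assign_folds() is a hypothetical helper that mimics a seeded random split into folds; it is not the package's internal code, which may assign folds differently.

```r
# Illustrative sketch only: seeded random assignment of m hypotheses to folds.
# assign_folds() is a hypothetical helper, not part of the package.
assign_folds <- function(m, nfolds = 5L, seed = 1L) {
  # A NULL seed leaves the state of the random number generator untouched.
  if (!is.null(seed)) set.seed(seed)
  # Balanced fold labels 1..nfolds, shuffled at random.
  sample(rep_len(seq_len(nfolds), m))
}

folds_a <- assign_folds(20, nfolds = 5L, seed = 1L)
folds_b <- assign_folds(20, nfolds = 5L, seed = 1L)

# The same seed yields the same split.
same_split <- identical(folds_a, folds_b)
print(same_split)
print(table(folds_a))
```

With seed = NULL, repeated calls would generally produce different splits, and therefore possibly different weights and rejections.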
distrib_estimator
Character ("grenander" or "ECDF"). Only use this if you know what you are doing. ECDF with nfolds > 1
or lp_solver == "lpsymphony" will in general be excessively slow, except for very small problems.
lp_solver
Character ("lpsymphony" or "gurobi"). Internally, IHW solves a sequence of linear programs, which
can be solved with either of these solvers.
adjustment_type
Character ("BH" or "bonferroni") depending on whether you want to control FDR or FWER.
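
To make the difference between the two adjustment types concrete, here is a minimal base R sketch of a weighted multiple-testing adjustment, assuming the weights are already known and average to one. weighted_adjust() is a hypothetical helper for illustration; in the package the weights are learned from the data, which this sketch does not attempt.

```r
# Illustrative sketch only: weighted BH / Bonferroni with GIVEN weights.
# weighted_adjust() is a hypothetical helper, not part of the package.
weighted_adjust <- function(pvalues, weights,
                            adjustment_type = c("BH", "bonferroni")) {
  adjustment_type <- match.arg(adjustment_type)
  stopifnot(length(pvalues) == length(weights), all(weights >= 0))
  # Divide each p-value by its weight; a zero weight means the hypothesis
  # can never be rejected, so its weighted p-value is set to 1.
  weighted_p <- ifelse(weights > 0, pmin(pvalues / weights, 1), 1)
  p.adjust(weighted_p, method = adjustment_type)
}

pvalues <- c(0.001, 0.01, 0.02, 0.04, 0.5)
weights <- c(2, 1.5, 1, 0.5, 0)  # the weights average to 1

adj_bh <- weighted_adjust(pvalues, weights, "BH")
adj_bonf <- weighted_adjust(pvalues, weights, "bonferroni")

# Hypotheses rejected at a nominal level of 0.05 under each adjustment.
print(which(adj_bh <= 0.05))
print(which(adj_bonf <= 0.05))
```

Bonferroni-type adjustment controls the FWER and is therefore the more conservative of the two: its adjusted p-values are never smaller than the BH ones.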
return_internal
Logical; if TRUE, a lower-level representation of the output is returned (only useful for debugging purposes).
...
Arguments passed to internal functions.
formula
A formula, specified in the form pvalue ~ covariate (only a one-dimensional covariate is supported).
data
data.frame from which the variables in formula should be taken.
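
The two interfaces listed under Usage can be tried on simulated data along the following lines. The simulation (10% alternatives whose effect size equals the covariate) is an assumption made for this example, and the calls are guarded so that the block also runs when the IHW package is not installed. The accessor rejections() is assumed to be available from the package for counting rejected hypotheses.

```r
# Illustrative sketch with simulated data; the simulation design is arbitrary.
set.seed(1)
m <- 10000
covariate <- runif(m, min = 0, max = 3)
# About 10% of the hypotheses are true alternatives.
is_alt <- rbinom(m, size = 1, prob = 0.1) == 1
# Under the null, z is standard normal and independent of the covariate.
z <- rnorm(m, mean = ifelse(is_alt, covariate, 0))
pvalue <- 1 - pnorm(z)
sim <- data.frame(pvalue = pvalue, covariate = covariate)

if (requireNamespace("IHW", quietly = TRUE)) {
  # Default interface: vectors of p-values and covariates.
  res_default <- IHW::ihw(sim$pvalue, sim$covariate, alpha = 0.1)
  # Formula interface: variables are looked up in the data.frame.
  res_formula <- IHW::ihw(pvalue ~ covariate, data = sim, alpha = 0.1)
  # Number of rejected hypotheses.
  print(IHW::rejections(res_formula))
}
```

Both calls use the default seed, so they should split the hypotheses into folds in the same way.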