farm.test: Factor-adjusted robust multiple testing

Description

This function conducts factor-adjusted robust multiple testing (FarmTest) for the means of multivariate data, as proposed in Fan et al. (2019), via a tuning-free procedure.

Usage

farm.test(
  X,
  fX = NULL,
  KX = -1,
  Y = NULL,
  fY = NULL,
  KY = -1,
  h0 = NULL,
  alternative = c("two.sided", "less", "greater"),
  alpha = 0.05,
  p.method = c("bootstrap", "normal"),
  nBoot = 500
)

Arguments

X

An \(n\) by \(p\) data matrix with each row being a sample.

fX

An optional factor matrix with each column being a factor for X. The number of rows of fX and X must be the same.

KX

An optional number of factors to be estimated for X when fX is not specified. KX cannot exceed the number of columns of X. If KX is not specified or is negative, it will be estimated internally. If KX is 0, no factor adjustment is performed.

Y

An optional data matrix used for two-sample FarmTest. The number of columns of X and Y must be the same.

fY

An optional factor matrix for two-sample FarmTest with each column being a factor for Y. The number of rows of fY and Y must be the same.

KY

An optional number of factors to be estimated for Y for two-sample FarmTest when fY is not specified. KY cannot exceed the number of columns of Y. If KY is not specified or is negative, it will be estimated internally. If KY is 0, no factor adjustment is performed.

h0

An optional \(p\)-vector of true means, or of differences in means for two-sample FarmTest. The default is a zero vector.

alternative

An optional character string specifying the alternative hypothesis. It must be one of "two.sided" (default), "less" or "greater".

alpha

An optional level for controlling the false discovery rate. The value of alpha must be between 0 and 1. The default value is 0.05.

p.method

An optional character string specifying the method used to calculate p-values when fX is known or when KX = 0. The possible options are multiplier bootstrap and normal approximation; it must be one of "bootstrap" (default) or "normal".

nBoot

An optional positive integer specifying the size of the bootstrap sample, only used when p.method = "bootstrap". The default value is 500.
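
When KX or KY is left at its default, the number of factors is estimated internally. The References section lists the eigenvalue ratio method of Ahn and Horenstein (2013), and the Value section reports eigenVal, eigenRatio and nFactors. The following is a minimal base R sketch of that idea, assuming a plain sample covariance matrix and hand-picked hypothetical loadings. The package is described as robust and may estimate the covariance differently, so this is an illustration, not the package's implementation.

```r
# Simulate data with K = 2 strong factors (loadings B are hypothetical).
set.seed(1)
n <- 200
p <- 20
B <- cbind(rep(2, p), rep(c(2, -2), each = p / 2))  # p x 2, orthogonal columns
f <- matrix(rnorm(n * 2), nrow = n)                 # n x 2 latent factors
X <- f %*% t(B) + matrix(rnorm(n * p), nrow = n)    # factors plus noise

# Eigenvalues of the sample covariance matrix, in decreasing order.
eigenVal <- eigen(cov(X), symmetric = TRUE, only.values = TRUE)$values

# Ratios of consecutive eigenvalues, up to min(n, p) / 2 candidates.
kMax <- floor(min(n, p) / 2)
eigenRatio <- eigenVal[1:kMax] / eigenVal[2:(kMax + 1)]

# The estimated number of factors is the position of the largest ratio.
nFactors <- which.max(eigenRatio)
print(nFactors)
```

The largest ratio marks the gap between factor-driven eigenvalues and noise-level eigenvalues, which is why a strongly separated setup like this one is easy to recover.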

Value

An object with S3 class farm.test containing the following items will be returned:

means Estimated means, a vector with length \(p\).
stdDev Estimated standard deviations, a vector with length \(p\). Not available for the bootstrap method.
loadings Estimated factor loadings, a matrix with dimension \(p\) by \(K\), where \(K\) is the number of factors.
eigenVal Eigenvalues of the estimated covariance matrix, a vector with length \(p\). Only available when factors fX and fY are not given.
eigenRatio Ratios of consecutive eigenVal entries used to estimate nFactors, a vector with length \(\min(n, p) / 2\). Only available when the numbers of factors KX and KY are not given.
nFactors Estimated or input number of factors, a positive integer.
tStat Values of test statistics, a vector with length \(p\). Not available for the bootstrap method.
pValues P-values of tests, a vector with length \(p\).
significant Boolean values indicating whether each test is significant, with 1 for significant and 0 for non-significant, a vector with length \(p\).
reject Indices of tests that are rejected. It will show "no hypotheses rejected" if none of the tests are rejected.
type Indicator of whether factor is known or unknown.
n Sample size.
p Data dimension.
h0 Null hypothesis, a vector with length \(p\).
alpha \(\alpha\) value.
alternative Alternative hypothesis.
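
To illustrate how pValues, alpha, significant and reject relate to one another, here is a minimal base R sketch that uses the plain Benjamini and Hochberg (1995) procedure via p.adjust. The p-values are made up for illustration and are not produced by farm.test. The package may use a different (for example, adaptive) FDR procedure internally, so this shows the general mechanism only.

```r
# Hypothetical p-values standing in for output$pValues.
pValues <- c(0.001, 0.008, 0.039, 0.041, 0.20, 0.74)
alpha <- 0.05

# Benjamini-Hochberg adjusted p-values (base R).
adjusted <- p.adjust(pValues, method = "BH")

# 1 for significant, 0 for non-significant, in the spirit of `significant`.
significant <- as.integer(adjusted <= alpha)

# Indices of rejected hypotheses, in the spirit of `reject`.
reject <- which(significant == 1)
print(reject)
```

Note that the third and fourth p-values are below 0.05 on their own but are not rejected once the false discovery rate is controlled at level alpha.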

Details

For two-sample FarmTest, means, stdDev, loadings, eigenVal, eigenRatio, nFactors and n are lists containing separate items for samples X and Y.

alternative = "greater" is the alternative that \(\mu > \mu_0\) for one-sample test or \(\mu_X > \mu_Y\) for two-sample test.
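
As a rough illustration of how the direction of the alternative changes a p-value, the sketch below applies the standard normal-approximation convention to a few made-up test statistics. The tStat values are hypothetical, and the formulas are the textbook ones, assumed here to correspond to p.method = "normal" rather than confirmed from the package source.

```r
# Hypothetical test statistics standing in for output$tStat.
tStat <- c(-2.5, 0.3, 1.96)

# Two-sided: evidence in either tail counts against the null.
p_two_sided <- 2 * pnorm(-abs(tStat))

# "less": small (negative) statistics count against the null.
p_less <- pnorm(tStat)

# "greater": large (positive) statistics count against the null.
p_greater <- pnorm(tStat, lower.tail = FALSE)

print(round(cbind(tStat, p_two_sided, p_less, p_greater), 4))
```

For any statistic, the "less" and "greater" p-values sum to 1, so a coordinate that looks significant under one direction cannot be significant under the other.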

Setting p.method = "bootstrap" for the factor-known model slows down the program, but it achieves a lower empirical false discovery proportion (FDP) than setting p.method = "normal".

References

Ahn, S. C. and Horenstein, A. R. (2013). Eigenvalue ratio test for the number of factors. Econometrica, 81(3), 1203–1227.

Benjamini, Y. and Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B. Stat. Methodol., 57, 289–300.

Fan, J., Ke, Y., Sun, Q. and Zhou, W.-X. (2019). FarmTest: Factor-adjusted robust multiple testing with approximate false discovery control. J. Amer. Statist. Assoc., to appear.

Huber, P. J. (1964). Robust estimation of a location parameter. Ann. Math. Statist., 35, 73–101.

Storey, J. D. (2002). A direct approach to false discovery rates. J. R. Stat. Soc. Ser. B. Stat. Methodol., 64, 479–498.

Sun, Q., Zhou, W.-X. and Fan, J. (2019). Adaptive Huber regression. J. Amer. Statist. Assoc., to appear.

Zhou, W.-X., Bose, K., Fan, J. and Liu, H. (2018). A new perspective on robust M-estimation: Finite sample theory and applications to dependence-adjusted multiple testing. Ann. Statist., 46, 1904–1931.

Examples


# NOT RUN {
n = 20
p = 50
K = 3
muX = rep(0, p)
muX[1:5] = 2
set.seed(2019)
epsilonX = matrix(rnorm(p * n, 0, 1), nrow = n)
BX = matrix(runif(p * K, -2, 2), nrow = p)
fX = matrix(rnorm(K * n, 0, 1), nrow = n)
X = rep(1, n) %*% t(muX) + fX %*% t(BX) + epsilonX
# One-sample FarmTest with two sided alternative
output = farm.test(X)
# One-sample FarmTest with one sided alternative
output = farm.test(X, alternative = "less")
# One-sample FarmTest with known factors
output = farm.test(X, fX = fX)

# Two-sample FarmTest
muY = rep(0, p)
muY[1:5] = 4
epsilonY = matrix(rnorm(p * n, 0, 1), nrow = n)
BY = matrix(runif(p * K, -2, 2), nrow = p)
fY = matrix(rnorm(K * n, 0, 1), nrow = n)
Y = rep(1, n) %*% t(muY) + fY %*% t(BY) + epsilonY
output = farm.test(X, Y = Y)
# }
