cv.thresPPP: CV for Bayesian Estimation of a Sparse Covariance Matrix

Description

Performs cross-validation to estimate spectral norm error for a post-processed posterior of a sparse covariance matrix.

Usage

cv.thresPPP(
  X,
  thresvec,
  epsvec,
  prior = NULL,
  thresfun = "hard",
  nsample = 2000,
  ncores = 2
)

Value

CVdf: an M $\times$ 3 data frame containing the estimated spectral norm error for each combination of thres and eps, where M = length(thresvec) * length(epsvec).

Arguments

X: an n $\times$ p data matrix whose columns have mean zero.
thresvec: a vector of real numbers specifying the parameter of the threshold function.
epsvec: a vector of small positive numbers decreasing to $0$.
prior: a list giving the prior information. The list includes the following parameters (with default values in parentheses): A (I), the positive definite scale matrix of the inverse-Wishart prior, and nu (p + k), the degrees of freedom of the inverse-Wishart prior.
thresfun: a string specifying the thresholding function ('hard' or 'soft') used in the thresholding PPP procedure. The default is 'hard'.
nsample: a scalar value giving the number of post-processed posterior samples.
ncores: a scalar value giving the number of CPU cores to use.
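
To make the 'hard' and 'soft' options of thresfun concrete, the sketch below writes out the textbook element-wise definitions of the two thresholding rules. The helper names hard_thres and soft_thres are hypothetical and are not part of bspcov; whether the package treats diagonal entries or boundary cases in exactly this way is an assumption.

```r
# Textbook element-wise thresholding rules (illustrative; not bspcov internals).
# hard_thres and soft_thres are hypothetical helper names.

# Hard thresholding: keep entries with |x| >= gamma, set the rest to zero.
hard_thres <- function(x, gamma) {
  x * (abs(x) >= gamma)
}

# Soft thresholding: shrink every entry toward zero by gamma, truncating at zero.
soft_thres <- function(x, gamma) {
  sign(x) * pmax(abs(x) - gamma, 0)
}

x <- c(-0.3, 0.05, 0.2)
hard_out <- hard_thres(x, 0.1)
soft_out <- soft_thres(x, 0.1)
```

Hard thresholding leaves the surviving entries unchanged, whereas soft thresholding also shrinks them, so the two rules can prefer different values in thresvec.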

Author

Kwangmin Lee

Details

Given a split of the data into training and validation sets, the spectral norm error for each pair $(\gamma, \epsilon_n)$ is estimated as

$$ ||\hat{\Sigma}(\gamma,\epsilon_n)^{(train)} - S^{(val)}||_2 $$

where $\hat{\Sigma}(\gamma,\epsilon_n)^{(train)}$ is the covariance estimate based on the training data and $S^{(val)}$ is the sample covariance matrix based on the validation data. The spectral norm error is estimated by $10$-fold cross-validation. For more details, see the first paragraph on page 9 in Lee and Lee (2023).
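
The cross-validation criterion above can be sketched in base R. This is only an illustration under stated assumptions: a hard-thresholded training sample covariance stands in for the post-processed posterior estimate $\hat{\Sigma}(\gamma,\epsilon_n)^{(train)}$, the role of $\epsilon_n$ is omitted, and the fold errors are combined by a simple average. The helpers hard_threshold_cov and cv_spectral_error are hypothetical names and are not part of bspcov.

```r
# Illustrative sketch of K-fold CV for the spectral norm error.
# Assumption: a hard-thresholded sample covariance replaces the PPP estimate;
# this is NOT the bspcov implementation and eps is not modelled here.
set.seed(1)
n <- 40
p <- 10
X <- matrix(rnorm(n * p), n, p)   # rows are observations, columns have mean zero

# Hypothetical helper: threshold off-diagonal entries, keep the variances.
hard_threshold_cov <- function(S, gamma) {
  out <- S * (abs(S) >= gamma)
  diag(out) <- diag(S)
  out
}

# Hypothetical helper: average spectral norm error over K folds.
cv_spectral_error <- function(X, gamma, K = 10) {
  fold <- sample(rep(seq_len(K), length.out = nrow(X)))  # random fold labels
  errs <- vapply(seq_len(K), function(k) {
    train <- X[fold != k, , drop = FALSE]
    val <- X[fold == k, , drop = FALSE]
    S_train <- crossprod(train) / nrow(train)  # sample covariance (mean zero)
    S_val <- crossprod(val) / nrow(val)
    # type = "2" gives the spectral norm (largest singular value)
    norm(hard_threshold_cov(S_train, gamma) - S_val, type = "2")
  }, numeric(1))
  mean(errs)
}

gammas <- c(0.01, 0.1, 0.5)
cv_err <- vapply(gammas, function(g) cv_spectral_error(X, g), numeric(1))
cv_table <- data.frame(gamma = gammas, error = cv_err)
```

The threshold with the smallest entry in the error column would be the natural choice under this criterion. In cv.thresPPP the same comparison runs over every pair in thresvec and epsvec.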

References

Lee, K. and Lee, J. (2023), "Post-processed posteriors for sparse covariances", Journal of Econometrics, 236(3), 105475.

Examples



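# Simulate n = 25 observations from a 50-dimensional normal with identity covariance
Sigma0 <- diag(1, 50)
X <- mvtnorm::rmvnorm(25, sigma = Sigma0)
# Candidate grids for the threshold and eps parameters
thresvec <- c(0.01, 0.1)
epsvec <- c(0.01, 0.1)
# Cross-validate over the grid, using few posterior samples to keep the run short
res <- bspcov::cv.thresPPP(X, thresvec, epsvec, nsample = 100)
plot(res)

# Reset the future plan so that no parallel workers are left open
if (!inherits(future::plan(), "sequential")) future::plan(future::sequential)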
