
novelist (version 1.0)

novelist.inv.cv: Optimal NOVELIST estimator of the inverse of a covariance/correlation matrix

Description

Optimal NOVELIST estimator of the inverse of a covariance/correlation matrix.

Usage

novelist.inv.cv(x, data = TRUE, is.cov = TRUE, lambda = seq(0, 1, by = 0.05), delta = NULL, CV = TRUE, CV.inv.cov = TRUE, Th.method = softt, rep.cv = 50)

Arguments

x
an $n$ by $p$ data matrix or a $p$ by $p$ sample covariance/correlation matrix, where $p$ is the dimension and $n$ is the sample size.
data
x is the data matrix if data=TRUE; x is the sample covariance/correlation matrix if data=FALSE.
is.cov
only valid when data=FALSE. x is the sample covariance matrix if is.cov=TRUE; x is the sample correlation matrix if is.cov=FALSE.
lambda
a series of thresholding levels, for example lambda=seq(0,1,by=0.05).
delta
a series of shrinkage intensities, for example delta=seq(-0.5,1.5,by=0.05).
CV
if CV=TRUE, the empirically optimal parameters ($\lambda*$, $\delta*$) are chosen by a semi-analytical method, and only lambda needs to be given; if CV=FALSE, the assigned parameters are used, and both lambda and delta must be given.
CV.inv.cov
only valid when CV=TRUE. If CV.inv.cov=TRUE, the parameters are chosen to minimize the spectral norm error of the inverse of the covariance matrix estimator; if CV.inv.cov=FALSE, they are chosen to minimize the spectral norm error of the inverse of the correlation matrix estimator.
Th.method
thresholding method. Soft thresholding is used if Th.method=softt and hard thresholding is used if Th.method=hardt; any other generalized thresholding method chosen by the user can be supplied instead.
rep.cv
number of repetitions of the cross validation.
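
As an illustration of the two thresholding rules named under Th.method, the following is a minimal base R sketch. The helper names soft_threshold and hard_threshold are hypothetical and are not the package's own softt and hardt; the sketch also assumes that only off-diagonal entries are thresholded, so the unit diagonal of a correlation matrix is preserved.

```r
# Hypothetical helpers for illustration; not the package's softt/hardt.
soft_threshold <- function(r, lambda) {
  # shrink every entry towards zero by lambda, setting small entries to zero
  out <- sign(r) * pmax(abs(r) - lambda, 0)
  diag(out) <- diag(r)  # assumption: the diagonal is left untouched
  out
}

hard_threshold <- function(r, lambda) {
  # keep entries whose absolute value exceeds lambda, zero out the rest
  out <- r * (abs(r) > lambda)
  diag(out) <- diag(r)  # assumption: the diagonal is left untouched
  out
}

# small correlation-like matrix to compare the two rules
r <- matrix(c( 1.0, 0.6, -0.2,
               0.6, 1.0,  0.3,
              -0.2, 0.3,  1.0), 3, 3)

soft_threshold(r, 0.25)
hard_threshold(r, 0.25)
```

Soft thresholding both removes small entries and shrinks the surviving ones, whereas hard thresholding leaves the surviving entries unchanged.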

Value

inv.cov.novel
NOVELIST estimator of the inverse of the covariance matrix.
inv.cor.novel
NOVELIST estimator of the inverse of the correlation matrix.
lambda.star
empirical choice of thresholding level.
delta.star
empirical choice of shrinkage intensity.
If data=TRUE, both inv.cov.novel and inv.cor.novel are computed. If data=FALSE and is.cov=TRUE, both inv.cov.novel and inv.cor.novel are computed. If data=FALSE and is.cov=FALSE, only inv.cor.novel is computed. The values lambda.star and delta.star are only computed when CV=TRUE.
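
The reason an inverse covariance estimate cannot be returned from a correlation matrix alone is that the variances are needed to rescale it. The base R sketch below checks the underlying identity on the plain sample matrices (not on NOVELIST estimates); the variable names are illustrative only.

```r
set.seed(1)

# simulated data with unequal column scales
x <- matrix(rnorm(200 * 4), 200, 4) %*% diag(c(1, 2, 0.5, 3))

s <- apply(x, 2, sd)        # sample standard deviations
inv_cor <- solve(cor(x))    # inverse of the sample correlation matrix

# Since cov = D %*% cor %*% D with D = diag(s),
# the inverse is D^{-1} %*% cor^{-1} %*% D^{-1}.
inv_cov <- diag(1 / s) %*% inv_cor %*% diag(1 / s)

# compare with inverting the sample covariance matrix directly
all.equal(inv_cov, solve(cov(x)), check.attributes = FALSE)
```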

Details

NOVELIST works in two steps. First, it shrinks the sample correlation matrix towards a thresholded target, yielding the NOVELIST correlation matrix estimator, which has two parameters: the thresholding level $\lambda$ and the shrinkage intensity $\delta$. The estimator can be computed either with assigned parameters or with empirically optimal parameters that are chosen automatically by a semi-analytical method combining Ledoit-Wolf's lemma (Ledoit and Wolf, 2003) with cross validation. For each thresholding level $\lambda$, the method applies Ledoit-Wolf's method to choose the optimal empirical shrinkage intensity $\delta*(\lambda)$. It then uses cross validation to pick the pair $(\lambda, \delta*(\lambda))$ with the smallest spectral norm error, denoted $(\lambda*, \delta*)$, and computes the final NOVELIST correlation estimator from $(\lambda*, \delta*)$. This process significantly accelerates the computation.

Second, the method obtains the corresponding NOVELIST covariance matrix estimator by applying the sample variances to the NOVELIST correlation matrix estimator. The details are explained in Huang and Fryzlewicz (2015).
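
The two steps can be sketched in base R for a fixed pair of parameters. This is a minimal illustration, not the package's implementation: it assumes the shrinkage takes the convex-combination form (1 - delta) * R + delta * T(R, lambda) with a soft-thresholded target, and the function name novelist_cor_sketch is hypothetical. The choice of parameters by Ledoit-Wolf's lemma and cross validation is not shown.

```r
# Hypothetical sketch of the NOVELIST correlation estimator for fixed
# (lambda, delta); assumes soft thresholding of the off-diagonal entries.
novelist_cor_sketch <- function(r, lambda, delta) {
  target <- sign(r) * pmax(abs(r) - lambda, 0)  # thresholded target
  diag(target) <- 1                             # keep a unit diagonal
  (1 - delta) * r + delta * target              # shrink r towards the target
}

set.seed(1)
x <- matrix(rnorm(30 * 10), 30, 10)  # n = 30 observations, p = 10 variables

r_hat <- cor(x)          # sample correlation matrix
s <- apply(x, 2, sd)     # sample standard deviations

# Step 1: correlation estimator for assigned parameters
r_novel <- novelist_cor_sketch(r_hat, lambda = 0.3, delta = 0.5)

# Step 2: apply the sample variances to obtain the covariance estimator
cov_novel <- diag(s) %*% r_novel %*% diag(s)

# the inverse is then obtained by inverting the estimator
inv_cov_novel <- solve(cov_novel)
```

With delta = 0 the sketch returns the sample correlation matrix unchanged, and with delta = 1 it returns the thresholded target.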

References

Huang, N. & Fryzlewicz, P. (2015), "NOVELIST estimator of large correlation and covariance matrices and their inverses". Preprint.

Ledoit, O. & Wolf, M. (2003), "Improved estimation of the covariance matrix of stock returns with an application to portfolio selection". Journal of Empirical Finance 10, 603-621.

See Also

novelist.cov.cv

Examples

library(novelist)

# arbitrary positive definite covariance matrix
cov.nonsparse <- function(p, rgen = rnorm, rgen.diag = rchisq, df = 5) {
  m <- matrix(rgen(p * p), p, p)
  m[upper.tri(m)] <- 0
  tmp <- m %*% t(m)
  tmp.svd <- svd(tmp)
  # replace the singular values with random positive draws
  tmp.svd$d <- rgen.diag(p, df)
  tmp.svd$u %*% diag(tmp.svd$d) %*% t(tmp.svd$u)
}

# simulate an n x p data matrix with a given covariance matrix
sim.data <- function(sc, n, rgen = rnorm) {
  l <- t(chol(sc))
  p <- dim(sc)[1]
  z <- matrix(rgen(p * n), p, n)
  t(l %*% z)
}

p <- 30
n <- 30

# named true.cov so that the variable does not mask stats::cov()
true.cov <- cov.nonsparse(p)
x <- sim.data(true.cov, n)

# input n x p data matrix and assign parameters
novelist.inv.cv(x, lambda = c(0, 0.5), delta = c(0, 0.5, 1), CV = FALSE)

# input n x p data matrix and choose the parameters by cross validation,
# minimizing the error of the inverse covariance matrix estimator
novelist.inv.cv(x, lambda = seq(0, 1, by = 0.1))

# input n x p data matrix and choose the parameters by cross validation,
# minimizing the error of the inverse correlation matrix estimator
novelist.inv.cv(x, lambda = seq(0, 1, by = 0.1), CV.inv.cov = FALSE)

# input covariance matrix and assign parameters
novelist.inv.cv(cov(x), data = FALSE, lambda = c(0, 0.5), delta = c(0, 0.5, 1), CV = FALSE)

# input correlation matrix and assign parameters
novelist.inv.cv(cor(x), data = FALSE, is.cov = FALSE, lambda = c(0, 0.5), delta = c(0, 0.5, 1), CV = FALSE)
