cglasso (version 1.1.2)

mglasso: Graphical Lasso Estimator with Missing-at-Random Data

Description

The ‘mglasso’ function is used to fit an l1-penalized Gaussian graphical model with missing-at-random data.

Usage

mglasso(X, weights, pendiag = FALSE, nrho = 50L, rho.min.ratio, rho,
        maxR2, maxit_em = 1.0e+3, thr_em = 1.0e-4, maxit_bcd = 1.0e+4,
        thr_bcd = 1.0e-4, trace = 0L)

Arguments

X

the \((n\times p)\)-dimensional matrix used to fit the model.

weights

an optional symmetric matrix of non-negative weights. This matrix can be used to specify the unpenalized partial correlation coefficients (‘weights[i, j] = 0’) or the structural zeros in the precision matrix (‘weights[i, j] = +Inf’). See below for an example. By default, the mglasso model is fitted without weights.

pendiag

flag used to specify if the diagonal elements of the concentration matrix are penalized (‘pendiag = TRUE’) or unpenalized (‘pendiag = FALSE’).

nrho

the integer specifying the number of tuning parameters used to fit the mglasso model. Default is ‘nrho = 50’.

rho.min.ratio

the smallest value for the tuning parameter \(\rho\), as a fraction of the smallest tuning parameter for which all the estimated partial correlation coefficients are zero. The default depends on the sample size ‘\(n\)’ relative to the number of variables ‘\(p\)’: if ‘\(p < n\)’, the default is ‘1.0E-4’; otherwise ‘1.0E-2’ is used. A very small value of ‘rho.min.ratio’ will lead to a saturated fitted model in the ‘\(p < n\)’ case.

rho

optional argument. A user-supplied decreasing sequence of \(\rho\)-values. WARNING: avoid supplying a single value for the tuning parameter; supply a decreasing sequence instead.

maxR2

a value in the interval \([0, 1]\) specifying the largest value of the pseudo R-squared measure (see Section Details). The regularization path is stopped when \(R^2\) exceeds ‘maxR2’. The default depends on the sample size ‘\(n\)’ relative to the number of variables ‘\(p\)’: if ‘\(p < n\)’, the default is ‘1’; otherwise ‘0.9’ is used.

maxit_em

maximum number of iterations of the EM algorithm. Default is 1.0E+3.

thr_em

threshold for the convergence of the EM algorithm. Default value is 1.0E-4.

maxit_bcd

maximum number of iterations of the glasso algorithm. Default is 1.0E+4.

thr_bcd

threshold for the convergence of the glasso algorithm. Default is 1.0E-4.

trace

integer controlling how much information is printed as the iterations proceed: ‘trace = 0’, no information is printed to the console; ‘trace = 1’, basic information is printed; ‘trace = 2’, detailed information is printed.
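As an illustration of the ‘rho’ argument, a decreasing sequence of tuning-parameter values can be supplied as follows (the values below are arbitrary, and ‘X’ stands for a data matrix as in the Usage section):

# hypothetical decreasing sequence of tuning parameters
rho.seq <- exp(seq(from = log(1), to = log(0.01), length.out = 25L))
out <- mglasso(X = X, rho = rho.seq)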

Value

mglasso returns an object with S3 class “mglasso”, i.e., a list containing the following components:

call

the call that produced this object.

X

the original matrix used to fit the missglasso model.

weights

the weights used to fit the missglasso model.

pendiag

the flag specifying if the diagonal elements of the precision matrix are penalized.

nrho

the number of fitted missglasso models.

rho.min.ratio

the scale factor used to compute the smallest value of the tuning parameter.

rho

the \(\texttt{nrho}\)-dimensional vector reporting the values of the tuning parameter used to fit the missglasso model.

maxR2

the threshold value used to stop the regularization path.

maxit_em

the maximum number of iterations of the EM algorithm.

thr_em

the threshold for the convergence of the EM algorithm.

maxit_bcd

the maximum number of iterations of the glasso algorithm.

thr_bcd

the threshold for the convergence of the glasso algorithm.

Xipt

an array of dimension \(n\times p\times\texttt{nrho}\). Xipt[, , k] is the matrix in which the missing values are replaced with the conditional expected values computed in the E-step of the algorithm described in section Details.

S

an array of dimension \(p\times p\times\texttt{nrho}\). S[, , k] is the matrix \(S^\rho\) used to fit the glasso model in the M-step of the algorithm described in section Details.

mu

a matrix of dimension \(p\times\texttt{nrho}\). The \(k\)th column is the estimate of the expected values of the missglasso model fitted using rho[k].

Sgm

an array of dimension \(p\times p\times\texttt{nrho}\). Sgm[, , k] is the estimate of the covariance matrix of the missglasso model fitted using rho[k].

Tht

an array of dimension \(p\times p\times\texttt{nrho}\). Tht[, , k] is the estimate of the precision matrix of the missglasso model fitted using rho[k].

Adj

an array of dimension \(p\times p\times\texttt{nrho}\). Adj[, , k] is the adjacency matrix associated with Tht[, , k], i.e. Adj[i, j, k] \(= 1\) if Tht[i, j, k] \(\neq 0\), and 0 otherwise.

df

the \(nrho\)-dimensional vector reporting the number of non-zero partial correlation coefficients.

R2

the \(nrho\)-dimensional vector reporting the values of the measure \(R^2\) described in section Details.

ncomp

the \(nrho\)-dimensional vector reporting the number of connected components (for internal purposes only).

Ck

the \((p\times\texttt{nrho})\)-dimensional matrix encoding the connected components (for internal purposes only).

pk

the \((p\times\texttt{nrho})\)-dimensional matrix reporting the number of vertices per connected component (for internal purposes only).

nit

the \((\texttt{nrho}\times 2)\)-dimensional matrix reporting the number of iterations.

conv

a description of the error that has occurred.

subrout

the name of the Fortran subroutine where the error has occurred (for internal debug only).

trace

the integer used for printing out information.

Details

The missglasso estimator (Städler and Bühlmann, 2012) is an extension of the classical graphical lasso (glasso) estimator (Yuan and Lin, 2007), developed to fit a sparse Gaussian graphical model under the assumption that data are missing at random.

The mglasso function fits the model using the following EM algorithm:

Step 1. Let \(\{\hat{\mu}^\rho_{ini};\hat{\Theta}^\rho_{ini}\}\) be initial estimates;
Step 2. (E-step) use the expected values of the conditional normal distribution to impute the missing data; let \(X^{\rho}\) be the completed data and \(S^{\rho}\) the corresponding empirical covariance matrix;
Step 3. (M-step) let \(\hat{\mu}_h^\rho = \frac{1}{n}\sum_{i=1}^n x_{ih}^\rho\); compute \(\hat{\Theta}^\rho\) using \(S^{\rho}\) and the glasso algorithm (Friedman et al., 2008);
Step 4. repeat Steps 2 and 3 until a convergence criterion is met.
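The EM scheme above can be sketched in R as follows. This is an illustrative toy, not the package's internal implementation: it relies on the ‘glasso’ package for the M-step and, for brevity, omits the conditional-variance correction to \(S^\rho\) used by the full algorithm of Städler and Bühlmann (2012).

em_sketch <- function(X, rho, maxit = 100L, thr = 1.0e-4) {
    n <- nrow(X)
    # Step 1: initial estimates from the observed entries
    mu <- colMeans(X, na.rm = TRUE)
    Sgm <- diag(apply(X, 2L, var, na.rm = TRUE))
    for (it in seq_len(maxit)) {
        # Step 2 (E-step): impute missing entries with their
        # conditional expected values given the observed entries
        Ximp <- X
        for (i in seq_len(n)) {
            m <- is.na(X[i, ])
            if (any(m))
                Ximp[i, m] <- mu[m] + Sgm[m, !m, drop = FALSE] %*%
                    solve(Sgm[!m, !m], X[i, !m] - mu[!m])
        }
        # Step 3 (M-step): update mu and refit the glasso
        # on the completed empirical covariance matrix
        mu.new <- colMeans(Ximp)
        S <- crossprod(sweep(Ximp, 2L, mu.new)) / n
        fit <- glasso::glasso(S, rho = rho, penalize.diagonal = FALSE)
        # Step 4: stop when the parameter updates are small
        if (max(abs(mu.new - mu)) < thr && max(abs(fit$w - Sgm)) < thr)
            break
        mu <- mu.new
        Sgm <- fit$w
    }
    list(mu = mu, Sgm = fit$w, Tht = fit$wi, S = S)
}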

To avoid overfitting, the following pseudo R-squared measure is used: $$R^2 = 1 - \frac{\|S^\rho - \hat{\Sigma}^\rho\|_F}{\|S^{\rho_{\max}} - \hat{\Sigma}^{\rho_{\max}}\|_F},$$ where \(\|\cdot\|_F\) denotes the Frobenius norm and \(\rho_{\max}\) denotes the smallest value of the tuning parameter for which all the estimated partial correlation coefficients are zero. Straightforward algebra shows that the proposed pseudo R-squared belongs to the closed interval \([0, 1]\): \(R^2 = 0\) when the tuning parameter equals \(\rho_{\max}\) and \(R^2 = 1\) when \(\rho = 0\). The regularization path is stopped when \(R^2\) exceeds the threshold specified by ‘maxR2’.
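For illustration, the pseudo R-squared can be recomputed from the components of a fitted object. This is a hedged sketch that assumes the ‘rho’ sequence is decreasing, so that the first fit corresponds to \(\rho_{\max}\); ‘S’, ‘Sgm’ and ‘nrho’ are components of the object returned by mglasso.

pseudo_R2 <- function(out) {
    # denominator: Frobenius norm at rho_max (the first, largest tuning value)
    den <- norm(out$S[, , 1L] - out$Sgm[, , 1L], type = "F")
    sapply(seq_len(out$nrho), function(k)
        1 - norm(out$S[, , k] - out$Sgm[, , k], type = "F") / den)
}

By construction the first element is 0, and the values should match the ‘R2’ component of the fitted object along the computed path.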

References

Friedman, J.H., Hastie, T., and Tibshirani, R. (2008) <DOI:10.1093/biostatistics/kxm045>. Sparse inverse covariance estimation with the graphical lasso. Biostatistics 9, 432--441.

Städler, N. and Bühlmann, P. (2012) <DOI:10.1007/s11222-010-9219-7>. Missing values: sparse inverse covariance estimation and an extension to sparse regression. Statistics and Computing 22, 219--235.

Yuan, M., and Lin, Y. (2007) <DOI:10.1093/biomet/asm018>. Model selection and estimation in the Gaussian graphical model. Biometrika 94, 19--35.

See Also

glasso, to_graph, mle and the method functions summary, coef, plot, aic, bic, ebic.

Examples

library("cglasso")
set.seed(123)

p <- 5L
n <- 100L
mu <- rep(0L, p)
Tht <- diag(p)
diag(Tht[-1L, -p]) <- diag(Tht[-p, -1L]) <- 0.3
Sgm <- solve(Tht)
X <- MASS::mvrnorm(n = n, mu = mu, Sigma = Sgm)
X <- as.vector(X)
id.na <- sample.int(n = n * p, size = n * p * 0.05)
X[id.na] <- NA
dim(X) <- c(n, p)
out <- mglasso(X = X)
out

# in this example we use the argument 'weights' to specify
# the unpenalized partial correlation coefficients and the
# structural zeros in the precision matrix

w <- rep(1, p * p)
dim(w) <- c(p, p)

# specifying the unpenalized partial correlation coefficients
diag(w) <- diag(w[-1L, -p]) <- diag(w[-p, -1L]) <- 0

# specifying the structural zeros
w[1L, 4L:5L] <- w[4L:5L, 1L] <- +Inf
w[2L, 5L] <- w[5L, 2L] <- +Inf
w

out <- mglasso(X = X, weights = w)

# checking structural zeros
out$Tht[, , out$nrho][w == +Inf]

# checking stationarity conditions of the MLE estimators
# (the unpenalized partial correlation coefficients)
(out$Sgm[, , out$nrho] - out$S[, , out$nrho])[w == 0]