
equSA (version 1.2.1)

GGMM: Learning high-dimensional Gaussian Graphical Models with Heterogeneous Data.

Description

Gaussian Graphical Mixture Models for learning a single high-dimensional network structure from a heterogeneous dataset.

Usage

GGMM(data, A, M, alpha1 = 0.1, alpha2 = 0.05, alpha3 = 0.05, iteration = 30, warm = 20)

Arguments

data

An \(n \times p\) dataset drawn from a mixture of Gaussian distributions.

A

The \(p \times p\) true adjacency matrix, used for evaluating the performance of the method.

M

The number of heterogeneous groups.

alpha1

The significance level for correlation screening in the \(\psi\)-learning algorithm; see the R package equSA for details. A high significance level of correlation screening leads to a slightly larger separator set, which reduces the risk of missing important variables in the conditioning set. Including a few false variables in the conditioning set does not hurt the accuracy of the \(\psi\)-partial correlation coefficient much. The default value is 0.1.

alpha2

The significance level of \(\psi\)-partial correlation coefficient screening for estimating the adjacency matrix; see the R package equSA for details. The default value is 0.05.

alpha3

The significance level of integrative \(\psi\)-partial correlation coefficient screening for estimating the adjacency matrix in the GGMM method. The default value is 0.05.

iteration

The total number of iterations. The default value is 30.

warm

The number of burn-in iterations. The default value is 20.

Value

RecPre

The recall and precision values of the proposed method.

Adj

The \(p \times p\) estimated adjacency matrix.

label

The estimated group indices for each observation.

BIC

The BIC scores for determining the number of groups \(M\).
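
For example, the returned BIC can be used to compare fits with different numbers of groups. The sketch below is illustrative only and assumes that the BIC component is a single numeric score per fit; the candidate \(M\) with the smallest BIC would then be preferred.

library(equSA)
## Simulate heterogeneous data, then fit GGMM for several candidate group numbers.
sim <- SimHetDat(n = 100, p = 200, M = 3, mu = 0.5, type = "band")
candidates <- 2:4
bic <- sapply(candidates, function(m)
  GGMM(sim$data, sim$A, M = m, iteration = 30, warm = 20)$BIC)
## Choose the number of groups with the smallest BIC (assumes $BIC is one score per fit).
M_hat <- candidates[which.min(bic)]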

References

Liang, F., Song, Q. and Qiu, P. (2015). An Equivalent Measure of Partial Correlation Coefficients for High Dimensional Gaussian Graphical Models. J. Amer. Statist. Assoc., 110, 1248-1265.

Liang, F. and Zhang, J. (2008) Estimating FDR under general dependence using stochastic approximation. Biometrika, 95(4), 961-977.

Liang, F., Jia, B., Xue, J., Li, Q., and Luo, Y. (2018). An Imputation Regularized Optimization Algorithm for High-Dimensional Missing Data Problems and Beyond. Submitted to Journal of the Royal Statistical Society Series B.

Jia, B. and Liang, F. (2018). Learning Gene Regulatory Networks with High-Dimensional Heterogeneous Data. Accepted by ICSA Springer Book.

Examples

# NOT RUN {
library(equSA)
## Simulate heterogeneous data from M = 3 groups with a band-type true network structure.
result <- SimHetDat(n = 100, p = 200, M = 3, mu = 0.5, type = "band")
## Estimate a single network structure from the heterogeneous data.
Est <- GGMM(result$data, result$A, M = 3, iteration = 30, warm = 20)
## Plot the network given by the estimated adjacency matrix.
plotGraph(Est$Adj)
## Plot the Recall-Precision curve.
plot(Est$RecPre[,1], Est$RecPre[,2], type="l", xlab="Recall", ylab="Precision")
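## Optionally, tabulate the estimated group indices (Est$label, see Value above)
## as a quick check of the recovered group structure.
table(Est$label)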
# }
