Package: XMRF
Type: Package
Version: 1.0
Date: 2015-06-12
License: GPL-2
Technological advances have produced large amounts of high-throughput "omics" data that allow us to study the complicated interactions between genes, mutations, aberrations, and epigenetic markers. Markov Random Fields (MRFs), or Markov Networks, enable us to estimate these genomic networks via sparse, high-dimensional undirected graphical models.
Here, we provide the community with a convenient tool for learning complex genomic networks from various types of high-throughput genomics data. This package implements the recently proposed parametric family of graphical models based on node-conditional univariate exponential family distributions (Yang et al., 2012, 2013a). Specifically, our package has methods for estimating Gaussian graphical models (Meinshausen and Buhlmann, 2006), Ising models (Ravikumar et al., 2010), and the Poisson family of graphical models (Allen and Liu, 2012, 2013; Yang et al., 2013b). These models can be used to estimate genetic networks from a variety of data types:
Genomics Data | Type | XMRF Family |
=============================== | ========== | ============ |
RNA-Seq or miRNA-Seq | Counts | LPGM, SPGM |
Microarray or Methylation array | Continuous | GGM |
Mutation or CNVs | Binary | ISM |
To estimate network structures from different types of genomics data with this package, users simply need to specify the "method" argument in the main function, for example XMRF(..., method="LPGM") to fit LPGM to next-generation sequencing data.
In this package, we implement the neighborhood selection approach to graph estimation, fitted by proximal or projected gradient descent with warm starts over the range of regularization parameters. Because this technique estimates the neighborhood of each node independently, the node-wise fits can be run in parallel, which speeds up computation.
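To make the idea concrete, the following base R sketch runs neighborhood selection for a single node in the Gaussian case, using proximal gradient descent (ISTA) on a lasso problem and warm starts along a decreasing sequence of regularization values. It is an illustration under stated assumptions, not the package's implementation: the function names soft_threshold, prox_grad_lasso and neighborhood_path are hypothetical, and samples are stored in rows here.

```r
# Soft-thresholding operator: the proximal map of the l1 penalty.
soft_threshold <- function(z, t) sign(z) * pmax(abs(z) - t, 0)

# Proximal gradient descent (ISTA) for the lasso problem
#   minimize (1 / (2n)) * ||y - Z b||^2 + lambda * ||b||_1
# b0 is the starting point, which is what allows warm starts.
prox_grad_lasso <- function(Z, y, lambda, b0, max_iter = 500, tol = 1e-8) {
  n <- nrow(Z)
  # Step size 1/L, with L the Lipschitz constant of the smooth part's gradient.
  ev <- eigen(crossprod(Z), symmetric = TRUE, only.values = TRUE)$values
  step <- n / max(ev)
  b <- b0
  for (iter in seq_len(max_iter)) {
    grad <- drop(-crossprod(Z, y - Z %*% b)) / n
    b_new <- soft_threshold(b - step * grad, step * lambda)
    converged <- max(abs(b_new - b)) < tol
    b <- b_new
    if (converged) break
  }
  b
}

# Neighborhood of one node over a decreasing lambda sequence.
# Each fit starts from the previous solution (warm start).
neighborhood_path <- function(X, node, lambdas) {
  y <- X[, node]
  Z <- X[, -node, drop = FALSE]
  b <- rep(0, ncol(Z))
  path <- matrix(0, nrow = ncol(Z), ncol = length(lambdas))
  for (k in seq_along(lambdas)) {
    b <- prox_grad_lasso(Z, y, lambdas[k], b0 = b)
    path[, k] <- b
  }
  path
}

# Toy data (samples in rows): node 2 depends strongly on node 1.
set.seed(1)
n <- 200
p <- 5
X <- matrix(rnorm(n * p), n, p)
X[, 2] <- X[, 1] + 0.1 * rnorm(n)
X <- scale(X)

# Smallest lambda at which the neighborhood of node 1 is empty.
lambda_max <- max(abs(crossprod(X[, -1], X[, 1]))) / n
lambdas <- lambda_max * c(1, 0.5, 0.1)
path <- neighborhood_path(X, node = 1, lambdas = lambdas)
```

Each column of path holds the coefficients of node 1 regressed on the remaining nodes at one regularization value, and nonzero entries mark the estimated neighbors. Since every node has its own independent problem of this form, the loop over nodes is the natural place to parallelize.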
This package also implements two data-driven approaches to select the sparsity of a fitted network: a stability-based approach that evaluates a single regularization value over many bootstrap resamples (Meinshausen and Buhlmann, 2010), and StARS, which computes stability over a range of regularization values (Liu et al., 2010).
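The stability idea behind both approaches can be sketched in base R as follows. This is a simplified illustration, not the package's code: estimate_graph is a hypothetical stand-in estimator (thresholded absolute correlation) used only so the example is self-contained, the other function names are also hypothetical, subsampling without replacement is used in place of the bootstrap, and the instability cutoff beta = 0.05 is an assumption of this sketch.

```r
# Stand-in graph estimator (NOT the XMRF fitting routine): connect two nodes
# when their absolute sample correlation exceeds the threshold `lambda`.
estimate_graph <- function(X, lambda) {
  A <- abs(cor(X)) > lambda
  diag(A) <- FALSE
  A
}

# Edge selection frequencies over B random subsamples of the rows of X.
edge_frequency <- function(X, lambda, B = 50, frac = 0.8) {
  n <- nrow(X)
  m <- floor(frac * n)
  freq <- matrix(0, ncol(X), ncol(X))
  for (b in seq_len(B)) {
    idx <- sample.int(n, m)  # subsample without replacement
    freq <- freq + estimate_graph(X[idx, , drop = FALSE], lambda)
  }
  freq / B
}

# StARS-style instability: average of 2 * theta * (1 - theta) over all edges,
# where theta is the selection frequency of an edge.
instability <- function(freq) {
  theta <- freq[upper.tri(freq)]
  mean(2 * theta * (1 - theta))
}

# Toy data (samples in rows) with one true edge, between nodes 1 and 2.
set.seed(1)
n <- 200
p <- 6
X <- matrix(rnorm(n * p), n, p)
X[, 2] <- X[, 1] + 0.3 * rnorm(n)

# Regularization values ordered from sparse to dense graphs.
lambdas <- c(0.9, 0.5, 0.3, 0.1)
D <- sapply(lambdas, function(l) instability(edge_frequency(X, l)))

# Monotonize the instability, then keep the densest graph that is still stable.
D_bar <- cummax(D)
beta <- 0.05
lambda_hat <- lambdas[max(which(D_bar <= beta))]
```

With a single regularization value, the matrix returned by edge_frequency is the quantity a stability-based rule would threshold to keep only frequently selected edges. With a range of values, the monotonized instability curve D_bar picks the least regularized graph whose edges are still reproducible across subsamples.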
Allen, G. I., and Liu, Z. (2013). A Local Poisson Graphical Model for Inferring Genetic Networks from Next Generation Sequencing Data. IEEE Transactions on NanoBioscience, 12(3), pp.1-10
Liu, H., Roeder, K., and Wasserman, L. (2010). Stability approach to regularization selection (stars) for high dimensional graphical models. NIPS 23, pp.1432-1440.
Meinshausen, N. and Buhlmann, P. (2006). High-dimensional graphs and variable selection with the lasso. The Annals of Statistics, 34(3), pp.1436-1462.
Meinshausen, N. and Buhlmann, P. (2010). Stability selection. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 72(4), pp.417-473.
Ravikumar, P., Wainwright, M., and Lafferty, J. (2010). High-dimensional Ising model selection using l1-regularized logistic regression. The Annals of Statistics, 38(3), pp.1287-1319.
Yang, E., Ravikumar, P.K., Allen, G.I., and Liu, Z. (2012). Graphical models via generalized linear models. NIPS, 25, pp.1367-1375.
Yang, E., Ravikumar, P.K., Allen, G.I., and Liu, Z. (2013a). On graphical models via univariate exponential family distributions. arXiv preprint arXiv:1301.4183.
Yang, E., Ravikumar, P.K., Allen, G.I., and Liu, Z. (2013b). On Poisson graphical models. NIPS, pp.1718-1726.
XMRF
library(XMRF)
n = 100
p = 20
sim <- XMRF.Sim(n=n, p=p, model="LPGM", graph.type="scale-free")
simDat <- sim$X
# Compute the regularization value lambda, scaled down from the maximum lambda
lmax = lambdaMax(t(simDat))
lambda = 0.01* sqrt(log(p)/n) * lmax
# Run LPGM
lpgm.fit <- XMRF(simDat, method="LPGM", N=10, lambda.path=lambda, parallel=FALSE)
# Print the fitted Markov networks
lpgm.fit
# Plot the true network, then the fitted network using the same layout
ml = plotNet(sim$B)
ml = plot(lpgm.fit, mylayout=ml)