XMRF (version 1.0)

XMRF-package: An R Package to Fit Markov Networks to High-throughput Genomics Data

Description

An R package to learn and visualize the underlying relationships between genes from various types of high-throughput genomics data.

Details

Package: XMRF
Type: Package
Version: 1.0
Date: 2015-06-12
License: GPL-2

Technological advances have produced large amounts of high-throughput "omics" data that allow us to study the complicated interactions between genes, mutations, aberrations, and epigenetic markers. Markov Random Fields (MRFs), or Markov Networks, enable us to estimate these genomics networks via sparse, high-dimensional undirected graphical models.

Here, we provide the community with a convenient tool for learning complex genomics networks from various types of high-throughput genomics data. This package implements the recently proposed parametric family of graphical models based on node-conditional univariate exponential family distributions (Yang et al., 2012, 2013a). Specifically, the package has methods for estimating Gaussian graphical models (Meinshausen and Buhlmann, 2006), Ising models (Ravikumar et al., 2010), and the Poisson family of graphical models (Allen and Liu, 2012, 2013; Yang et al., 2013b). These models can be used to estimate genetic networks from a variety of data types:

Genomics Data Type                Data         XMRF Family
===============================   ==========   ============
RNA-Seq or miRNA-Seq              Counts       LPGM, SPGM
Microarray or Methylation array   Continuous   GGM
Mutation or CNVs                  Binary       ISM

To estimate the network structures from different types of genomics data with this package, users simply need to specify the "method" in the main function, for example XMRF(..., method="LPGM") to fit LPGM to next-generation sequencing data.
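As a rough illustration of how the data type drives the choice of "method", the sketch below guesses a family from the values in a matrix. The helper choose_xmrf_method is hypothetical and not part of XMRF; it only encodes the mapping in the table above (counts to LPGM, continuous to GGM, binary to ISM) and uses base R alone.

```r
# Hypothetical helper (not part of XMRF): suggest a value for the `method`
# argument from the kind of values found in a numeric data matrix.
choose_xmrf_method <- function(X) {
  stopifnot(is.numeric(X), !anyNA(X))
  if (all(X %in% c(0, 1))) {
    "ISM"    # binary data, e.g. mutation or CNV indicators
  } else if (all(X >= 0) && all(X == round(X))) {
    "LPGM"   # non-negative counts, e.g. RNA-Seq or miRNA-Seq
  } else {
    "GGM"    # continuous data, e.g. microarray or methylation array
  }
}

# Toy matrices standing in for the three data types
set.seed(1)
counts <- matrix(rpois(40, lambda = 5), nrow = 4)
binary <- matrix(rbinom(40, size = 1, prob = 0.3), nrow = 4)
contin <- matrix(rnorm(40), nrow = 4)

methods <- c(counts = choose_xmrf_method(counts),
             binary = choose_xmrf_method(binary),
             contin = choose_xmrf_method(contin))
print(methods)
```

The returned string could then be passed on, for example as XMRF(..., method=choose_xmrf_method(X)). In practice the choice should follow from how the data were generated rather than from inspecting values, so treat this as a convenience check only.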

In this package, we implement the neighborhood selection technique for graph estimation, solved by proximal or projected gradient descent with warm starts over the range of regularization parameters. Because this technique estimates the neighborhood of each node independently, the per-node fits can run in parallel, which speeds up computation.
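The idea can be sketched in base R for the Gaussian case. This is a minimal illustration, not the package's implementation: it assumes a lasso penalty, a plain proximal gradient (ISTA) solver with a fixed iteration count, and an "OR" rule to symmetrize the neighborhoods. The function names soft_threshold, lasso_prox, and neighborhood_select are made up for this example.

```r
# Soft-thresholding: the proximal operator of t * ||b||_1.
soft_threshold <- function(z, t) sign(z) * pmax(abs(z) - t, 0)

# Lasso regression of y on Z by proximal gradient descent (ISTA), minimizing
# (1/(2n)) * ||y - Z b||^2 + lambda * ||b||_1, starting from b0.
lasso_prox <- function(Z, y, lambda, b0, n_iter = 500) {
  n <- nrow(Z)
  # Step size 1/L, with L the Lipschitz constant of the smooth part's gradient
  ev <- eigen(crossprod(Z), symmetric = TRUE, only.values = TRUE)$values
  step <- n / max(ev)
  b <- b0
  for (k in seq_len(n_iter)) {
    grad <- -crossprod(Z, y - Z %*% b) / n
    b <- soft_threshold(b - step * grad, step * lambda)
  }
  drop(b)
}

# Neighborhood selection: regress each node on all the others, walking down a
# decreasing lambda path and reusing the previous solution as a warm start.
# X is n x p (samples in rows). Returns one adjacency matrix per lambda.
neighborhood_select <- function(X, lambdas) {
  X <- scale(X)
  p <- ncol(X)
  lambdas <- sort(lambdas, decreasing = TRUE)
  nets <- lapply(lambdas, function(l) matrix(0, p, p))
  for (j in seq_len(p)) {            # nodes are independent: parallelizable
    b <- rep(0, p - 1)
    for (i in seq_along(lambdas)) {
      b <- lasso_prox(X[, -j, drop = FALSE], X[, j], lambdas[i], b0 = b)
      nets[[i]][j, -j] <- as.numeric(b != 0)
    }
  }
  # Keep an edge if either endpoint selected the other ("OR" rule)
  lapply(nets, function(A) pmax(A, t(A)))
}

# Toy data: a chain x1 - x2 - x3 plus an unrelated node x4
set.seed(42)
n <- 500
x1 <- rnorm(n)
x2 <- 0.8 * x1 + 0.6 * rnorm(n)
x3 <- 0.8 * x2 + 0.6 * rnorm(n)
x4 <- rnorm(n)
nets <- neighborhood_select(cbind(x1, x2, x3, x4), lambdas = c(1.0, 0.3))
print(nets[[2]])   # adjacency matrix at lambda = 0.3
```

The outer loop over nodes shares no state between iterations, which is what makes the per-node fits easy to distribute across workers. For count or binary data the squared-error loss would be replaced by the corresponding Poisson or logistic loss.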

This package also implements two data-driven approaches to select the sparsity of a fitted network: stability selection, which assesses edge stability at a single regularization value over many bootstrap resamples (Meinshausen and Buhlmann, 2010), and StARS, which computes stability over a range of regularization values (Liu et al., 2010).
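The StARS criterion can be sketched as follows, assuming the networks estimated on each resample are already available as 0/1 adjacency matrices. The functions make_net, edge_instability, and select_lambda are illustrative names, the threshold beta = 0.05 is an assumption for this example, and the hand-built toy networks stand in for real resampled fits.

```r
# Build a symmetric p x p adjacency matrix from a list of edges (toy helper).
make_net <- function(p, edges) {
  A <- matrix(0, p, p)
  for (e in edges) { A[e[1], e[2]] <- 1; A[e[2], e[1]] <- 1 }
  A
}

# Instability at one regularization value. `nets` is a list of adjacency
# matrices, one per resample. For each edge, theta is its selection frequency
# and 2 * theta * (1 - theta) measures how often two resamples disagree.
edge_instability <- function(nets) {
  theta <- Reduce(`+`, nets) / length(nets)
  xi <- 2 * theta * (1 - theta)
  mean(xi[upper.tri(xi)])            # average over all node pairs
}

# Walk from the largest lambda (sparsest fit) downward and return the smallest
# lambda whose running-maximum instability is still at most beta.
select_lambda <- function(nets_by_lambda, lambdas, beta = 0.05) {
  ord <- order(lambdas, decreasing = TRUE)
  instab <- cummax(vapply(nets_by_lambda[ord], edge_instability, numeric(1)))
  ok <- which(instab <= beta)
  if (length(ok) == 0) return(NA_real_)
  lambdas[ord][max(ok)]
}

# Toy input: 3 nodes, 4 resamples at each of 3 lambda values
lambdas <- c(1.0, 0.5, 0.1)
nets_by_lambda <- list(
  rep(list(make_net(3, list())), 4),                      # always empty
  rep(list(make_net(3, list(c(1, 2)))), 4),               # edge 1-2 every time
  c(rep(list(make_net(3, list(c(1, 2)))), 2),             # edge 2-3 appears in
    rep(list(make_net(3, list(c(1, 2), c(2, 3)))), 2))    # half the resamples
)
instab <- vapply(nets_by_lambda, edge_instability, numeric(1))
chosen <- select_lambda(nets_by_lambda, lambdas)
print(instab)
print(chosen)
```

The selection frequencies theta computed inside edge_instability are also the quantity used by the single-lambda stability approach, where edges are kept if their frequency exceeds a chosen threshold.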

References

Allen, G.I., and Liu, Z. (2012). A Log-Linear graphical model for inferring genetic networks from high-throughput sequencing data. The IEEE International Conference on Bioinformatics and Biomedicine (BIBM 2012).

Allen, G.I., and Liu, Z. (2013). A Local Poisson Graphical Model for Inferring Genetic Networks from Next Generation Sequencing Data. IEEE Transactions on NanoBioscience, 12(3), pp.1-10.

Liu, H., Roeder, K., and Wasserman, L. (2010). Stability approach to regularization selection (stars) for high dimensional graphical models. NIPS 23, pp.1432-1440.

Meinshausen, N. and Buhlmann, P. (2006). High-dimensional graphs and variable selection with the lasso. The Annals of Statistics, 34(3), pp.1436-1462.

Meinshausen, N. and Buhlmann, P. (2010). Stability selection. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 72(4), pp.417-473.

Ravikumar, P., Wainwright, M., and Lafferty, J. (2010). High-dimensional Ising model selection using l1-regularized logistic regression. The Annals of Statistics, 38(3), pp.1287-1319.

Yang, E., Ravikumar, P.K., Allen, G.I., and Liu, Z. (2012). Graphical models via generalized linear models. NIPS, 25, pp.1367-1375.

Yang, E., Ravikumar, P.K., Allen, G.I., and Liu, Z. (2013a). On graphical models via univariate exponential family distributions. arXiv preprint arXiv:1301.4183.

Yang, E., Ravikumar, P.K., Allen, G.I., and Liu, Z. (2013b). On Poisson graphical models. NIPS, pp.1718-1726.

See Also

XMRF

Examples

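	library(XMRF)

	# Simulate count data from an LPGM over a scale-free graph
	n <- 100   # number of samples
	p <- 20    # number of variables (nodes)
	sim <- XMRF.Sim(n=n, p=p, model="LPGM", graph.type="scale-free")
	simDat <- sim$X

	# Set a single regularization value as a fraction of the maximum lambda
	lmax <- lambdaMax(t(simDat))
	lambda <- 0.01 * sqrt(log(p)/n) * lmax

	# Fit LPGM at that lambda, using N=10 stability iterations
	lpgm.fit <- XMRF(simDat, method="LPGM", N=10, lambda.path=lambda, parallel=FALSE)

	# Print the fitted Markov network
	lpgm.fit

	# Plot the true network, then the fitted network in the same layout
	ml <- plotNet(sim$B)
	ml <- plot(lpgm.fit, mylayout=ml)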