epiGWAS (version 1.0.2)

fast_HMM: Fits a HMM to a genotype dataset by calling fastPHASE

Description

Fits the fastPHASE hidden Markov model (HMM) to a genotype dataset using the EM algorithm. The fastPHASE executable is required to run fast_HMM; it can be downloaded from http://scheet.org/software.html

Usage

fast_HMM(X, out_path = NULL, X_filename = NULL,
  fp_path = "bin/fastPHASE", n_state = 12, n_iter = 25)

Arguments

X

genotype matrix

out_path

prefix for the filenames of the fitted parameters. If NULL, the files are saved in a temporary directory.

X_filename

filename for the fastPHASE-formatted genotype file. If NULL, the file is created in a temporary directory.

fp_path

path to the fastPHASE executable

n_state

dimensionality of the latent space

n_iter

number of iterations for the EM algorithm

Value

Fitted parameters of the fastPHASE HMM, grouped in a list with the following fields: pInit, the initial marginal distribution; Q, a three-dimensional array of transition probabilities; and pEmit, a three-dimensional array of emission probabilities.
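As a quick sanity check on the returned list, the sketch below verifies only what is stated above: the three field names, that Q and pEmit are three-dimensional arrays, and that every entry is a valid probability. The helper name check_hmm_fields and the mock object are hypothetical and not part of epiGWAS; the mock dimensions are arbitrary and do not reflect the layout that fast_HMM actually uses.

```r
# Hypothetical helper (not part of epiGWAS): checks the documented fields only.
check_hmm_fields <- function(hmm) {
  expected <- c("pInit", "Q", "pEmit")
  missing_fields <- setdiff(expected, names(hmm))
  if (length(missing_fields) > 0) {
    stop("missing field(s): ", paste(missing_fields, collapse = ", "))
  }
  # The help page states that Q and pEmit are three-dimensional arrays.
  stopifnot(length(dim(hmm$Q)) == 3, length(dim(hmm$pEmit)) == 3)
  # Every entry should be a finite probability in [0, 1].
  in_unit <- function(x) all(is.finite(x)) && all(x >= 0 & x <= 1)
  stopifnot(in_unit(hmm$pInit), in_unit(hmm$Q), in_unit(hmm$pEmit))
  invisible(TRUE)
}

# Mock object with arbitrary dimensions, used only to exercise the helper.
mock <- list(
  pInit = rep(1 / 4, 4),
  Q     = array(1 / 4, dim = c(2, 4, 4)),
  pEmit = array(1 / 3, dim = c(3, 3, 4))
)
check_hmm_fields(mock)
```

With a real fit, the same call would be check_hmm_fields(hmm).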

Details

Because the cost of the forward algorithm grows quadratically with the dimensionality of the latent space, n_state, we recommend setting this parameter to 12. Larger values do not yield a dramatic improvement in performance. For the EM algorithm, a good choice for n_iter is between 20 and 25 iterations.
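To illustrate where the quadratic cost comes from, here is a minimal sketch of a scaled forward recursion for a generic discrete HMM. It is not the fastPHASE implementation: the function name forward_loglik and its arguments are hypothetical, and it assumes a single transition matrix shared by all sites, whereas the fitted Q above is a three-dimensional array. Each step multiplies a length-K vector by a K x K matrix, so doubling n_state from 12 to 24 raises the per-site work from 144 to 576 multiply-adds.

```r
# Scaled forward recursion for a generic discrete HMM (illustrative only).
# init:  length-K vector of initial state probabilities
# trans: K x K matrix, trans[i, j] = P(state j at t + 1 | state i at t)
# emit:  K x M matrix, emit[k, m] = P(symbol m | state k)
# obs:   integer vector of observed symbols, coded 1..M
forward_loglik <- function(init, trans, emit, obs) {
  alpha <- init * emit[, obs[1]]
  scale <- sum(alpha)
  loglik <- log(scale)
  alpha <- alpha / scale
  for (t in seq_along(obs)[-1]) {
    # Vector-matrix product over K states: the O(K^2) step per site.
    alpha <- as.vector(alpha %*% trans) * emit[, obs[t]]
    scale <- sum(alpha)
    loglik <- loglik + log(scale)
    alpha <- alpha / scale  # rescale to avoid numerical underflow
  }
  loglik
}

# Toy check with K = 2 states and M = 3 symbols, all probabilities uniform:
# every observation then contributes log(1 / 3) to the log-likelihood.
K <- 2
M <- 3
init  <- rep(1 / K, K)
trans <- matrix(1 / K, nrow = K, ncol = K)
emit  <- matrix(1 / M, nrow = K, ncol = M)
obs   <- c(1, 3, 2, 2, 1)
ll_uniform <- forward_loglik(init, trans, emit, obs)
```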

References

Scheet, P., & Stephens, M. (2006). A fast and flexible statistical model for large-scale population genotype data: applications to inferring missing genotypes and haplotypic phase. American Journal of Human Genetics, 78(4), 629–644.

Examples

# NOT RUN {
p <- 50
n <- 100
genotypes <- matrix((runif(n * p, min = 0, max = 1) < 0.5) +
            (runif(n * p, min = 0, max = 1) < 0.5),
            nrow = n, dimnames = list(NULL, paste0("SNP_", seq_len(p))))

hmm <- fast_HMM(genotypes, fp_path = "/path/to/fastPHASE",
                n_state = 4, n_iter = 10)
# }
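Since the example above cannot run without the external executable, a guarded variant may be convenient. The sketch below simulates the same kind of 0/1/2 genotype matrix and calls fast_HMM only when both the epiGWAS package and the executable are found. The path is a placeholder to replace with your own, and the out_path prefix under tempdir() is an arbitrary choice for illustration.

```r
set.seed(1)
p <- 50   # number of SNPs
n <- 100  # number of samples

# Genotypes coded 0/1/2: the sum of two fair Bernoulli draws per entry.
genotypes <- matrix(rbinom(n * p, size = 2, prob = 0.5),
                    nrow = n,
                    dimnames = list(NULL, paste0("SNP_", seq_len(p))))

fp_path <- "/path/to/fastPHASE"  # placeholder: replace with your own path
can_run <- file.exists(fp_path) &&
  requireNamespace("epiGWAS", quietly = TRUE)

if (can_run) {
  hmm <- epiGWAS::fast_HMM(genotypes,
                           out_path = file.path(tempdir(), "fp_demo"),
                           fp_path = fp_path,
                           n_state = 4, n_iter = 10)
  str(hmm, max.level = 1)  # top-level fields: pInit, Q, pEmit
} else {
  message("fastPHASE executable or epiGWAS not found; skipping the fit.")
}
```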