phyclust (version 0.1-34)

phyclust-package: Phyloclustering -- Phylogenetic Clustering

Description

The phyclust package (Chen 2011) implements a novel approach that combines model-based clustering and phylogenetics to classify DNA and SNP sequences. Under an evolution model, the sequences in each cluster are assumed to arise from a mutation process, so that they are distributed around an unknown ancestral center sequence. Using continuous-time Markov chain theory, mixture distributions are constructed to model and classify subpopulations or population structures.

The core of the package is implemented in C. EM algorithms are used to find the maximum likelihood estimates. Initialization methods for the EM algorithms and several evolution models are also provided.

ms (Hudson 2002) and seq-gen (Rambaut and Grassly 1997) are two useful programs to generate coalescent trees and sequences, and both are merged into phyclust. baseml of PAML (Yang 1997, 2007) is also ported into phyclust and it is a program to find a phylogenetic tree by maximizing likelihood. Hap-Clustering method (Tzeng 2005) for haplotype grouping is also incorporated into phyclust.

Type help(package = phyclust) to see a list of the major functions for which further documentation is available. Detailed online instructions are also available; the link is given in the ‘References’ section below.

Some C and R functions and R classes of the ape package are also required and modified in phyclust.

Author

Wei-Chen Chen wccsnow@gmail.com

Details

The main function is phyclust, which is controlled by an object .EMC generated by the function .EMControl. find.best searches for the best solution by repeating phyclust with different initializations.
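
The workflow described above can be sketched as follows. This is a minimal illustration, not a recommended analysis: it assumes the toy data set seq.data.toy shipped with the package, and the number of clusters (K = 3), the HKY85 substitution model, the seed, and the reduced iteration limits are arbitrary choices made only to keep the example short.

```r
library(phyclust, quietly = TRUE)

# Toy nucleotide data bundled with the package (assumed here for illustration).
X <- seq.data.toy$org

# Build a control object; options not listed keep the defaults stored in .EMC.
# The small iteration limits are only to keep this sketch fast.
EMC.demo <- .EMControl(substitution.model = "HKY85",
                       short.iter = 5, EM.iter = 5)

set.seed(1234)

# One phyclust run from a single initialization.
ret <- phyclust(X, K = 3, EMC = EMC.demo)

# Repeat phyclust over the available initialization settings
# and keep the solution with the highest log-likelihood.
ret.best <- find.best(X, K = 3, EMC = EMC.demo)

ret.best$logL             # log-likelihood of the best solution found
table(ret.best$class.id)  # number of sequences assigned to each cluster
```

With realistic data, the default iteration limits in .EMControl are usually a better starting point than the shortened ones used here.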

ms and seqgen can generate trees and sequences under a variety of conditions, and they can be combined to perform simulations.
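
As a hedged sketch of such a joint simulation: ms generates one coalescent tree, and seqgen then evolves sequences along it. The option strings are passed through to the underlying programs; the sample size, sequence length, model, and scale used here are illustrative assumptions only.

```r
library(phyclust, quietly = TRUE)

set.seed(123)

# Simulate one coalescent tree for 5 samples; "-T" asks ms to output the tree.
ret.ms <- ms(nsam = 5, nreps = 1, opts = "-T")

# With these options, the third element of the ms output is the Newick tree.
tree.newick <- ret.ms[3]

# Evolve 40-site sequences along that tree under the HKY model,
# scaling branch lengths by 0.2.
ret.seqgen <- seqgen(opts = "-mHKY -l40 -s0.2", newick.tree = tree.newick)

# Convert the seq-gen text output into a sequence data object.
sim <- read.seqgen(ret.seqgen)
dim(sim$org)  # sequences by sites
```

The matrix in sim$org can then be passed to phyclust as the data argument X.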

paml.baseml can estimate trees based on sequences.

haplo.post.prob is a modified version of Tzeng's method for haplotype grouping, which uses an evolutionary approach to group SNP sequences.

Some utility functions of the ape package are used in this package to plot trees, check object types, and read sequence data.

References

Phylogenetic Clustering Website: https://snoweye.github.io/phyclust/

Chen, W.-C. (2011) “Overlapping codon model, phylogenetic clustering, and alternative partial expectation conditional maximization algorithm”, Ph.D. Diss., Iowa State University.

Hudson, R.R. (2002) “Generating Samples under a Wright-Fisher Neutral Model of Genetic Variation”, Bioinformatics, 18, 337-338. http://home.uchicago.edu/~rhudson1/source.html

Rambaut, A. and Grassly, N.C. (1997) “Seq-Gen: An Application for the Monte Carlo Simulation of DNA Sequence Evolution along Phylogenetic Trees”, Computer Applications In The Biosciences, 13:3, 235-238. http://tree.bio.ed.ac.uk/software/seqgen/

Yang, Z. (1997) “PAML: a program package for phylogenetic analysis by maximum likelihood”, Computer Applications in BioSciences, 13, 555-556. http://abacus.gene.ucl.ac.uk/software/paml.html

Yang, Z. (2007) “PAML 4: a program package for phylogenetic analysis by maximum likelihood”, Molecular Biology and Evolution, 24, 1586-1591.

Tzeng, J.Y. (2005) “Evolutionary-Based Grouping of Haplotypes in Association Analysis”, Genetic Epidemiology, 28, 220-231. https://www4.stat.ncsu.edu/~jytzeng/software.php

Paradis E., Claude J., and Strimmer K. (2004) “APE: analyses of phylogenetics and evolution in R language”, Bioinformatics, 20, 289-290. http://ape-package.ird.fr/

See Also

phyclust, .EMC, .EMControl, find.best.

Examples

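if (FALSE) {
# Load the package without startup messages.
library(phyclust, quietly = TRUE)

# List the demos shipped with phyclust, then run the ex_trees demo.
demo(package = "phyclust")
demo("ex_trees", package = "phyclust")
}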
