.EMControl: EM Control Generator

Description

Generates an EM control list (.EMC) that sets the options, methods, conditions, and models of the EM algorithms. Called with no arguments, this function generates the same default template as .EMC. One can either modify .EMC directly or use this function to control the EM algorithms.

Usage

.EMControl(exhaust.iter = 1, fixed.iter = 5,
    short.iter = 100, EM.iter = 1000,
    short.eps = 1e-2, EM.eps = 1e-6,
    cm.reltol = 1e-8, cm.maxit = 5000,
    nm.abstol.Mu.given.QA = 1e-8, nm.reltol.Mu.given.QA = 1e-8,
    nm.maxit.Mu.given.QA = 500,
    nm.abstol.QA.given.Mu = 1e-8, nm.reltol.QA.given.Mu = 1e-8,
    nm.maxit.QA.given.Mu = 5000,
    est.non.seg.site = FALSE, max.init.iter = 50,
    init.procedure = .init.procedure[1],
    init.method = .init.method[1],
    substitution.model = .substitution.model$model[1],
    edist.model = .edist.model[1], identifier = .identifier[1],
    code.type = .code.type[1], em.method = .em.method[1],
    boundary.method = .boundary.method[1], min.n.class = 1,
    se.type = FALSE, se.model = .se.model[1], se.constant = 1e-2)

Value

This function returns a list with the same structure as .EMC.

The sequencing error controls are stored in se.type, se.model, and se.constant, which hold the sequencing error type, model, and constrained constant of errors, respectively.

Arguments

exhaust.iter: number of iterations for "exhaustEM", default = 1.
fixed.iter: number of iterations for "RndpEM", default = 5.
short.iter: number of short-EM steps, default = 100.
EM.iter: number of long-EM steps, default = 1000.
short.eps: tolerance of short-EM steps, default = 1e-2.
EM.eps: tolerance of long-EM steps, default = 1e-6.
cm.reltol: relative tolerance for a CM step, default = 1e-8.
cm.maxit: maximum number of iterations for a CM step, default = 5000.
nm.abstol.Mu.given.QA: see ‘Details’, default = 1e-8.
nm.reltol.Mu.given.QA: see ‘Details’, default = 1e-8.
nm.maxit.Mu.given.QA: see ‘Details’, default = 500.
nm.abstol.QA.given.Mu: see ‘Details’, default = 1e-8.
nm.reltol.QA.given.Mu: see ‘Details’, default = 1e-8.
nm.maxit.QA.given.Mu: see ‘Details’, default = 5000.
est.non.seg.site: estimate non-segregating sites, default = FALSE.
max.init.iter: maximum number of initialization iterations, default = 50.
init.procedure: initialization procedure, default = "exhaustEM".
init.method: initialization method, default = "randomMu".
substitution.model: substitution model, default = "JC69".
edist.model: evolutionary distance model, default = "D_JC69".
identifier: identifier, default = "EE".
code.type: code type, default = "NUCLEOTIDE".
em.method: EM method, default = "EM".
boundary.method: boundary method, default = "ADJUST".
min.n.class: minimum number of sequences in a cluster, default = 1.
se.type: sequencing error type, default = FALSE.
se.model: sequencing error model, default = "CONVOLUTION".
se.constant: constrained constant, default = 1e-2.

Author

Wei-Chen Chen wccsnow@gmail.com

Details

exhaust.iter, fixed.iter, short.iter, and short.eps are used to control the iterations of initialization procedures and methods.

EM.iter and EM.eps are used to control the EM iterations.

cm.reltol and cm.maxit are used to control the ECM iterations.

Arguments starting with nm. are options for the Nelder-Mead method, as in optim. The C code for Nelder-Mead is adapted from the R math library, and all of its options are honored. abstol and reltol are the absolute and relative tolerances, and maxit is the iteration limit. Mu.given.QA is for maximizing the profile function of \(\mu_k\) given \(Q_k\), and QA.given.Mu is for maximizing the profile function of \(Q_k\) given \(\mu_k\).
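
As an illustration of what these options mean, the sketch below uses the analogous abstol, reltol, and maxit controls of base R's optim with method = "Nelder-Mead". It is not phyclust's internal code: neg.profile is a made-up quadratic standing in for a profile function, and because optim minimizes, a maximization problem is expressed by negating the objective. The tolerance and iteration values simply mirror the Mu.given.QA defaults.

```r
# Hypothetical stand-in for a negated profile function; its minimum is at (1, -2).
neg.profile <- function(x) (x[1] - 1)^2 + (x[2] + 2)^2

# Controls of the same kind as the nm.* arguments.
fit <- optim(par = c(0, 0), fn = neg.profile, method = "Nelder-Mead",
             control = list(abstol = 1e-8, reltol = 1e-8, maxit = 500))

print(fit$par)          # estimated minimizer
print(fit$convergence)  # 0 means the search stopped before reaching maxit
print(fit$counts)       # number of function evaluations used
```

Tighter tolerances or a larger maxit generally trade run time for precision, which is presumably why the two conditional maximizations are given separate settings.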

est.non.seg.site indicates whether to estimate the states of center sequences. If FALSE, the states are fixed at the non-segregating sites. Usually, there is no need to estimate them.

max.init.iter applies to certain initialization methods; for example, randomNJ and K-Medoids may need a few tries to obtain an appropriate initial state.

init.procedure and init.method are for initializations.
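
A minimal sketch of choosing a non-default initialization follows. It assumes phyclust is installed and that the returned list stores each setting under its argument name; EMC.init is a name chosen here for illustration. The available choices are printed first rather than assumed.

```r
if (requireNamespace("phyclust", quietly = TRUE)) {
  library(phyclust, quietly = TRUE)

  # List the available choices; the first element of each is the default.
  print(.init.procedure)
  print(.init.method)

  # Select the second listed procedure and shorten the short-EM stage.
  EMC.init <- .EMControl(init.procedure = .init.procedure[2],
                         short.iter = 50)

  # Assumption: the element is named after the argument.
  print(EMC.init$init.procedure)
}
```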

min.n.class is the minimum number of sequences in a cluster; it guards against bad initial states and degenerate clusters.

se.type, se.model, and se.constant are used only for sequencing error models, and only for nucleotide data without labels.

References

Phylogenetic Clustering Website: https://snoweye.github.io/phyclust/

Examples

if (FALSE) {
library(phyclust, quietly = TRUE)

# The same as .EMC
.EMControl()

# Except code.type, all others are the same as .EMC
.EMControl(code.type = "SNP")
.EMControl(code.type = .code.type[2])
}
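
The two routes mentioned in the Description, calling the generator or editing a copy of .EMC, can be sketched as follows. This assumes phyclust is installed and that the list elements carry the same names as the arguments; EMC.a and EMC.b are illustrative names.

```r
if (requireNamespace("phyclust", quietly = TRUE)) {
  library(phyclust, quietly = TRUE)

  # Route 1: build a control list with the generator.
  EMC.a <- .EMControl(EM.iter = 2000, EM.eps = 1e-8)

  # Route 2: copy the default template and edit the copy,
  # leaving the package-level .EMC untouched.
  EMC.b <- .EMC
  EMC.b$EM.iter <- 2000
  EMC.b$EM.eps <- 1e-8

  # Compare the two settings that were changed.
  print(unlist(EMC.a[c("EM.iter", "EM.eps")]))
  print(unlist(EMC.b[c("EM.iter", "EM.eps")]))
}
```

Working on a copy keeps the default template available for later runs; the resulting list is then supplied to the fitting function, typically through its EMC argument.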
