phyclust (version 0.1-34)

.EMControl: EM Control Generator

Description

Generate an EM control list (.EMC) that sets the options, methods, conditions, and models of the EM algorithms. Called with no arguments, this function generates the same default template as .EMC. One can either modify .EMC directly or call this function to control the EM algorithms.
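The two routes mentioned above can be sketched as follows. This is a minimal illustration, assuming the elements of the returned list carry the same names as the arguments of .EMControl; the value 500 is arbitrary.

```r
library(phyclust, quietly = TRUE)

# Route 1: copy the default template and overwrite one element.
# Assumption: list elements are named after the arguments of .EMControl().
emc.a <- .EMC
emc.a$EM.iter <- 500

# Route 2: generate a fresh control list with the same override.
emc.b <- .EMControl(EM.iter = 500)

# Show the single setting that was changed by each route.
c(modified = emc.a$EM.iter, generated = emc.b$EM.iter)
```

Working on a copy (emc.a) rather than on .EMC itself keeps the packaged default available for later comparison.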

Usage

.EMControl(exhaust.iter = 1, fixed.iter = 5,
    short.iter = 100, EM.iter = 1000,
    short.eps = 1e-2, EM.eps = 1e-6,
    cm.reltol = 1e-8, cm.maxit = 5000,
    nm.abstol.Mu.given.QA = 1e-8, nm.reltol.Mu.given.QA = 1e-8,
    nm.maxit.Mu.given.QA = 500,
    nm.abstol.QA.given.Mu = 1e-8, nm.reltol.QA.given.Mu = 1e-8,
    nm.maxit.QA.given.Mu = 5000,
    est.non.seg.site = FALSE, max.init.iter = 50,
    init.procedure = .init.procedure[1],
    init.method = .init.method[1],
    substitution.model = .substitution.model$model[1],
    edist.model = .edist.model[1], identifier = .identifier[1],
    code.type = .code.type[1], em.method = .em.method[1],
    boundary.method = .boundary.method[1], min.n.class = 1,
    se.type = FALSE, se.model = .se.model[1], se.constant = 1e-2)

Value

This function returns a list with the same structure as .EMC.

The sequencing error controls are stored in se.type, se.model, and se.constant: the sequencing error type, the sequencing error model, and the constrained constant of errors, respectively.

Arguments

exhaust.iter

number of iterations for "exhaustEM", default = 1.

fixed.iter

number of iterations for "RndpEM", default = 5.

short.iter

number of short-EM steps, default = 100.

EM.iter

number of long-EM steps, default = 1000.

short.eps

tolerance of short-EM steps, default = 1e-2.

EM.eps

tolerance of long-EM steps, default = 1e-6.

cm.reltol

relative tolerance for a CM step, default = 1e-8.

cm.maxit

maximum number of iterations for a CM step, default = 5000.

nm.abstol.Mu.given.QA

see ‘Details’, default = 1e-8.

nm.reltol.Mu.given.QA

see ‘Details’, default = 1e-8.

nm.maxit.Mu.given.QA

see ‘Details’, default = 500.

nm.abstol.QA.given.Mu

see ‘Details’, default = 1e-8.

nm.reltol.QA.given.Mu

see ‘Details’, default = 1e-8.

nm.maxit.QA.given.Mu

see ‘Details’, default = 5000.

est.non.seg.site

estimate non-segregating sites, default = FALSE.

max.init.iter

maximum number of initialization iterations, default = 50.

init.procedure

initialization procedure, default = "exhaustEM".

init.method

initialization method, default = "randomMu".

substitution.model

substitution model, default = "JC69".

edist.model

evolutionary distance, default = "D_JC69".

identifier

identifier, default = "EE".

code.type

code type, default = "NUCLEOTIDE".

em.method

EM method, default = "EM".

boundary.method

boundary method, default = "ADJUST".

min.n.class

minimum number of sequences in a cluster, default = 1.

se.type

sequencing error type, default = FALSE.

se.model

sequencing error model, default = "CONVOLUTION".

se.constant

constrained constant, default = 1e-2.

Author

Wei-Chen Chen wccsnow@gmail.com

Details

exhaust.iter, fixed.iter, short.iter, and short.eps are used to control the iterations of initialization procedures and methods.

EM.iter and EM.eps are used to control the EM iterations.

cm.reltol and cm.maxit are used to control the ECM iterations.
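As a rough sketch of how these three groups of settings might be adjusted together, the call below changes the short-EM, long-EM, and CM controls in one go. The numeric values are illustrative only, not recommendations, and the subsetting by name assumes the list elements are named after the arguments.

```r
library(phyclust, quietly = TRUE)

emc <- .EMControl(
  short.iter = 50, short.eps = 1e-3,    # initialization (short-EM) budget and tolerance
  EM.iter = 2000, EM.eps = 1e-8,        # long-EM budget and tolerance
  cm.maxit = 10000, cm.reltol = 1e-10   # CM step budget and tolerance
)

# Inspect only the settings that were changed.
emc[c("short.iter", "short.eps", "EM.iter", "EM.eps", "cm.maxit", "cm.reltol")]
```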

Arguments starting with nm. are options for the Nelder-Mead method, as in optim. The C code for Nelder-Mead is adapted from the R math library, and all of its options are honored. abstol and reltol are the absolute and relative tolerances, and maxit is the maximum number of iterations. Mu.given.QA applies when maximizing the profile function of \(\mu_k\) given \(Q_k\), and QA.given.Mu applies when maximizing the profile function of \(Q_k\) given \(\mu_k\).
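To show what the three option names mean in optim itself, here is a small base R sketch. It calls stats::optim directly rather than phyclust's internal C routine, and the objective function f is a hypothetical toy with its minimum at c(1, -2).

```r
# Hypothetical objective with its minimum at c(1, -2); not part of phyclust.
f <- function(x) (x[1] - 1)^2 + (x[2] + 2)^2

fit <- optim(
  par = c(0, 0), fn = f, method = "Nelder-Mead",
  control = list(
    abstol = 1e-8,  # absolute convergence tolerance
    reltol = 1e-8,  # relative convergence tolerance
    maxit  = 500    # maximum number of iterations
  )
)

fit$par          # estimated minimizer
fit$convergence  # 0 indicates successful completion, 1 that maxit was reached
```

The nm.* arguments of .EMControl play the same roles for the two profile maximizations, one set for Mu.given.QA and one for QA.given.Mu.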

est.non.seg.site indicates whether to estimate the states of the center sequences at non-segregating sites. If FALSE, those states are fixed to the values observed at the non-segregating sites. Usually there is no need to estimate them.

max.init.iter applies to certain initialization methods; for example, randomNJ and K-Medoids may need a few tries to obtain an appropriate initial state.

init.procedure and init.method are for initializations.

min.n.class is the minimum number of sequences in a cluster; it guards against bad initial states and degenerate clusters.

se.type, se.model, and se.constant are used only for sequencing error models and only for nucleotide data without labels.
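A minimal sketch of switching the sequencing error controls on, using only the argument names and default values listed in Usage. It assumes the returned list can be subset by those names; how the list is then consumed by the fitting functions is not shown here.

```r
library(phyclust, quietly = TRUE)

# Enable the sequencing error model; the other settings keep their defaults.
emc.se <- .EMControl(se.type = TRUE, se.model = "CONVOLUTION", se.constant = 1e-2)

# Inspect the three sequencing error controls.
emc.se[c("se.type", "se.model", "se.constant")]
```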

References

Phylogenetic Clustering Website: https://snoweye.github.io/phyclust/

See Also

.show.option, .EMC, .boundary.method, .code.type, .edist.model, .em.method, .identifier, .init.method, .init.procedure, .substitution.model, optim, phyclust, phyclust.se.

Examples

if (FALSE) {
library(phyclust, quietly = TRUE)

# The same as .EMC
.EMControl()

# Except code.type, all others are the same as .EMC
.EMControl(code.type = "SNP")
.EMControl(code.type = .code.type[2])
}