haplo.stats (version 1.7.6)

haplo.em.control: Create the Control Parameters for the EM Computation of Haplotype Probabilities, with Progressive Insertion of Loci

Description

Create a list of parameters that control the EM algorithm for estimating haplotype frequencies, based on progressive insertion of loci. Non-default parameters for the EM algorithm can be set as parameters passed to haplo.em.control.

Usage

haplo.em.control(loci.insert.order=NULL, insert.batch.size = 6,
                             min.posterior = 1e-09, tol = 1e-05,
                             max.iter=5000, random.start=0, n.try = 10,
                             iseed=NULL, max.haps.limit=2e6, verbose=0)

Arguments

loci.insert.order
Numeric vector with specific order to insert the loci. If this value is NULL, the insert order will be in sequential order (1, 2, ..., No. Loci).
insert.batch.size
Number of loci to be inserted in a single batch.
min.posterior
Minimum posterior probability of a haplotype pair, conditional on observed marker genotypes. Posteriors below this minimum value will have their pair of haplotypes "trimmed" off the list of possible pairs. If all markers are in low LD, we recommend using the
tol
If the change in the log-likelihood value between successive EM steps is less than the tolerance (tol), the algorithm is considered to have converged.
max.iter
Maximum number of iterations allowed for the EM algorithm before it stops and prints an error. If the error is printed, double max.iter.
random.start
If random.start = 0, then the initial starting values of the posteriors for the first EM attempt will be based on assuming equal posterior probabilities (conditional on genotypes). If random.start = 1, then the initial starting values of the first EM a
n.try
Number of times to try to maximize the lnlike by the EM algorithm. The first try uses, as initial starting values for the posteriors, either equal values or uniform random variables, as determined by random.start. All subsequent tries will use random unif
iseed
An integer or a saved copy of .Random.seed. This allows simulations to be reproduced by using the same initial seed.
max.haps.limit
Maximum number of haplotypes for the input genotypes. It is used as the amount of memory to allocate in C for the progressive-insertion E-M steps. Within haplo.em, the first step is to try to allocate the sum of the result of geno.count.pairs(), if that
verbose
Logical. If TRUE, print procedural messages to the screen. If FALSE, do not print any messages. The default shown under Usage, 0, corresponds to FALSE.

Value

  • A list of the parameters passed to the function.
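
As a minimal sketch of what the returned list looks like (assuming the haplo.stats package is installed; the object name ctrl is only illustrative), the settings can be inspected like any other R list:

```r
library(haplo.stats)

# Override two defaults; every other argument keeps the value shown under Usage.
ctrl <- haplo.em.control(insert.batch.size = 3, n.try = 2)

# The result is an ordinary list, so elements can be read by name.
str(ctrl)
ctrl$n.try
ctrl$tol
```

Because the value is a plain list, it can be built once and reused across several calls that accept EM control settings.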

Details

The default is to use n.try = 10. If this takes too much time, it may be worthwhile to decrease n.try. Other tips for computing haplotype frequencies for a large number of loci, particularly if some have many alleles, are to decrease the batch size (insert.batch.size), increase the allocated memory (max.haps.limit), and raise min.posterior so that rare haplotype pairs are more likely to be trimmed off at each insertion step.
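
The tips above might translate into a call like the following sketch. The specific values are illustrative assumptions, not recommendations from the package authors. The hla.demo data set, the column selection, and the locus labels are also assumptions, based on the demo data distributed with haplo.stats rather than on anything stated on this page.

```r
library(haplo.stats)

# Demo genotype data shipped with haplo.stats; three loci, two columns each.
data(hla.demo)
geno  <- hla.demo[, c(17, 18, 21:24)]
label <- c("DQB", "DRB", "B")

# Illustrative settings only: smaller batches, more memory, more aggressive
# trimming, and fewer EM attempts than the defaults.
big.ctrl <- haplo.em.control(insert.batch.size = 2,
                             max.haps.limit   = 4e6,
                             min.posterior    = 1e-6,
                             n.try            = 2)

em.fit <- haplo.em(geno = geno, locus.label = label,
                   miss.val = c(0, NA), control = big.ctrl)

# Convergence indicator reported by haplo.em.
em.fit$converge
```

With only three loci these settings are unnecessary; the point is to show where each argument goes when a larger problem runs out of time or memory.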

See Also

haplo.em, haplo.score

Examples

# This is how it is used within haplo.score
#    > score.gauss <- haplo.score(resp, geno, trait.type="gaussian", 
#    >           em.control=haplo.em.control(insert.batch.size = 2, n.try=1))
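
A runnable variant of the commented example, sketched with the hla.demo data distributed with haplo.stats. The data set, the column selection, and the locus labels are assumptions not taken from this page.

```r
library(haplo.stats)

data(hla.demo)
geno  <- hla.demo[, c(17, 18, 21:24)]   # allele columns for three loci
label <- c("DQB", "DRB", "B")
resp  <- hla.demo$resp                  # quantitative trait

# Pass non-default EM settings to haplo.score through em.control.
score.gauss <- haplo.score(resp, geno, trait.type = "gaussian",
                           locus.label = label,
                           em.control = haplo.em.control(insert.batch.size = 2,
                                                         n.try = 1))
print(score.gauss)
```

Setting n.try = 1 keeps the run short at the cost of a single EM attempt, so the result depends more heavily on the starting values.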
