argmax.geno(cross, step=0, off.end=0, error.prob=0.0001, map.function=c("haldane","kosambi","c-f","morgan"), stepwidth=c("fixed", "variable", "max"))
cross: An object of class cross. See read.cross for details.
step: Spacing (in cM) between positions at which genotypes are reconstructed. If step=0, genotypes are reconstructed only at the marker locations.
stepwidth: Controls the spacing of intermediate points. The default is "fixed"; "variable" is included for the qtlbim package (http://www.ssg.uab.edu/qtlbim). The "max" option inserts the minimal number of intermediate points so that the maximum distance between points is step.

A cross object is returned with a component, argmax, added to each component of cross$geno.
The argmax component is a matrix of size [n.ind x n.pos], where n.pos is the number of positions at which the reconstructed genotypes were obtained, containing the most likely sequences of underlying genotypes. Attributes "error.prob", "step", and "off.end" are set to the values of the corresponding arguments, for later reference.
The Viterbi algorithm can behave badly when step is small but positive. One may observe quite different results for different values of step. The problem is that, in the presence of data like A----H, the sequences AAAAAA and HHHHHH may be more likely than any one of the sequences AAAAAH, AAAAHH, AAAHHH, AAHHHH, AHHHHH. The Viterbi algorithm produces a single "most likely" sequence of underlying genotypes.

This is done by calculating $Q_k(v_k) = \max_{v_1, \ldots, v_{k-1}} \Pr(g_1 = v_1, \ldots, g_k = v_k, O_1, \ldots, O_k)$ for $k = 1, \ldots, n$ and then tracing back through the sequence. Here $g_k$ is the underlying genotype and $O_k$ the observed marker genotype at position $k$.
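The recursion and the small-step problem can be illustrated with a minimal base-R sketch. This is a toy model, not the R/qtl implementation: it assumes only two genotypes (A and H), equally spaced positions, uniform initial probabilities, a symmetric error model, and the Haldane map function. The names `haldane`, `viterbi_toy`, and `seq_logprob` are hypothetical helpers introduced for this illustration.

```r
# Toy sketch, NOT the R/qtl implementation: two genotypes (A, H),
# equally spaced positions, uniform initial probabilities.

# Haldane map function: distance d (in cM) -> recombination fraction
haldane <- function(d) 0.5 * (1 - exp(-2 * d / 100))

viterbi_toy <- function(obs, r, error.prob) {
  n <- length(obs)
  states <- c("A", "H")
  # log Pr(O[k] | g[k]); a missing observation carries no information
  emit <- sapply(obs, function(o) {
    if (is.na(o)) c(0, 0)
    else log(ifelse(states == o, 1 - error.prob, error.prob))
  })
  # log transition probabilities between adjacent positions
  trans <- log(matrix(c(1 - r, r, r, 1 - r), 2, 2))
  Q <- matrix(-Inf, 2, n)            # Q[v, k] holds log Q_k(v)
  back <- matrix(NA_integer_, 2, n)  # best predecessor of state v at position k
  Q[, 1] <- log(0.5) + emit[, 1]
  for (k in 2:n) {
    for (v in 1:2) {
      cand <- Q[, k - 1] + trans[, v]
      back[v, k] <- which.max(cand)
      Q[v, k] <- max(cand) + emit[v, k]
    }
  }
  # trace back from the best final state
  path <- integer(n)
  path[n] <- which.max(Q[, n])
  for (k in (n - 1):1) path[k] <- back[path[k + 1], k + 1]
  list(path = paste(states[path], collapse = ""), logprob = max(Q[, n]))
}

# Joint log probability of one given genotype sequence and the data
seq_logprob <- function(path, obs, r, error.prob) {
  g <- strsplit(path, "")[[1]]
  n_rec <- sum(g[-1] != g[-length(g)])   # number of recombination events
  seen <- !is.na(obs)
  n_err <- sum(g[seen] != obs[seen])     # number of implied genotyping errors
  log(0.5) + n_rec * log(r) + (length(g) - 1 - n_rec) * log(1 - r) +
    n_err * log(error.prob) + (sum(seen) - n_err) * log(1 - error.prob)
}

obs <- c("A", NA, NA, NA, NA, "H")   # data like A----H
eps <- 0.01

small <- viterbi_toy(obs, haldane(0.1), eps)  # positions 0.1 cM apart
large <- viterbi_toy(obs, haldane(5), eps)    # positions 5 cM apart
print(small$path)
print(large$path)
```

Under these toy settings, the 0.1 cM spacing gives a sequence with no recombination (AAAAAA and HHHHHH tie, so either may be returned), because one genotyping error is then more probable than one crossover between adjacent positions. With 5 cM spacing the result contains a single crossover.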
Rabiner, L. R. (1989) A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE 77, 257--286.
sim.geno, calc.genoprob, fill.geno
library(qtl)
data(fake.f2)
fake.f2 <- argmax.geno(fake.f2, step=2, off.end=5, error.prob=0.01)
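The returned structure can then be inspected directly. The following is a short sketch that assumes the qtl package is installed and uses the bundled fake.f2 data; the variable name `am` is introduced here only for illustration.

```r
library(qtl)   # assumes the qtl package is installed

data(fake.f2)
fake.f2 <- argmax.geno(fake.f2, step=2, off.end=5, error.prob=0.01)

# Reconstructed genotypes for the first chromosome
am <- fake.f2$geno[[1]]$argmax
dim(am)          # n.ind x n.pos
am[1:5, 1:5]     # first few individuals and positions

# Arguments recorded as attributes for later reference
attr(am, "step")
attr(am, "off.end")
attr(am, "error.prob")
```

Each element of cross$geno carries its own argmax matrix, so the same inspection applies to any other chromosome by changing the index.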