This function is primarily designed to be called by
meandistance.matrix2, in order to calculate distances between all
possible unambiguous genotypes. Ordinary users won't use genotypeProbs
unless they are designing a new analysis.
The genotype analyzed is Genotype(object, sample, locus). If
the genotype is unambiguous (fully heterozygous or homozygous), a single
unambiguous genotype is returned with a probability of one.
If the genotype is ambiguous (partially heterozygous), a recursive
algorithm is used to generate all possible unambiguous genotypes (all
possible duplications of alleles in the genotype, up to the ploidy of
the individual).
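The enumeration described above can be sketched as follows. This is an illustration in Python, not the package's R code: the function name unambiguous_genotypes and its arguments are hypothetical, and it uses itertools.combinations_with_replacement in place of the recursive algorithm to list the duplicated alleles.

```python
from itertools import combinations_with_replacement


def unambiguous_genotypes(alleles, ploidy):
    """List every fully specified genotype consistent with the observed alleles.

    Hypothetical helper: each observed allele is present at least once, and
    the remaining (ploidy - number of alleles) copies are filled by
    duplicating observed alleles in every possible way.
    """
    alleles = sorted(alleles)
    extra = ploidy - len(alleles)  # number of copies still to be assigned
    if extra < 0:
        raise ValueError("more alleles than the ploidy allows")
    return [
        tuple(sorted(alleles + list(dup)))
        for dup in combinations_with_replacement(alleles, extra)
    ]


# A tetraploid showing two alleles is ambiguous: three genotypes are possible.
genotypes = unambiguous_genotypes([120, 124], ploidy=4)
```

For a fully heterozygous or homozygous genotype the list has a single entry, which matches the unambiguous case described above.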
If the freq argument is supplied:
The probability of each unambiguous genotype is then calculated from
the allele frequencies of the individual's population, under the
assumption of random mating. Allele frequencies are normalized so that
the frequencies of the alleles in the ambiguous genotype sum to one;
this converts each frequency to the probability of the allele being
present in more than one copy. The product of these probabilities is
multiplied by the appropriate multinomial coefficient to calculate the
probability of the unambiguous genotype:
$$p = \prod_{i=1}^{n} f_i^{c_i} \times \frac{(k-n)!}{\prod_{i=1}^{n} c_i!}$$
where p is the probability of the unambiguous genotype, n is the number
of alleles in the ambiguous genotype, f_i is the normalized frequency of
allele i, c_i is the number of duplicated copies (total number of
copies minus one) of allele i in the unambiguous genotype, and k is
the ploidy of the individual. Because each of the n alleles is present
at least once, the c_i sum to k - n.
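As a rough check on the formula, the following Python sketch evaluates it for every unambiguous genotype. It is an illustration under assumptions, not the package's implementation: the function name genotype_probs, the dictionary input, and the example frequencies are all made up for this sketch.

```python
from collections import Counter
from itertools import combinations_with_replacement
from math import factorial, prod


def genotype_probs(freqs, ploidy):
    """Probability of each unambiguous genotype under random mating.

    freqs (hypothetical input format): dict mapping each allele observed in
    the ambiguous genotype to its frequency in the individual's population.
    """
    alleles = sorted(freqs)
    total = sum(freqs.values())
    f = {a: freqs[a] / total for a in alleles}  # normalized to sum to one
    extra = ploidy - len(alleles)               # k - n duplicated copies
    probs = {}
    for dup in combinations_with_replacement(alleles, extra):
        c = Counter(dup)                        # c_i: duplicated copies per allele
        coef = factorial(extra)                 # (k - n)!
        for a in alleles:
            coef //= factorial(c[a])            # divided by each c_i!
        genotype = tuple(sorted(alleles + list(dup)))
        probs[genotype] = coef * prod(f[a] ** c[a] for a in alleles)
    return probs


# Illustrative frequencies only: a tetraploid showing alleles 120 and 124.
probs = genotype_probs({120: 0.3, 124: 0.1}, ploidy=4)
```

With these made-up frequencies the normalized values are 0.75 and 0.25, so the three possible genotypes receive probabilities of approximately 0.5625, 0.375, and 0.0625, which sum to one.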
If the gprob and alleles arguments are supplied:
The probabilities of all possible genotypes in the population have
already been calculated, based on allele frequencies and selfing rate.
This is done in meandistance.matrix2 using code from De Silva
et al. (2005).
Probabilities for the genotypes of interest (those that the ambiguous
genotype could represent) are normalized to sum to 1, in order to give
the conditional probabilities of the possible genotypes.
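The normalization step can be sketched in Python as below. This is only an illustration: the function name conditional_probs, the dictionary representation of the precomputed genotype probabilities, and the example numbers are assumptions made for this sketch and do not reflect how gprob and alleles are actually stored.

```python
def conditional_probs(gprob, candidates):
    """Rescale population genotype probabilities to the candidate genotypes.

    gprob (hypothetical format): dict mapping each genotype in the population
    to its precomputed probability.
    candidates: the genotypes that the ambiguous genotype could represent.
    """
    total = sum(gprob[g] for g in candidates)
    if total == 0:
        raise ValueError("candidate genotypes have zero total probability")
    # Dividing by the total gives probabilities conditional on the observation.
    return {g: gprob[g] / total for g in candidates}


# Made-up population probabilities for a few tetraploid genotypes.
gprob = {
    ("A", "A", "A", "A"): 0.10,
    ("A", "A", "A", "B"): 0.08,
    ("A", "A", "B", "B"): 0.06,
    ("A", "B", "B", "B"): 0.02,
}
candidates = [
    ("A", "A", "A", "B"),
    ("A", "A", "B", "B"),
    ("A", "B", "B", "B"),
]
cond = conditional_probs(gprob, candidates)
```

Here the three candidates have a combined probability of 0.16, so their conditional probabilities are approximately 0.5, 0.375, and 0.125.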