This function calculates the distance between two individuals at one microsatellite locus using a method based on that of Bruvo et al. (2004).
Bruvo.distance(genotype1, genotype2, maxl=10, usatnt=2, missing=-9)
A number ranging from 0 to 1, with 0 indicating identical genotypes, and 1 being a theoretical maximum distance if all alleles from genotype1 differed by an infinite number of repeats from all alleles in genotype2. NA is returned if both genotypes have more than maxl alleles or if either genotype has the symbol for missing data as its first allele.
genotype1: A vector of alleles for one individual at one locus. Allele length is in nucleotides or repeat count. Each unique allele corresponds to one element in the vector, and the vector is no longer than it needs to be to contain all unique alleles for this individual at this locus.
genotype2: A vector of alleles for another individual at the same locus.
maxl: If both individuals have more than this number of alleles at this locus, NA is returned instead of a numerical distance.
usatnt: Length of the repeat at this locus. For example, usatnt=2 for dinucleotide repeats and usatnt=3 for trinucleotide repeats. If the alleles in genotype1 and genotype2 are expressed in repeat count instead of nucleotides, set usatnt=1.
missing: A numerical value that, when in the first allele position, indicates missing data. NA is returned if this value is found in the first allele position of either genotype.
Lindsay V. Clark
Since allele copy number is frequently unknown in polyploid microsatellite data, Bruvo et al. developed a measure of genetic distance similar to band-sharing indices used with dominant data, but taking into account mutational distances between alleles. A matrix is created containing all differences in repeat count between the alleles of two individuals at one locus. These differences are then geometrically transformed to reflect the probabilities of mutation from one allele to another. The matrix is then searched to find the minimum sum if each allele from one individual is paired to one allele from the other individual. This sum is divided by the number of alleles per individual.
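The four steps described above can be sketched in a few lines of R. This is an illustration only, not the code inside Bruvo.distance: it assumes both genotypes have the same number of alleles, assumes the geometric transformation is 1 - 2^(-|x|) for a repeat-count difference x, as described by Bruvo et al. (2004), and finds the minimum by brute force over every pairing. The helper all_orderings and all variable names are hypothetical.

```r
# Hypothetical helper (not part of any package): every ordering of the vector v.
all_orderings <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    for (p in all_orderings(v[-i])) {
      out[[length(out) + 1]] <- c(v[i], p)
    }
  }
  out
}

genotype1 <- c(202, 206, 210, 220)  # allele lengths in nucleotides
genotype2 <- c(204, 206, 216, 222)
usatnt <- 2                         # dinucleotide repeats

# Step 1: matrix of differences in repeat count between all allele pairs.
repeat_diff <- outer(genotype1 / usatnt, genotype2 / usatnt, "-")

# Step 2: geometric transformation (assumed form: 1 - 2^(-|x|)).
geo <- 1 - 2^(-abs(repeat_diff))

# Step 3: sum of transformed distances for every one-to-one pairing of
# alleles (row i is paired with column p[i]); keep the smallest sum.
k <- length(genotype1)
pairing_sums <- vapply(
  all_orderings(seq_len(k)),
  function(p) sum(geo[cbind(seq_len(k), p)]),
  numeric(1)
)

# Step 4: divide the minimum sum by the number of alleles per individual.
sketch_distance <- min(pairing_sums) / k
```

Enumerating every pairing grows factorially with the number of alleles, so this brute-force form is only practical for small genotypes.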
If one genotype has more alleles than the other, 'virtual alleles' must be created so that both genotypes are the same length. There are three options for the value of these virtual alleles, but Bruvo.distance only implements the simplest one, assuming that it is not known whether differences in ploidy arose from genome addition or genome loss. Virtual alleles are set to infinity, such that the geometric distance between any allele and a virtual allele is 1.
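A minimal sketch of the infinite virtual allele idea follows. It pads the shorter genotype with Inf, so that every pairing with a virtual allele contributes exactly 1 to the sum. The function names min_pairing and virtual_padded_distance are hypothetical, the transformation 1 - 2^(-|x|) is assumed from Bruvo et al. (2004), and this is not necessarily how Bruvo.distance is written internally.

```r
# Hypothetical helper: minimum sum over all one-to-one pairings of rows
# and columns of a square matrix, found by simple recursion.
min_pairing <- function(m) {
  if (nrow(m) == 1) return(m[1, 1])
  min(vapply(
    seq_len(ncol(m)),
    function(j) m[1, j] + min_pairing(m[-1, -j, drop = FALSE]),
    numeric(1)
  ))
}

# Hypothetical sketch: distance between genotypes of possibly unequal length.
virtual_padded_distance <- function(genotype1, genotype2, usatnt = 2) {
  k <- max(length(genotype1), length(genotype2))
  # Convert to repeat counts, then append virtual alleles (Inf) as needed.
  pad <- function(g) c(g / usatnt, rep(Inf, k - length(g)))
  # abs(x - Inf) is Inf and 2^(-Inf) is 0, so virtual alleles give distance 1.
  geo <- 1 - 2^(-abs(outer(pad(genotype1), pad(genotype2), "-")))
  min_pairing(geo) / k
}

# Four alleles against three: one virtual allele is added to the second genotype.
d_unequal <- virtual_padded_distance(c(202, 206, 210, 220), c(204, 206, 222))
```

Because only the shorter genotype is padded, a virtual allele is never compared with another virtual allele, and the undefined difference Inf - Inf does not arise.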
In the original publication by Bruvo et al. (2004), ambiguous genotypes were dealt with by calculating the distance for all possible unambiguous genotype combinations and averaging across all of them equally. When Bruvo.distance is called from meandistance.matrix, ploidy is unknown and all genotypes are simply treated as if they had one copy of each allele. When Bruvo.distance is called from meandistance.matrix2, the analysis is truer to the original, in that ploidy is known and all possible unambiguous genotype combinations are considered. However, instead of all possible unambiguous genotypes being weighted equally, in meandistance.matrix2 they are weighted based on allele frequencies and selfing rate, since some unambiguous genotypes are more likely than others.
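To make the equal-weight averaging concrete, here is a rough sketch, assuming ploidy is known and is the same for both individuals. It lists every unambiguous genotype consistent with the observed alleles and takes the unweighted mean of the distances. All function names are hypothetical, the transformation 1 - 2^(-|x|) is assumed from Bruvo et al. (2004), and the frequency-based and selfing-based weighting used by meandistance.matrix2 is not reproduced.

```r
# Hypothetical helper: minimum sum over one-to-one pairings of a square matrix.
min_pairing <- function(m) {
  if (nrow(m) == 1) return(m[1, 1])
  min(vapply(
    seq_len(ncol(m)),
    function(j) m[1, j] + min_pairing(m[-1, -j, drop = FALSE]),
    numeric(1)
  ))
}

# Hypothetical helper: distance between two genotypes of equal length,
# where duplicated alleles are allowed.
equal_length_distance <- function(g1, g2, usatnt = 2) {
  geo <- 1 - 2^(-abs(outer(g1 / usatnt, g2 / usatnt, "-")))
  min_pairing(geo) / length(g1)
}

# Hypothetical helper: all genotypes of the given ploidy that contain every
# observed allele at least once (extra copies chosen with repetition).
unambiguous_genotypes <- function(alleles, ploidy) {
  extra <- ploidy - length(alleles)
  stopifnot(extra >= 0)
  if (extra == 0) return(list(sort(alleles)))
  picks <- combn(length(alleles) + extra - 1, extra, simplify = FALSE)
  lapply(picks, function(p) sort(c(alleles, alleles[p - seq(0, extra - 1)])))
}

# Unweighted mean distance over all pairs of unambiguous genotypes.
mean_over_resolutions <- function(alleles1, alleles2, ploidy, usatnt = 2) {
  g1s <- unambiguous_genotypes(alleles1, ploidy)
  g2s <- unambiguous_genotypes(alleles2, ploidy)
  per_g1 <- vapply(g1s, function(a) {
    mean(vapply(g2s, function(b) equal_length_distance(a, b, usatnt), numeric(1)))
  }, numeric(1))
  mean(per_g1)
}

# A tetraploid showing two alleles has three possible unambiguous genotypes.
resolutions <- unambiguous_genotypes(c(204, 206), ploidy = 4)
```

Replacing the two calls to mean with weighted means would be the natural place to introduce genotype probabilities, which is the direction the text attributes to meandistance.matrix2.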
Bruvo, R., Michiels, N. K., D'Souza, T. G., and Schulenburg, H. (2004) A simple method for the calculation of microsatellite genotype distances irrespective of ploidy level. Molecular Ecology 13, 2101-2106.
meandistance.matrix, Lynch.distance, Bruvo2.distance
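library(polysat)
Bruvo.distance(c(202,206,210,220),c(204,206,216,222))          # default: dinucleotide repeats (usatnt=2)
Bruvo.distance(c(202,206,210,220),c(204,206,216,222),usatnt=4) # same alleles scored as tetranucleotide repeats
Bruvo.distance(c(202,206,210,220),c(204,206,222))              # genotypes with different numbers of alleles
Bruvo.distance(c(202,206,210,220),c(204,206,216,222),maxl=3)   # both genotypes exceed maxl, so NA is returned
Bruvo.distance(c(202,206,210,220),c(-9))                       # second genotype is missing data, so NA is returned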