Bruvo.distance: Genetic Distance Metric of Bruvo et al.

Description

This function calculates the distance between two individuals at one microsatellite locus using a method based on that of Bruvo et al. (2004).

Usage

Bruvo.distance(genotype1, genotype2, maxl=10, usatnt=2, missing=-9)

Value

A number ranging from 0 to 1, with 0 indicating identical genotypes, and 1 being a theoretical maximum distance if all alleles from genotype1 differed by an infinite number of repeats from all alleles in genotype2. NA is returned if both genotypes have more than maxl alleles or if either genotype has the symbol for missing data as its first allele.

Arguments

genotype1: A vector of alleles for one individual at one locus. Allele length is in nucleotides or repeat count. Each unique allele corresponds to one element in the vector, and the vector is no longer than it needs to be to contain all unique alleles for this individual at this locus.
genotype2: A vector of alleles for another individual at the same locus.
maxl: If both individuals have more than this number of alleles at this locus, NA is returned instead of a numerical distance.
usatnt: Length of the repeat at this locus. For example usatnt=2 for dinucleotide repeats, and usatnt=3 for trinucleotide repeats. If the alleles in genotype1 and genotype2 are expressed in repeat count instead of nucleotides, set usatnt=1.
missing: A numerical value that, when in the first allele position, indicates missing data. NA is returned if this value is the first allele of either genotype.
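
The conditions under which NA is returned, as described for maxl and missing, can be summarized in a small sketch. The helper na_guard_sketch is hypothetical and not part of the package; it only restates the documented rules and may differ from the actual implementation.

```r
# Hypothetical helper (not part of the package): TRUE when the documented
# rules say that NA would be returned instead of a numerical distance.
na_guard_sketch <- function(genotype1, genotype2, maxl = 10, missing = -9) {
  # Both genotypes have more than maxl alleles
  too_long <- length(genotype1) > maxl && length(genotype2) > maxl
  # The missing-data symbol is in the first allele position of either genotype
  is_missing <- genotype1[1] == missing || genotype2[1] == missing
  too_long || is_missing
}

na_guard_sketch(c(202, 206, 210, 220), c(204, 206, 216, 222))           # default maxl = 10
na_guard_sketch(c(202, 206, 210, 220), c(204, 206, 216, 222), maxl = 3)  # both exceed maxl
na_guard_sketch(c(202, 206, 210, 220), c(-9))                            # missing data
```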

Author

Lindsay V. Clark

Details

Since allele copy number is frequently unknown in polyploid microsatellite data, Bruvo et al. developed a measure of genetic distance similar to band-sharing indices used with dominant data, but taking into account mutational distances between alleles. A matrix is created containing all differences in repeat count between the alleles of two individuals at one locus. These differences are then geometrically transformed to reflect the probabilities of mutation from one allele to another: a difference of x repeats becomes 1 - 2^(-|x|), which is 0 for identical alleles and approaches 1 as the difference grows. The matrix is then searched for the one-to-one pairing of alleles between the two individuals that gives the minimum sum of transformed distances. This sum is divided by the number of alleles per individual.
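
The steps in the paragraph above can be illustrated with a minimal brute-force sketch for two genotypes that have the same number of alleles. The function bruvo_sketch and its inner helper perms are hypothetical names introduced here for illustration; this is not the package's implementation, which may differ in details such as rounding and how the search is done.

```r
# Hypothetical sketch (not the package code): Bruvo-style distance for two
# genotypes of equal length, found by trying every one-to-one allele pairing.
bruvo_sketch <- function(g1, g2, usatnt = 2) {
  stopifnot(length(g1) == length(g2))
  n <- length(g1)
  # Matrix of absolute differences in repeat count between all allele pairs
  diffs <- abs(outer(g1, g2, "-")) / usatnt
  # Geometric transformation: 0 for identical alleles, approaching 1 otherwise
  d <- 1 - 2^(-diffs)
  # All permutations of a vector (fine for small n; grows as n!)
  perms <- function(v) {
    if (length(v) <= 1) return(list(v))
    out <- list()
    for (i in seq_along(v)) {
      for (p in perms(v[-i])) out[[length(out) + 1]] <- c(v[i], p)
    }
    out
  }
  # Sum of distances for each pairing of row i with column p[i]
  sums <- vapply(perms(seq_len(n)),
                 function(p) sum(d[cbind(seq_len(n), p)]),
                 numeric(1))
  # Minimum sum, divided by the number of alleles per individual
  min(sums) / n
}

bruvo_sketch(c(202, 206, 210, 220), c(204, 206, 216, 222))
bruvo_sketch(c(202, 206, 210, 220), c(202, 206, 210, 220))  # identical genotypes
```

Enumerating permutations is only practical for a handful of alleles, which is one reason a limit such as maxl is useful.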

If one genotype has more alleles than the other, ‘virtual alleles’ must be created so that both genotypes are the same length. There are three options for the value of these virtual alleles, but Bruvo.distance only implements the simplest one, assuming that it is not known whether differences in ploidy arose from genome addition or genome loss. Virtual alleles are set to infinity, such that the geometric distance between any allele and a virtual allele is 1.
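
As a small illustration of the infinite virtual alleles described above, the following sketch pads the shorter genotype with Inf and builds the transformed distance matrix. The names pad, g1p and g2p are assumptions made for this example and do not come from the package.

```r
# Illustrative sketch (names are hypothetical): pad the shorter genotype
# with infinite "virtual alleles" and compute the transformed distances.
g1 <- c(202, 206, 210, 220)
g2 <- c(204, 206, 222)
usatnt <- 2

n <- max(length(g1), length(g2))
pad <- function(g, n) c(g, rep(Inf, n - length(g)))
g1p <- pad(g1, n)
g2p <- pad(g2, n)

# Repeat-count differences, then the geometric transformation 1 - 2^(-|x|)
d <- 1 - 2^(-abs(outer(g1p, g2p, "-")) / usatnt)

# The last column belongs to the virtual allele; every entry in it is 1
d[, n]
```

Because 2^(-Inf) is 0 in R, any real allele paired with a virtual allele contributes the maximum distance of 1 to the sum.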

In the original publication by Bruvo et al. (2004), ambiguous genotypes were dealt with by calculating the distance for all possible unambiguous genotype combinations and averaging across all of them equally. When Bruvo.distance is called from meandistance.matrix, ploidy is unknown and all genotypes are simply treated as if they had one copy of each allele. When Bruvo.distance is called from meandistance.matrix2, the analysis is truer to the original, in that ploidy is known and all possible unambiguous genotype combinations are considered. However, instead of all possible unambiguous genotypes being weighted equally, in meandistance.matrix2 they are weighted based on allele frequencies and selfing rate, since some unambiguous genotypes are more likely than others.

References

Bruvo, R., Michiels, N. K., D'Souza, T. G., and Schulenburg, H. (2004) A simple method for the calculation of microsatellite genotype distances irrespective of ploidy level. Molecular Ecology 13, 2101-2106.

Examples


  Bruvo.distance(c(202,206,210,220),c(204,206,216,222))
  Bruvo.distance(c(202,206,210,220),c(204,206,216,222),usatnt=4)
  Bruvo.distance(c(202,206,210,220),c(204,206,222))
  Bruvo.distance(c(202,206,210,220),c(204,206,216,222),maxl=3)
  Bruvo.distance(c(202,206,210,220),c(-9))
