AddAlleleLinkages
finds alleles, if any, in linkage disequilibrium
with each allele in a RADdata
object, and computes a correlation
coefficient representing the strength of the linkage.
AddGenotypePriorProb_LD
adds a second set of prior genotype
probabilities to a RADdata
object based on the genotype posterior
probabilities at linked alleles.
AddAlleleLinkages(object, ...)
# S3 method for RADdata
AddAlleleLinkages(object, type, linkageDist, minCorr,
excludeTaxa = character(0), …)
AddGenotypePriorProb_LD(object, ...)
# S3 method for RADdata
AddGenotypePriorProb_LD(object, type, …)
A RADdata
object with genomic alignment data stored in
object$locTable$Chr
and object$locTable$pos
.
A character string, either “mapping”, “hwe”, or “popstruct”, to indicate the type of population being analyzed.
A number, indicating the distance in basepairs from a locus within which to search for linked alleles.
A number ranging from zero to one indicating the minimum correlation needed for an allele to be used for genotype prediction at another allele.
A character vector listing taxa to be excluded from correlation estimates.
Additional arguments (none implemented).
A RADdata
object is returned. For AddAlleleLinkages
, it has a new slot
called $alleleLinkages
that is a list, with one item in the list for each
allele in the dataset. Each item is a data frame, with indices for linked alleles
in the first column, and correlation coefficients in the second column.
For AddGenotypePriorProb_LD
, the object has a new slot called
$priorProbLD
. This is a list much like $posteriorProb
, with one list
item per inheritance mode, and each item being an array with allele copy number in
the first dimension, taxa in the second dimension, and alleles in the third dimension.
Values indicate genotype prior probabilities based on linked alleles alone.
These functions are primarily designed to be used internally by the pipeline functions.
AddAlleleLinkages
obtains genotypic values using
GetWeightedMeanGenotypes
, then regresses those values for a given
allele against those values for nearby alleles to obtain correlation coefficients.
For the population structure model, the genotypic values for an allele are
first regressed on the PC axes from object$PCA
, then the residuals are
regressed on the genotypic values at nearby alleles to obtain correlation
coefficients.
AddGenotypePriorProb_LD
makes a second set of priors in addition to
object$priorProb
. This second set of priors has one value per
inheritance mode per taxon per allele per possible allele copy number.
Where \(K\) is the ploidy, with allele copy number \(c\) ranging from 0 to
\(K\), \(i\) is an allele, \(j\) is a linked allele at a different locus
out of \(J\) total alleles linked to \(i\),
\(r_{ij}\) is the correlation coefficient between those alleles, \(t\) is a
taxon, \(post_{cjt}\) is the posterior probability of a given allele copy
number for a given allele in a given taxon, and \(prior_{cit}\) is the
prior probability for a given allele copy number for a given allele in a given
taxon based on linkage alone:
$$prior_{cit} = \frac{\prod_{j = 1}^J{post_{cjt} * r_{ij} + (1 - r_{ij})/(K + 1)}}{\sum_{c = 0}^K{\prod_{j = 1}^J{post_{cjt} * r_{ij} + (1 - r_{ij})/(K + 1)}}}$$
For mapping populations, AddGenotypePriorProb_LD
uses the above formula
when each allele only has two possible genotypes (i.e. test-cross segregation).
When more genotypes are possible, AddGenotypePriorProb_LD
instead estimates
prior probabilities as fitted values when the posterior probabilities for
a given allele are regressed on the posterior probabilities for a linked allele.
This allows loci with different segregation patterns to be informative for
predicting genotypes, and for cases where two alleles are in phase for some but not
all parental copies.
# NOT RUN {
# load example dataset
data(Msi01genes)
# Run non-LD pop structure pipeline
Msi01genes <- IteratePopStruct(Msi01genes, tol = 0.01, nPcsInit = 10)
# Add linkages
Msi01genes <- AddAlleleLinkages(Msi01genes, "popstruct", 1e4, 0.05)
# Get new posterior probabilities based on those linkages
Msi01genes <- AddGenotypePriorProb_LD(Msi01genes, "popstruct")
# Preview results
Msi01genes$priorProbLD[[1]][,1:10,1:10]
# }
Run the code above in your browser using DataLab