Calculate the Index of Association and Standardized Index of Association.
ia() calculates the index of association over all loci in the data set.
pair.ia() calculates the index of association in a pairwise manner among all loci.
resample.ia() calculates the index of association on a reduced data set multiple times to create a distribution, showing the variation of values observed at a given sample size (previously jack.ia).
ia(gid, sample = 0, method = 1, quiet = FALSE, missing = "ignore",
  plot = TRUE, hist = TRUE, index = "rbarD", valuereturn = FALSE)

pair.ia(gid, quiet = FALSE, plot = TRUE, low = "blue", high = "red",
  limits = NULL, index = "rbarD")

resample.ia(gid, n = NULL, reps = 999, quiet = FALSE, use_psex = FALSE,
  ...)

jack.ia(gid, n = NULL, reps = 999, quiet = FALSE)
sample: an integer indicating the number of permutations desired (e.g. 999).
method: an integer from 1 to 4 indicating the sampling method desired. See shufflepop for details.
quiet: Should the function print anything to the screen while it is performing calculations? TRUE prints nothing. FALSE (default) prints the population name and a progress bar.
missing: a character string. See missingno for details.
plot: When TRUE (default), a heatmap of the values per locus pair will be plotted (for pair.ia). For `ia()`, if sample > 0, a histogram will be produced for each population.
hist: logical. Deprecated. Use plot.
index: character. Either "Ia" or "rbarD". If hist = TRUE, this indicates which index you want represented in the plot (default: "rbarD").
valuereturn: logical. If TRUE, the index values from the reshuffled data are returned. If FALSE (default), the index is returned with associated p-values in a 4 element numeric vector.
low: (for pair.ia) a color to use for low values when plot = TRUE.
high: (for pair.ia) a color to use for high values when plot = TRUE.
limits: (for pair.ia) the limits to be used for the color scale. Defaults to NULL. If you want to use a custom range, supply two numbers between -1 and 1 (e.g. limits = c(-0.15, 1)).
n: an integer specifying the number of samples to be drawn. Defaults to NULL, which then uses the number of multilocus genotypes.
reps: an integer specifying the number of replicates to perform. Defaults to 999.
use_psex: a logical. If TRUE, the samples will be weighted by the value of psex. Defaults to FALSE.
...: arguments passed on to psex.
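The resampling idea behind the n and reps arguments can be illustrated in base R. This is only a sketch under stated assumptions: `resample_stat` and `count_mlg` are hypothetical helpers that are not part of poppr, sampling is assumed to be without replacement, and the statistic is a stand-in (the number of distinct multilocus genotypes) rather than the index of association itself.

```r
# Hypothetical helper (not part of poppr): draw `n` rows without replacement,
# `reps` times, and apply `stat` to each reduced data set.
resample_stat <- function(geno, n, reps = 999, stat) {
  stopifnot(n <= nrow(geno))
  replicate(reps, {
    keep <- sample(nrow(geno), n)  # assumed: sampling without replacement
    stat(geno[keep, , drop = FALSE])
  })
}

# Toy haploid data (made up): 6 individuals x 3 loci, with repeated genotypes.
geno <- rbind(
  c("a", "a", "b"),
  c("a", "a", "b"),
  c("a", "b", "b"),
  c("b", "b", "a"),
  c("b", "b", "a"),
  c("b", "a", "a")
)

# Stand-in statistic: the number of distinct multilocus genotypes.
count_mlg <- function(x) nrow(unique(x))

set.seed(1)
mlg_dist <- resample_stat(geno, n = 4, reps = 99, stat = count_mlg)
table(mlg_dist)  # distribution of the statistic at sample size 4
```

Passing the statistic as a function keeps the resampling loop separate from the calculation, so the same loop could be reused with any index computed on a reduced data set.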
pair.ia
A matrix with two columns and choose(nLoc(gid), 2) rows representing the values for Ia and rbarD per locus pair.
ia, with sampling (sample > 0): a 4 element numeric vector with the following values:
Ia - numeric. The index of association.
p.Ia - A number indicating the p-value resulting from a one-sided permutation test based on the number of samples indicated in the original call.
rbarD - numeric. The standardized index of association.
p.rD - A number indicating the p-value resulting from a one-sided permutation test based on the number of samples indicated in the original call.
ia, with sampling and valuereturn = TRUE: a list with the following elements:
index - The above vector.
samples - An s by 2 data frame, where s is the number of samples defined. The columns are for the values of Ia and rbarD, respectively.
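The one-sided permutation p-values reported for Ia and rbarD can be illustrated with a small base R sketch. It assumes one common convention, the proportion of permuted values at least as large as the observed value; `perm_pvalue` is a hypothetical helper, the numbers are made up, and whether ia() applies exactly this rule should be checked against the package source. Some implementations add one to both the numerator and the denominator so that the p-value is never exactly zero.

```r
# Hypothetical helper (not part of poppr): one-sided permutation p-value,
# i.e. the fraction of permuted values that are >= the observed value.
perm_pvalue <- function(observed, permuted) {
  permuted <- permuted[!is.na(permuted)]  # drop undefined replicates
  mean(permuted >= observed)
}

# Made-up values standing in for rbarD from reshuffled data sets.
permuted_rbarD <- c(0.01, -0.02, 0.03, 0.20, 0.00, NA, 0.05, -0.01, 0.02, 0.04)
observed_rbarD <- 0.04

p <- perm_pvalue(observed_rbarD, permuted_rbarD)
p  # 3 of the 9 non-missing permuted values are >= 0.04, so p = 1/3
```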
The index of association was originally developed by A.H.D. Brown while
analyzing the population structure of wild barley (Brown, 1980). It has been
widely used as a tool to detect clonal reproduction within populations.
Populations whose members are undergoing sexual reproduction, whether it be
selfing or out-crossing, will produce gametes via meiosis, and thus have a
chance to shuffle alleles in the next generation. Populations whose members
are undergoing clonal reproduction, however, generally do so via mitosis.
This means that the most likely mechanism for a change in genotype is via
mutation. The rate of mutation varies from species to species, but it is
rarely sufficiently high to approximate a random shuffling of alleles. The
index of association is a calculation based on the ratio of the variance of
the raw number of differences between individuals and the sum of those
variances over each locus. You can also think of it as the observed
variance over the expected variance. If they are the same, then the index
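is zero after subtracting one (from Maynard-Smith, 1993):

$$I_A = \frac{V_O}{V_E} - 1$$

The calculation for the distance between two individuals, A and B, at a single locus
with a allelic states and a ploidy of k is as follows (except
for Presence/Absence data), where $A_i$ and $B_i$ are the fractions of each individual's genotype made up by allele i:

$$d = \frac{k}{2}\sum_{i=1}^{a} \left| A_i - B_i \right|$$

The total distance between two individuals, D, is the sum of d over all m loci:

$$D = \sum_{j=1}^{m} d_j$$

These values are calculated over all possible combinations of individuals
in the data set, that is, over $\binom{n}{2}$ pairs for n individuals. The observed variance is then calculated from the values of D, indexed here by pair p:

$$V_O = \frac{\sum_{p=1}^{\binom{n}{2}} D_p^2 - \frac{\left(\sum_{p=1}^{\binom{n}{2}} D_p\right)^2}{\binom{n}{2}}}{\binom{n}{2}}$$

The expected variance is built from the variances of the individual loci. The calculation at a single locus, j, is the same as the previous equation, substituting values of d for D:

$$var_j = \frac{\sum_{p=1}^{\binom{n}{2}} d_{pj}^2 - \frac{\left(\sum_{p=1}^{\binom{n}{2}} d_{pj}\right)^2}{\binom{n}{2}}}{\binom{n}{2}}$$

The expected variance is then the sum of all the variances over all m loci:

$$V_E = \sum_{j=1}^{m} var_j$$

Agapow and Burt showed that $I_A$ increases steadily with the number of loci, so they proposed a standardized alternative, $\bar{r}_d$:

$$\bar{r}_d = \frac{V_O - V_E}{2\sum_{j=1}^{m}\sum_{l>j}^{m}\sqrt{var_j \cdot var_l}}$$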
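The quantities described in this section (per-locus distances, observed and expected variance, Ia and the standardized index) can be worked through in base R. The sketch below is not the poppr implementation: `to_tab`, `locus_dist`, `pop_var` and `index_of_association` are hypothetical helper names, the data are a made-up haploid toy set, and the standardization follows the form given by Agapow and Burt (2001) as understood here.

```r
# Hypothetical helpers for illustration only (not part of poppr).

# Convert a vector of alleles at one locus into an n x a table of
# allele fractions (for haploids each row contains a single 1).
to_tab <- function(x) outer(x, unique(x), "==") * 1

# Distance at one locus for every pair: d = (k / 2) * sum(|A_i - B_i|).
locus_dist <- function(tab, k = 1) {
  pairs <- utils::combn(nrow(tab), 2)
  apply(pairs, 2, function(p) (k / 2) * sum(abs(tab[p[1], ] - tab[p[2], ])))
}

# Variance that divides by the number of pairs (not pairs - 1).
pop_var <- function(x) (sum(x^2) - sum(x)^2 / length(x)) / length(x)

index_of_association <- function(loci, k = 1) {
  d     <- sapply(loci, locus_dist, k = k)  # choose(n, 2) x m matrix of d
  D     <- rowSums(d)                       # total distance per pair
  Vo    <- pop_var(D)                       # observed variance
  var_j <- apply(d, 2, pop_var)             # per-locus variances
  Ve    <- sum(var_j)                       # expected variance
  sq    <- sqrt(outer(var_j, var_j))
  denom <- 2 * sum(sq[upper.tri(sq)])       # sum over locus pairs l > j
  c(Ia = Vo / Ve - 1, rbarD = (Vo - Ve) / denom)
}

# Toy data: 4 haploid individuals, 3 loci in complete linkage.
geno <- rbind(c("a", "a", "a"),
              c("a", "a", "a"),
              c("b", "b", "b"),
              c("b", "b", "b"))
loci <- lapply(seq_len(ncol(geno)), function(j) to_tab(geno[, j]))

res <- index_of_association(loci)
res  # complete linkage: Ia = 2 (m - 1 for m = 3 loci), rbarD = 1
```

The toy result shows why the standardized index is easier to compare across data sets: with complete linkage Ia grows with the number of loci, while the standardized value stays at 1.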
Paul-Michael Agapow and Austin Burt. Indices of multilocus linkage disequilibrium. Molecular Ecology Notes, 1(1-2):101-102, 2001.
A.H.D. Brown, M.W. Feldman, and E. Nevo. Multilocus structure of natural populations of Hordeum spontaneum. Genetics, 96(2):523-536, 1980.
J.M. Smith, N.H. Smith, M. O'Rourke, and B.G. Spratt. How clonal are bacteria? Proceedings of the National Academy of Sciences, 90(10):4384-4388, 1993.
poppr, missingno, import2genind, read.genalex, clonecorrect, win.ia, samp.ia
# NOT RUN {
data(nancycats)
ia(nancycats)
# Pairwise over all loci:
data(partial_clone)
res <- pair.ia(partial_clone)
plot(res, low = "black", high = "green", index = "Ia")
# Resampling
data(Pinf)
resample.ia(Pinf, reps = 99)
# }
# NOT RUN {
# Plot the results of resampling rbarD.
library("ggplot2")
Pinf.resamp <- resample.ia(Pinf, reps = 999)
ggplot(Pinf.resamp[2], aes(x = rbarD)) +
  geom_histogram() +
  geom_vline(xintercept = ia(Pinf)[2]) +
  geom_vline(xintercept = ia(clonecorrect(Pinf))[2], linetype = 2) +
  xlab(expression(bar(r)[d]))
# Get the indices back and plot the distributions.
nansamp <- ia(nancycats, sample = 999, valuereturn = TRUE)
plot(nansamp, index = "Ia")
plot(nansamp, index = "rbarD")
# You can also adjust the parameters for how large to display the text
# so that it's easier to export it for publication/presentations.
library("ggplot2")
plot(nansamp, labsize = 5, linesize = 2) +
  theme_bw() +                                      # adding a theme
  theme(text = element_text(size = rel(5))) +       # changing text size
  theme(plot.title = element_text(size = rel(4))) + # changing title size
  ggtitle("Index of Association of nancycats")      # adding a new title
# Get the index for each population.
lapply(seppop(nancycats), ia)
# With sampling
lapply(seppop(nancycats), ia, sample = 999)
# Plot pairwise ia for all populations in a grid with cowplot
# Set up the library and data
library("cowplot")
data(monpop)
splitStrata(monpop) <- ~Tree/Year/Symptom
setPop(monpop) <- ~Tree
# Need to set up a list in which to store the plots.
plotlist <- vector(mode = "list", length = nPop(monpop))
names(plotlist) <- popNames(monpop)
# Loop through the populations, calculate pairwise ia, plot, and then
# capture the plot in the list
for (i in popNames(monpop)){
  x <- pair.ia(monpop[pop = i], limits = c(-0.15, 1)) # subset, calculate, and plot
  plotlist[[i]] <- ggplot2::last_plot()               # save the last plot
}
# Use the plot_grid function to plot.
plot_grid(plotlist = plotlist, labels = paste("Tree", popNames(monpop)))
# }