Poppr provides tools for population genetic analysis that include genotypic diversity measures, genetic distances with bootstrap support, native organization and handling of population hierarchies, and clone correction.
To cite poppr, please use citation("poppr"). When referring to poppr in your manuscript, please use lower case unless it occurs at the beginning of a sentence.
Below are descriptions and links to functions found in poppr. Be aware that all functions in adegenet are also available. The functions are documented as:
function_name()
(data type) - Description
Where ‘data type’ refers to the type of data that can be used:
m | a genclone or genind object |
s | a snpclone or genlight object |
x | a different data type (e.g. a matrix from mlg.table() ) |
getfile()
(x) - Provides a quick GUI to grab files for import
read.genalex()
(x) - Reads GenAlEx formatted csv files to a genind object
genind2genalex()
(m) - Converts genind objects to GenAlEx formatted csv files
genclone2genind()
(m) - Removes the @mlg slot from genclone objects
as.genambig()
(m) - Converts genind data to polysat's genambig data structure.
bootgen2genind()
(x) - See aboot() for details
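As a minimal sketch of the import functions listed above, the following reads a GenAlEx formatted file and then strips the @mlg slot. It assumes that the example file files/rootrot.csv is installed with poppr and that read.genalex() returns a genclone object under its default settings; the object names are illustrative only.

```r
library("poppr")

# Path to an example GenAlEx file (assumed to ship with poppr).
genalex_file <- system.file("files/rootrot.csv", package = "poppr")

# Import the data; with default settings this is assumed to give a genclone
# object, which is a genind object with an extra @mlg slot.
rootrot <- read.genalex(genalex_file)

# Drop the @mlg slot to get back a plain genind object.
rootrot_genind <- genclone2genind(rootrot)

class(rootrot)
class(rootrot_genind)
```

Use getfile() instead of system.file() to pick your own file interactively.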
Poppr defines two data structures: "genclone" (based on adegenet's genind) and "snpclone" (based on adegenet's genlight, for large SNP data sets). Both are defined by the presence of an extra MLG slot representing multilocus genotype assignments, which can be a numeric vector or an MLG class object.
genclone - Handles microsatellite, presence/absence, and small SNP data sets
snpclone - Designed to handle larger binary SNP data sets.
MLG - An internal class holding a data frame of multilocus genotype assignments that acts like a vector, allowing the user to easily switch between different MLG definitions.
bootgen - An internal class used explicitly for aboot() that inherits the gen-class virtual object. It is designed to allow for sampling loci with replacement.
bruvomat - An internal class designed to handle bootstrapping for Bruvo's distance where blocks of integer loci can be shuffled.
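To illustrate how the extra MLG slot is used, here is a short sketch that converts the bundled Aeut data set to a genclone object and queries its multilocus genotype assignments. The variable names are illustrative.

```r
library("poppr")

data(Aeut)

# Convert to genclone; this adds the @mlg slot with one assignment per sample.
aeut_gc <- as.genclone(Aeut)

# The current multilocus lineage assignments, one per individual.
head(mll(aeut_gc))

# The number of distinct lineages currently defined.
nmll(aeut_gc)
```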
as.genclone()
(m) - Converts genind objects to genclone objects
missingno()
(m) - Handles missing data
clonecorrect()
(m | s) - Clone-censors at a specified population hierarchy
informloci()
(m) - Detects and removes phylogenetically uninformative loci
popsub()
(m | s) - Subsets genind objects by population
shufflepop()
(m) - Shuffles genotypes at each locus using four different shuffling algorithms
recode_polyploids()
(m | x) - Recodes polyploid data sets with missing alleles imported as "0"
make_haplotypes()
(m | s) - Splits data into pseudo-haplotypes. This is mainly used in AMOVA.
test_replen()
(m) - Tests for inconsistent repeat lengths in microsatellite data. For use in bruvo.dist() functions.
fix_replen()
(m) - Fixes inconsistent repeat lengths. For use in bruvo.dist() functions.
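The following sketch shows two of the manipulation functions above, clonecorrect() and popsub(), on the Aeut data set. It assumes that the population hierarchy for Aeut is stored in other(Aeut)$population_hierarchy with the levels Pop and Subpop, as in the package examples.

```r
library("poppr")

data(Aeut)
aeut_gc <- as.genclone(Aeut)

# Define the strata (assumed layout: first column is a combined label,
# remaining columns are Pop and Subpop).
strata(aeut_gc) <- other(aeut_gc)$population_hierarchy[-1]

# Clone-censor within each subpopulation nested in each population.
aeut_cc <- clonecorrect(aeut_gc, strata = ~Pop/Subpop)

# Keep only the first population.
first_pop <- popsub(aeut_gc, sublist = popNames(aeut_gc)[1])

c(original = nInd(aeut_gc), corrected = nInd(aeut_cc))
```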
bruvo.dist()
(m) - Bruvo's distance (see also: fix_replen())
diss.dist()
(m) - Absolute genetic distance (see prevosti.dist())
nei.dist()
(m | x) - Nei's 1978 genetic distance
rogers.dist()
(m | x) - Rogers' euclidean distance
reynolds.dist()
(m | x) - Reynolds' coancestry distance
edwards.dist()
(m | x) - Edwards' angular distance
prevosti.dist()
(m | x) - Prevosti's absolute genetic distance
bitwise.dist()
(s) - Calculates fast pairwise distances for genlight objects.
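A brief sketch of the distance functions above, using the nancycats data set from adegenet. The repeat lengths passed to bruvo.dist() are an assumption made for illustration (all nine loci treated as dinucleotide repeats) and are probably not accurate for these data.

```r
library("poppr")

data(nancycats)

# Work on a single population to keep the distance matrices small.
nan9 <- popsub(nancycats, 9)

# Absolute genetic distance: counts of allelic differences, or a proportion.
d_count <- diss.dist(nan9)
d_prop  <- diss.dist(nan9, percent = TRUE)

# Bruvo's distance needs one repeat length per locus (assumed here to be 2).
d_bruvo <- bruvo.dist(nan9, replen = rep(2, nLoc(nan9)))
```

Each call returns a "dist" object with one entry per pair of individuals, so the results can be passed to functions such as hclust() or poppr.msn().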
aboot()
(m | s | x) - Creates a bootstrapped dendrogram for any distance measure
bruvo.boot()
(m) - Produces dendrograms with bootstrap support based on Bruvo's distance
diversity_boot()
(x) - Generates bootstrap distributions of diversity statistics for multilocus genotypes
diversity_ci()
(m | s | x) - Generates confidence intervals for multilocus genotype diversity.
resample.ia()
(m) - Calculates the index of association over subsets of data.
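As a sketch of bootstrap support for a dendrogram, the following calls aboot() with Prevosti's distance on one nancycats population. The number of bootstrap replicates is kept deliberately small so the example runs quickly; it is not a recommended setting.

```r
library("poppr")

data(nancycats)
nan9 <- popsub(nancycats, 9)

set.seed(999)  # bootstrapping resamples loci at random

# Build a tree with bootstrap support; 20 replicates is for speed only.
nan9_tree <- aboot(nan9, distance = prevosti.dist, sample = 20,
                   showtree = FALSE, quiet = TRUE)

# Bootstrap support values are stored as node labels on the tree.
nan9_tree$node.label
```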
mlg()
(m | s) - Calculates the number of multilocus genotypes
mll()
(m | s) - Displays the current multilocus lineages (genotypes) defined.
nmll()
(m | s) - Same as mlg().
mlg.crosspop()
(m | s) - Finds all multilocus genotypes that cross populations
mlg.table()
(m | s) - Returns a table of populations by multilocus genotypes
mlg.vector()
(m | s) - Returns a vector of a numeric multilocus genotype assignment for each individual
mlg.id()
(m | s) - Finds all individuals associated with a single multilocus genotype
mlg.filter()
(m | s) - Collapses MLGs by genetic distance
filter_stats()
(m | s) - Calculates mlg.filter for all algorithms and plots
cutoff_predictor()
(x) - Predicts cutoff threshold from mlg.filter.
mll.custom()
(m | s) - Allows for the custom definition of multilocus lineages
mll.levels()
(m | s) - Allows the user to change levels of custom MLLs.
mll.reset()
(m | s) - Reset multilocus lineages.
diversity_stats()
(x) - Creates a table of diversity indices for multilocus genotypes.
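The multilocus genotype functions above can be combined as in the following sketch. The 5% threshold used for mlg.filter() is an arbitrary value chosen for illustration, and passing percent = TRUE through to diss.dist() is assumed to work via the extra arguments of mlg.filter().

```r
library("poppr")

data(Aeut)
aeut_gc <- as.genclone(Aeut)

# Table of populations (rows) by multilocus genotypes (columns).
mlg_tab <- mlg.table(aeut_gc, plot = FALSE)

# Genotypes found in more than one population.
shared <- mlg.crosspop(aeut_gc, quiet = TRUE)

# Collapse genotypes closer than 5% dissimilarity (illustrative threshold).
collapsed <- mlg.filter(aeut_gc, threshold = 0.05,
                        distance = "diss.dist", percent = TRUE)

c(original = mlg(aeut_gc, quiet = TRUE), collapsed = length(unique(collapsed)))
```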
Analysis of multilocus linkage disequilibrium.
ia()
(m) - Calculates the index of association
pair.ia()
(m) - Calculates the index of association for all loci pairs.
win.ia()
(s) - Index of association windows for genlight objects.
samp.ia()
(s) - Index of association on random subsets of loci for genlight objects.
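A minimal sketch of the index of association on one nancycats population. No permutations are requested here (sample = 0), so only the observed statistics are returned and no p-values are computed.

```r
library("poppr")

data(nancycats)
nan9 <- popsub(nancycats, 9)

# Observed index of association (Ia) and its standardized form (rbarD).
ia_obs <- ia(nan9, sample = 0, plot = FALSE, quiet = TRUE)
ia_obs
```

Setting sample to a positive number would add a permutation test at the cost of more computation.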
poppr.amova()
(m | s) - Analysis of Molecular Variance (as implemented in ade4)
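The following sketch runs an AMOVA on the Aeut data set with subpopulations nested in populations. It assumes the hierarchy is stored in other(Aeut)$population_hierarchy, as in the package examples, and that the result follows the ade4 "amova" structure.

```r
library("poppr")

data(Aeut)

# Define the strata (assumed columns after the first: Pop and Subpop).
strata(Aeut) <- other(Aeut)$population_hierarchy[-1]
aeut_gc <- as.genclone(Aeut)

# Partition variance between populations and between subpopulations.
aeut_amova <- poppr.amova(aeut_gc, ~Pop/Subpop, quiet = TRUE)

# Variance components and their percentages.
aeut_amova$componentsofcovariance
```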
poppr()
(m | x) - Returns a diversity table by population
poppr.all()
(m | x) - Returns a diversity table by population for all compatible files specified
private_alleles()
(m) - Tabulates the occurrences of alleles that only occur in one population.
locus_table()
(m) - Creates a table of summary statistics per locus.
rrmlg()
(m | x) - Round-robin multilocus genotype estimates.
rraf()
(m) - Round-robin allele frequency estimates.
pgen()
(m) - Probability of genotypes.
psex()
(m) - Probability of observing a genotype more than once.
rare_allele_correction
(m) - Rules for correcting rare alleles for round-robin estimates.
incomp()
(m) - Check data for incomparable samples.
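As a sketch of the population summary functions above, poppr() can be called directly on a data set to get one row of diversity statistics per population. The columns selected below are assumed to be present in the returned table.

```r
library("poppr")

data(Aeut)

# One row per population plus a pooled "Total" row (the default).
div_tab <- poppr(Aeut, quiet = TRUE)

# Sample size, genotype count, Shannon-Wiener index, and Simpson's index.
div_tab[, c("Pop", "N", "MLG", "H", "lambda")]
```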
imsn()
(m | s) - Interactive construction and visualization of minimum spanning networks
plot_poppr_msn()
(m | s | x) - Plots minimum spanning networks produced in poppr with scale bar and legend
greycurve()
(x) - Helper to determine the appropriate parameters for adjusting the grey level for msn functions
bruvo.msn()
(m) - Produces minimum spanning networks based off Bruvo's distance colored by population
poppr.msn()
(m | s | x) - Produces a minimum spanning network for any pairwise distance matrix related to the data
info_table()
(m) - Creates a heatmap representing missing data or observed ploidy
genotype_curve()
(m | x) - Creates a series of boxplots to demonstrate how many markers are needed to represent the diversity of your data.
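A short sketch of building a minimum spanning network from a distance matrix. Plotting is switched off here so that only the network object is created; the element names of the returned list are assumed to be graph, populations, and colors.

```r
library("poppr")

data(Aeut)

# Pairwise absolute genetic distance between all samples.
aeut_dist <- diss.dist(Aeut)

# Build the network without drawing it.
aeut_msn <- poppr.msn(Aeut, aeut_dist, showplot = FALSE)

names(aeut_msn)
```

The resulting object can then be passed, together with the original data, to plot_poppr_msn() to draw the network with a scale bar and legend.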
Aeut()
- (AFLP) Oomycete root rot pathogen Aphanomyces euteiches (Grünwald and Hoheisel, 2006)
monpop()
- (SSR) Peach brown rot pathogen Monilinia fructicola (Everhart and Scherm, 2015)
partial_clone()
- (SSR) partially-clonal data simulated via simuPOP (Peng and Amos, 2008)
Pinf()
- (SSR) Potato late blight pathogen Phytophthora infestans (Goss et al., 2014)
Pram()
- (SSR) Sudden Oak Death pathogen Phytophthora ramorum (Kamvar et al., 2015; Goss et al., 2009)
Zhian N. Kamvar, Jonah C. Brooks, Sydney E. Everhart, Javier F. Tabima, Stacy Krueger-Hadfield, Erik Sotka, Niklaus J. Grünwald
Maintainer: Zhian N. Kamvar
This package relies on the adegenet package. It is built around the genind and genlight objects. Genind objects store genetic information in a table of allele frequencies, while genlight objects store SNP data efficiently by packing binary allele calls into single bits. Poppr has extended these objects into new objects called genclone and snpclone, respectively. These objects are designed for analysis of clonal organisms, as they add the @mlg slot for keeping track of multilocus genotypes and multilocus lineages.
Documentation is available for any function by typing ?function_name in the R console. Detailed topic explanations live in the package vignettes:
Vignette | command |
Data import and manipulation | vignette("poppr_manual", "poppr") |
Algorithms and Equations | vignette("algo", "poppr") |
Multilocus Genotype Analysis | vignette("mlg", "poppr") |
Essential functions for importing and manipulating data are detailed within the Data import and manipulation vignette, details on algorithms used in poppr are within the Algorithms and equations vignette, and details for working with multilocus genotypes are in Multilocus Genotype Analysis.
Examples of analyses are available in a primer written by Niklaus J. Grünwald, Zhian N. Kamvar, and Sydney E. Everhart at https://grunwaldlab.github.io/Population_Genetics_in_R/.
If you have a specific question or issue with poppr, feel free to contribute to the Google group at https://groups.google.com/d/forum/poppr. If you find a bug and are a GitHub user, you can submit bug reports at https://github.com/grunwaldlab/poppr/issues. Otherwise, leave a message on the Google group. Personal emails are highly discouraged as they do not allow others to learn.
--------- Papers announcing poppr ---------
Kamvar ZN, Tabima JF, Grünwald NJ. (2014) Poppr: an R package for genetic analysis of populations with clonal, partially clonal, and/or sexual reproduction. PeerJ 2:e281. doi: 10.7717/peerj.281
Kamvar ZN, Brooks JC and Grünwald NJ (2015) Novel R tools for analysis of genome-wide population genetic data with emphasis on clonality. Front. Genet. 6:208. doi: 10.3389/fgene.2015.00208
--------- Papers referencing data sets ---------
Grünwald, NJ and Hoheisel, G.A. 2006. Hierarchical Analysis of Diversity, Selfing, and Genetic Differentiation in Populations of the Oomycete Aphanomyces euteiches. Phytopathology 96:1134-1141. doi: 10.1094/PHYTO-96-1134
SE Everhart, H Scherm, (2015) Fine-scale genetic structure of Monilinia fructicola during brown rot epidemics within individual peach tree canopies. Phytopathology 105:542-549. doi: 10.1094/PHYTO-03-14-0088-R
Bo Peng and Christopher Amos (2008) Forward-time simulations of nonrandom mating populations using simuPOP. Bioinformatics, 24 (11): 1408-1409.
Goss, Erica M., Javier F. Tabima, David EL Cooke, Silvia Restrepo, William E. Fry, Gregory A. Forbes, Valerie J. Fieland, Martha Cardenas, and Niklaus J. Grünwald. (2014) "The Irish potato famine pathogen Phytophthora infestans originated in central Mexico rather than the Andes." Proceedings of the National Academy of Sciences 111:8791-8796. doi: 10.1073/pnas.1401884111
Kamvar, Z. N., Larsen, M. M., Kanaskie, A. M., Hansen, E. M., & Grünwald, N. J. (2015). Spatial and temporal analysis of populations of the sudden oak death pathogen in Oregon forests. Phytopathology 105:982-989. doi: 10.1094/PHYTO-12-14-0350-FI
Goss, E. M., Larsen, M., Chastagner, G. A., Givens, D. R., and Grünwald, N. J. 2009. Population genetic analysis infers migration pathways of Phytophthora ramorum in US nurseries. PLoS Pathog. 5:e1000583. doi: 10.1371/journal.ppat.1000583