updog-package: `updog` Flexible Genotyping for Polyploids

Description

Implements empirical Bayes approaches to genotype polyploids from next-generation sequencing data while accounting for allelic bias, overdispersion, and sequencing error. The main function is flexdog, which allows the specification of many different genotype distributions. An experimental function that takes into account varying levels of relatedness is implemented in mupdog. Also provided are functions to simulate genotypes (rgeno) and read-counts (rflexdog), as well as functions to calculate oracle genotyping error rates (oracle_mis) and correlation with the true genotypes (oracle_cor). These latter two functions are useful for read depth calculations. Run browseVignettes(package = "updog") in R for example usage. The methods are described in detail in Gerard et al. (2018) and Gerard and Ferrão (2019).
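As a rough illustration of the read depth calculations mentioned above, the sketch below evaluates oracle_mis and oracle_cor over a few candidate depths. The ploidy, genotype distribution, sequencing error rate, bias, and overdispersion values are illustrative assumptions, not recommendations from the package authors.

```r
library(updog)

# Assumed scenario: a hexaploid population in Hardy-Weinberg equilibrium
# with a reference allele frequency of 0.75 (illustrative values only).
ploidy <- 6
dist   <- stats::dbinom(0:ploidy, size = ploidy, prob = 0.75)

# Candidate read depths to compare.
depths <- c(20, 50, 100, 200)

# Oracle misclassification error rate at each depth.
mis <- sapply(depths, function(n) {
  oracle_mis(n = n, ploidy = ploidy, seq = 0.001,
             bias = 0.7, od = 0.01, dist = dist)
})

# Oracle correlation between the true and estimated genotypes at each depth.
cors <- sapply(depths, function(n) {
  oracle_cor(n = n, ploidy = ploidy, seq = 0.001,
             bias = 0.7, od = 0.01, dist = dist)
})

depth_table <- data.frame(depth = depths, oracle_mis = mis, oracle_cor = cors)
print(depth_table)
```

One way to read the table is to pick the smallest depth whose oracle error rate is acceptable for the study at hand. Because the oracle assumes the model parameters are known, these numbers are best treated as optimistic bounds.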

`updog` Functions

flexdog: The main function that fits an empirical Bayes approach to genotype polyploids from next generation sequencing data.
multidog: A convenience function for running flexdog over many SNPs. This function provides support for parallel computing.
mupdog: An experimental approach to genotype autopolyploids that accounts for varying levels of relatedness between the individuals in the sample.
rgeno: Simulate the genotypes of a sample from one of the models allowed in flexdog.
rflexdog: Simulate read-counts from the flexdog model.
plot.flexdog: Plot the output of flexdog.
plot.mupdog: Plot the output of mupdog.
oracle_joint: The joint distribution of the true genotype and an oracle estimator.
oracle_plot: Visualize the output of oracle_joint.
oracle_mis: The oracle misclassification error rate (Bayes rate).
oracle_cor: Correlation between the true genotype and the oracle estimated genotype.
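The functions listed above can be combined into a small simulate-then-fit workflow. The following is a minimal sketch, assuming a Hardy-Weinberg population and made-up values for the sample size, read depth, bias, and overdispersion. It is meant to show how the pieces fit together, not to stand in for the package vignettes.

```r
library(updog)
set.seed(1)

# Assumed simulation settings (illustrative only).
n      <- 100   # number of individuals
ploidy <- 6     # hexaploid

# 1. Simulate true genotypes under Hardy-Weinberg equilibrium.
genovec <- rgeno(n = n, ploidy = ploidy, model = "hw", allele_freq = 0.75)

# 2. Simulate reference read-counts given a fixed total depth of 100.
sizevec <- rep(100, n)
refvec  <- rflexdog(sizevec = sizevec, geno = genovec, ploidy = ploidy,
                    seq = 0.001, bias = 0.7, od = 0.01)

# 3. Genotype the simulated individuals with the same prior family.
fit <- flexdog(refvec = refvec, sizevec = sizevec, ploidy = ploidy,
               model = "hw", verbose = FALSE)

# 4. Compare estimated genotypes to the simulated truth.
observed_mis <- mean(fit$geno != genovec)
print(observed_mis)
```

Calling plot(fit) afterwards dispatches to plot.flexdog and displays the fit. With real data the true genotypes are unknown, so the comparison in step 4 is only possible in simulations like this one.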

`updog` Datasets

snpdat: A small example dataset for using flexdog.
uitdewilligen: A small example dataset for using mupdog.
mupout: The output from fitting mupdog to uitdewilligen.
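As a brief sketch of how the snpdat example data might be passed to flexdog, the code below fits one SNP under an S1 prior. It assumes that snpdat has columns named snp, counts, and size, that the samples are hexaploid, and that the first individual of each SNP is the parent. Check ?snpdat to confirm these assumptions before relying on them.

```r
library(updog)
data("snpdat")

# Assumption: the samples are hexaploid.
ploidy <- 6

# Pull the reference counts and total counts for a single SNP.
refvec  <- snpdat$counts[snpdat$snp == "SNP2"]
sizevec <- snpdat$size[snpdat$snp == "SNP2"]

# Assumption: the first individual is the parent of the S1 population,
# so its counts are supplied separately through p1ref and p1size.
fout <- flexdog(refvec  = refvec[-1],
                sizevec = sizevec[-1],
                ploidy  = ploidy,
                model   = "s1",
                p1ref   = refvec[1],
                p1size  = sizevec[1],
                verbose = FALSE)

# Estimated genotypes and the posterior probability of each estimate.
head(data.frame(geno = fout$geno, maxpostprob = fout$maxpostprob))
```

The maxpostprob column gives a per-individual measure of confidence in the genotype call, which can be used to filter out poorly supported calls.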

Details

The package is named updog for "Using Parental Data for Offspring Genotyping" because we originally developed the method for full-sib populations, but it now works for more general populations.

Our best competitor is probably the fitPoly package, which you can check out at https://cran.r-project.org/package=fitPoly. However, we think that updog returns better-calibrated measures of uncertainty when you have next-generation sequencing data.

If you find a bug or want an enhancement, please submit an issue at http://github.com/dcgerard/updog/issues.

References

Gerard, D., Ferrão, L. F. V., Garcia, A. A. F., & Stephens, M. (2018). Genotyping Polyploids from Messy Sequencing Data. Genetics, 210(3), 789-807. doi: 10.1534/genetics.118.301468.
Gerard, D., & Ferrão, L. F. V. (2019). Priors for Genotyping Polyploids. Bioinformatics. doi: 10.1093/bioinformatics/btz852.