
bnpsd

The bnpsd ("Balding-Nichols Pritchard-Stephens-Donnelly") R package simulates admixed populations. More specifically, bnpsd facilitates the construction of admixed population structures and the simulation of allele frequencies and genotypes from the BN-PSD admixture model. The model combines the Balding-Nichols (BN) allele frequency model for the intermediate subpopulations with the Pritchard-Stephens-Donnelly (PSD) model of individual-specific admixture proportions. It can produce complex population structures, which makes it well suited to illustrating challenges in kinship coefficient and FST estimation. Note that simulated loci are drawn independently (in linkage equilibrium).
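
For reference, the generative model described above can be written compactly for the case of independent intermediate subpopulations. The notation is an assumption made here for illustration and does not come from this README: $p_i^T$ is the ancestral allele frequency at locus $i$, $f_u$ is the FST of intermediate subpopulation $S_u$, $q_{ju}$ is the admixture proportion of individual $j$ from $S_u$, $\pi_{ij}$ is the individual-specific allele frequency, and $x_{ij}$ is the genotype.

```latex
\begin{align*}
p_i^{S_u} \mid p_i^T &\sim \text{Beta}\!\left( p_i^T \frac{1 - f_u}{f_u},\ (1 - p_i^T) \frac{1 - f_u}{f_u} \right) && \text{(BN)} \\
\pi_{ij} &= \sum_{u=1}^{k} q_{ju} \, p_i^{S_u}, \qquad \sum_{u=1}^{k} q_{ju} = 1 && \text{(PSD)} \\
x_{ij} \mid \pi_{ij} &\sim \text{Binomial}(2, \pi_{ij})
\end{align*}
```

With this parametrization the Beta draw has mean $p_i^T$ and variance $p_i^T (1 - p_i^T) f_u$, so a larger $f_u$ means a more differentiated subpopulation.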

Installation

The stable version of the package is on CRAN and can be installed using:

install.packages("bnpsd")

The current development version can be installed from the GitHub repository using devtools:

install.packages("devtools") # if needed
library(devtools)
install_github('StoreyLab/bnpsd', build_opts = c())

You can see the package vignette, which has more detailed documentation, by typing this into your R session:

vignette('bnpsd')

Example

This is a quick overview of the main bnpsd functions.

Define the population structure (in this case, for a 1D admixture scenario).

library(bnpsd)
# dimensions of data/model
# number of loci
m_loci <- 10
# number of individuals
n_ind <- 5
# number of intermediate subpops
k_subpops <- 2

# define population structure
# FST values for k=2 subpopulations
inbr_subpops <- c(0.1, 0.3)
# admixture proportions from 1D geography
admix_proportions <- admix_prop_1d_linear(n_ind, k_subpops, sigma = 1)
# also available:
# - admix_prop_1d_circular
# - admix_prop_indep_subpops

# get pop structure parameters of the admixed individuals
# the coancestry matrix
coancestry <- coanc_admix(admix_proportions, inbr_subpops)
# FST of admixed individuals
Fst <- fst_admix(admix_proportions, inbr_subpops)
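
To make the quantities above concrete, here is a minimal base-R sketch that does not call bnpsd. It assumes the usual admixture-model formula for independent subpopulations (coancestry = Q diag(F) Q^T) and uniform weights for FST. The placement of individuals and subpopulations on the line is a guess made for illustration and may differ from what `admix_prop_1d_linear` actually does; all `*_sketch` names are hypothetical.

```r
# Base-R sketch (does not call bnpsd) of the population structure parameters
n_ind <- 5
k_subpops <- 2
inbr_subpops <- c(0.1, 0.3)
sigma <- 1

# ASSUMPTION: subpopulations sit at positions 1..k and individuals are evenly
# spaced between 0.5 and k + 0.5; the package may place them differently
pos_subpops <- 1:k_subpops
pos_ind <- seq(0.5, k_subpops + 0.5, length.out = n_ind)

# Gaussian weight of each subpopulation at each individual's position
admix_sketch <- outer(
  pos_ind, pos_subpops,
  function(x, mu) dnorm(x, mean = mu, sd = sigma)
)
# normalize rows so that each individual's proportions sum to one
admix_sketch <- admix_sketch / rowSums(admix_sketch)

# coancestry of admixed individuals for independent subpopulations: Q diag(F) Q^T
coanc_sketch <- admix_sketch %*% diag(inbr_subpops) %*% t(admix_sketch)

# FST as the uniformly weighted mean of the individual inbreeding
# coefficients, which are the diagonal of the coancestry matrix
fst_sketch <- mean(diag(coanc_sketch))
```

The result is an n_ind x n_ind symmetric matrix, and because every individual is a mixture of both subpopulations here, `fst_sketch` falls below the largest subpopulation FST.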

Draw random allele frequencies and genotypes from this population structure.

# draw all random allele freqs and genotypes
out <- draw_all_admix(admix_proportions, inbr_subpops, m_loci)
# genotypes
X <- out$X
# ancestral allele frequencies (AFs)
p_anc <- out$p_anc

# OR... draw each vector or matrix separately
# provided for additional flexibility
# ancestral AFs
p_anc <- draw_p_anc(m_loci)
# independent subpops (intermediate) AFs
p_subpops <- draw_p_subpops(p_anc, inbr_subpops)
# individual-specific AFs
p_ind <- make_p_ind_admix(p_subpops, admix_proportions)
# genotypes
X <- draw_genotypes_admix(p_ind)
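
The four steps above map onto simple random draws. The following base-R sketch mirrors them without calling bnpsd; it is an illustration of the BN-PSD model under stated assumptions, not the package's implementation. The admixture matrix and the Uniform(0.01, 0.5) range for ancestral frequencies are chosen here for illustration, and all `*_sketch` names are hypothetical.

```r
# Base-R sketch (does not call bnpsd) of the step-by-step simulation
set.seed(1)
m_loci <- 10
inbr_subpops <- c(0.1, 0.3)

# hypothetical admixture proportions (rows: individuals, columns: subpops)
admix_sketch <- rbind(
  c(1.00, 0.00),
  c(0.75, 0.25),
  c(0.50, 0.50),
  c(0.25, 0.75),
  c(0.00, 1.00)
)
n_ind <- nrow(admix_sketch)

# ancestral AFs; ASSUMPTION: range picked for illustration only
p_anc_sketch <- runif(m_loci, min = 0.01, max = 0.5)

# BN step: one Beta draw per locus and subpopulation, with mean p_anc
# and variance p_anc * (1 - p_anc) * f
p_subpops_sketch <- sapply(inbr_subpops, function(f) {
  nu <- (1 - f) / f
  rbeta(m_loci, p_anc_sketch * nu, (1 - p_anc_sketch) * nu)
})

# PSD step: individual-specific AFs are admixture-weighted averages
# of the subpopulation AFs (m_loci x n_ind matrix)
p_ind_sketch <- p_subpops_sketch %*% t(admix_sketch)

# genotypes: Binomial(2, p) allele counts, one per locus and individual
X_sketch <- matrix(
  rbinom(length(p_ind_sketch), size = 2, prob = p_ind_sketch),
  nrow = m_loci,
  ncol = n_ind
)
```

Each column of `X_sketch` is one individual and each row one locus, with entries in 0, 1, 2. Loci are drawn independently, matching the linkage-equilibrium note in the introduction.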

Examples with a tree for intermediate subpopulations

This tree allows for correlated subpopulations (previous examples had independent subpopulations).

# best to start by specifying tree in Newick string format
tree_str <- '(S1:0.1,(S2:0.1,S3:0.1)N1:0.1)T;'
# and turn it into `phylo` object using the `ape` package
library(ape)
tree_subpops <- read.tree( text = tree_str )
# true coancestry matrix corresponding to this tree
coanc_subpops <- coanc_tree( tree_subpops )

# admixture proportions from 1D geography
# (constructed again but for k=3 tree)
k_subpops <- nrow( coanc_subpops )
admix_proportions <- admix_prop_1d_linear( n_ind, k_subpops, sigma = 0.5 )

# get pop structure parameters of the admixed individuals
# the coancestry matrix
coancestry <- coanc_admix( admix_proportions, coanc_subpops )
# FST of admixed individuals
Fst <- fst_admix( admix_proportions, coanc_subpops )

# draw all random allele freqs and genotypes, tree version
out <- draw_all_admix( admix_proportions, tree_subpops = tree_subpops, m_loci = m_loci )
# genotypes
X <- out$X
# ancestral allele frequencies (AFs)
p_anc <- out$p_anc

# OR... draw tree subpops (intermediate) AFs separately
p_subpops_tree <- draw_p_subpops_tree( p_anc, tree_subpops )
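
As a rough check on what a tree implies, the coancestry of the three tips can be worked out by hand in base R. This sketch assumes each edge value is an inbreeding coefficient relative to the parent node, so that total inbreeding relative to the root T combines along a path as 1 - prod(1 - edge), and that the coancestry of two tips equals the total inbreeding of their most recent common ancestor. Those assumptions should be checked against the `coanc_tree` and `tree_additive` documentation; the `*_sketch` names and the helper `total_inbr` are hypothetical.

```r
# Base-R sketch (does not call bnpsd or ape) for the tree
# '(S1:0.1,(S2:0.1,S3:0.1)N1:0.1)T;'
edge_S1 <- 0.1
edge_N1 <- 0.1
edge_S2 <- 0.1
edge_S3 <- 0.1

# ASSUMPTION: edges are inbreeding values relative to the parent node,
# so total inbreeding relative to the root is 1 - prod(1 - edges on path)
total_inbr <- function(edges) 1 - prod(1 - edges)

inbr_S1 <- total_inbr(edge_S1)
inbr_N1 <- total_inbr(edge_N1)
inbr_S2 <- total_inbr(c(edge_N1, edge_S2))
inbr_S3 <- total_inbr(c(edge_N1, edge_S3))

# coancestry matrix of the tips: the diagonal holds total inbreeding,
# off-diagonals hold the inbreeding of the most recent common ancestor
tips <- c("S1", "S2", "S3")
coanc_tree_sketch <- matrix(0, nrow = 3, ncol = 3, dimnames = list(tips, tips))
diag(coanc_tree_sketch) <- c(inbr_S1, inbr_S2, inbr_S3)
# S2 and S3 share the internal node N1
coanc_tree_sketch["S2", "S3"] <- inbr_N1
coanc_tree_sketch["S3", "S2"] <- inbr_N1
# S1 shares only the root T with the others, so those entries stay zero
```

Under these assumptions S2 and S3 are correlated through N1 while S1 is independent of both, which is the sense in which a tree allows correlated subpopulations.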

Citations

Alejandro Ochoa, John D Storey. 2021. "Estimating FST and kinship for arbitrary population structures." PLoS Genet 17(1): e1009241. PubMed ID 33465078. doi:10.1371/journal.pgen.1009241. bioRxiv doi:10.1101/083923 2016-10-27.

Alejandro Ochoa, John D Storey. 2016. "FST And Kinship for Arbitrary Population Structures I: Generalized Definitions." bioRxiv doi:10.1101/083915.

Package details

Version: 1.3.13
License: GPL (>= 3)
Last published: August 25th, 2021
Monthly downloads: 362

Functions in bnpsd (1.3.13)

admix_prop_1d_circular: Construct admixture proportion matrix for circular 1D geography
draw_p_subpops_tree: Draw allele frequencies for subpopulations related by a tree
coanc_tree: Calculate coancestry matrix corresponding to a tree
draw_p_subpops: Draw allele frequencies for independent subpopulations
coanc_admix: Construct the coancestry matrix of an admixture model
coanc_to_kinship: Transform coancestry matrix to kinship matrix
admix_prop_indep_subpops: Construct admixture proportion matrix for independent subpopulations
scale_tree: Scale a coancestry tree
bnpsd: A package for modeling and simulating an admixed population
fit_tree: Fit a tree structure to a coancestry matrix
fixed_loci: Identify fixed loci
undiff_af: Undifferentiate an allele distribution
tree_reindex_tips: Reindex tree tips in order of appearance in edges
tree_reorder: Reorder tree tips to best match a desired order
tree_additive: Calculate additive edges for a coancestry tree, or vice versa
draw_genotypes_admix: Draw genotypes from the admixture model
make_p_ind_admix: Construct individual-specific allele frequency matrix under the PSD admixture model
fst_admix: Calculate FST for the admixed individuals
draw_p_anc: Draw random Uniform or Beta ancestral allele frequencies
admix_prop_1d_linear: Construct admixture proportion matrix for 1D geography
draw_all_admix: Simulate random allele frequencies and genotypes from the BN-PSD admixture model