bnpsd (version 1.1.1)

draw_all_admix: Simulate random allele frequencies and genotypes from the BN-PSD admixture model

Description

This function returns simulated ancestral, intermediate, and individual-specific allele frequencies and genotypes given the admixture structure, as determined by the admixture proportions and the vector of intermediate subpopulation \(F_{ST}\) values. The function is a wrapper around draw_p_anc, draw_p_subpops, make_p_ind_admix, and draw_genotypes_admix with additional features such as requiring polymorphic loci. Importantly, by default fixed loci are re-drawn from the start (starting from the ancestral allele frequencies) so no fixed loci are in the output and no biases are introduced by re-drawing genotypes conditional on any of the previous allele frequencies (ancestral, intermediate, or individual-specific). Below \(m\) is the number of loci, \(n\) is the number of individuals, and \(k\) is the number of intermediate subpopulations.
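The four stages described above can be sketched in base R without the package. This is only an illustration of the BN-PSD model, not the package's implementation: the uniform range for the ancestral allele frequencies, the hand-built `admix_proportions` matrix, and the seed are all assumptions made for the example.

```r
set.seed(1)

m_loci <- 10    # number of loci (m)
n_ind <- 5      # number of individuals (n)
k_subpops <- 2  # number of intermediate subpopulations (k)

# hypothetical inputs, for illustration only
inbr_subpops <- c(0.1, 0.3)
# n x k admixture proportions; each row sums to 1
admix_proportions <- cbind(
  seq(1, 0, length.out = n_ind),
  seq(0, 1, length.out = n_ind)
)

# stage 1: ancestral allele frequencies (uniform range is an assumption)
p_anc <- runif(m_loci, min = 0.01, max = 0.5)

# stage 2: intermediate subpopulation allele frequencies,
# one Balding-Nichols (Beta) draw per locus and subpopulation
p_subpops <- sapply(inbr_subpops, function(f) {
  nu <- 1 / f - 1
  rbeta(m_loci, nu * p_anc, nu * (1 - p_anc))
})

# stage 3: individual-specific allele frequencies (m x n),
# a weighted average of the subpopulation frequencies
p_ind <- p_subpops %*% t(admix_proportions)

# stage 4: genotypes, Binomial(2, p_ind) independently per entry
X <- matrix(
  rbinom(m_loci * n_ind, size = 2, prob = p_ind),
  nrow = m_loci, ncol = n_ind
)
```

Unlike draw_all_admix, this sketch does not re-draw fixed loci, so `X` may contain rows in which every genotype is 0 or every genotype is 2.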

Usage

draw_all_admix(admix_proportions, inbr_subpops, m_loci,
  want_genotypes = TRUE, want_p_ind = FALSE, want_p_subpops = FALSE,
  want_p_anc = TRUE, low_mem = FALSE, verbose = FALSE,
  require_polymorphic_loci = TRUE)

Arguments

admix_proportions

The \(n \times k\) matrix of admixture proportions.

inbr_subpops

The length-\(k\) vector (or scalar) of intermediate subpopulation \(F_{ST}\) values.

m_loci

The number of loci to draw.

want_genotypes

If TRUE (default), includes the matrix of random genotypes in the return list.

want_p_ind

If TRUE (NOT default), includes the matrix of individual-specific allele frequencies in the return list.

want_p_subpops

If TRUE (NOT default), includes the matrix of random intermediate subpopulation allele frequencies in the return list.

want_p_anc

If TRUE (default), includes the matrix of random ancestral allele frequencies in the return list.

low_mem

If TRUE, uses a low-memory algorithm to draw genotypes without storing or returning the corresponding `p_ind` matrix.

verbose

If TRUE, prints messages for every stage in the algorithm.

require_polymorphic_loci

If TRUE (default), the returned genotype matrix will not include any fixed loci. Loci that happen to be fixed are drawn again, starting from their ancestral allele frequencies, and checked iteratively until no fixed loci remain, so that the final number of polymorphic loci is exactly m_loci.
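The iterative re-drawing behind require_polymorphic_loci can be illustrated with a small base R loop. This is a sketch under assumptions, not the package's code: `draw_loci` is a hypothetical, simplified stand-in for the full draw (ancestral frequencies through genotypes), and `is_fixed` assumes a locus counts as fixed when every genotype is 0 or every genotype is 2.

```r
set.seed(1)

m_loci <- 10  # number of loci wanted
n_ind <- 5    # number of individuals

# hypothetical stand-in for the full pipeline: draws `m` new loci
# from scratch, starting from fresh ancestral allele frequencies
draw_loci <- function(m) {
  p <- runif(m, min = 0.01, max = 0.5)
  matrix(rbinom(m * n_ind, size = 2, prob = p), nrow = m, ncol = n_ind)
}

# assumed definition: a locus is fixed if all genotypes are 0 or all are 2
is_fixed <- function(X) {
  s <- rowSums(X)
  s == 0 | s == 2 * ncol(X)
}

X <- draw_loci(m_loci)
fixed <- is_fixed(X)

# re-draw only the fixed loci, from the start, until none remain
while (any(fixed)) {
  X[fixed, ] <- draw_loci(sum(fixed))
  fixed <- is_fixed(X)
}
```

Because each replacement locus starts again from a new ancestral allele frequency instead of reusing the old one, the loop always ends with exactly `m_loci` polymorphic rows and does not condition on previously drawn frequencies.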

Value

A named list that includes the following randomly-generated data in this order:

X:

An \(m \times n\) matrix of genotypes. Included if want_genotypes = TRUE.

p_anc:

A length-\(m\) vector of ancestral allele frequencies. Included if want_p_anc = TRUE.

p_subpops:

An \(m \times k\) matrix of intermediate subpopulation allele frequencies. Included if want_p_subpops = TRUE.

p_ind:

An \(m \times n\) matrix of individual-specific allele frequencies. Included only if both want_p_ind = TRUE and low_mem = FALSE.

Examples

# NOT RUN {
# load the package that provides draw_all_admix and admix_prop_1d_linear
library(bnpsd)

# dimensions
# number of loci
m_loci <- 10
# number of individuals
n_ind <- 5
# number of intermediate subpops
k_subpops <- 2

# define population structure
# FST values for k = 2 subpopulations
inbr_subpops <- c(0.1, 0.3)
# admixture proportions from 1D geography
admix_proportions <- admix_prop_1d_linear(n_ind, k_subpops, sigma = 1)

# draw all random allele freqs and genotypes
out <- draw_all_admix(admix_proportions, inbr_subpops, m_loci)

# return value is a list with these items:

# genotypes
X <- out$X

# ancestral AFs
p_anc <- out$p_anc

# # these are excluded by default, but would be included if ...
# # ... `want_p_subpops = TRUE`
# # intermediate subpopulation AFs
# p_subpops <- out$p_subpops
#
# # ... `want_p_ind = TRUE` and `low_mem = FALSE`
# # individual-specific AFs
# p_ind <- out$p_ind

# }
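
As a follow-up to the example above, the optional outputs can be requested and their dimensions checked against the Value section. This sketch assumes the bnpsd package is installed; it uses only the functions and arguments documented on this page.

```r
library(bnpsd)

m_loci <- 10
n_ind <- 5
k_subpops <- 2
inbr_subpops <- c(0.1, 0.3)
admix_proportions <- admix_prop_1d_linear(n_ind, k_subpops, sigma = 1)

# request every output; low_mem stays FALSE so that p_ind is returned
out <- draw_all_admix(admix_proportions, inbr_subpops, m_loci,
  want_p_ind = TRUE, want_p_subpops = TRUE)

# check each element against the documented dimensions
stopifnot(
  all(dim(out$X) == c(m_loci, n_ind)),
  length(out$p_anc) == m_loci,
  all(dim(out$p_subpops) == c(m_loci, k_subpops)),
  all(dim(out$p_ind) == c(m_loci, n_ind))
)
```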
