bnpsd (version 1.1.1)

draw_genotypes_admix: Draw genotypes from the admixture model

Description

Given the Individual-specific Allele Frequency (IAF) \(\pi_{ij}\) for locus \(i\) and individual \(j\), genotypes are drawn binomially: $$x_{ij}|\pi_{ij} \sim \mbox{Binomial}(2, \pi_{ij}).$$ Below \(m\) is the number of loci, \(n\) the number of individuals, and \(k\) the number of intermediate subpopulations. If an admixture proportion matrix \(Q\) is provided as the second argument (admix_proportions), the first argument \(P\) (p_ind) is treated as the \(m \times k\) intermediate subpopulation allele frequency matrix, and the \(m \times n\) IAF matrix is given by $$P Q^T.$$ If \(Q\) is missing, then \(P\) is treated as the IAF matrix itself.
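The model above can be sketched in base R without calling bnpsd. This is a minimal illustration under stated assumptions, not the package's implementation: the names `P`, `Q`, `iaf`, and `X` and the toy values are hypothetical and chosen only for this example.

```r
set.seed(1)

# toy dimensions (assumed for illustration)
m <- 4  # loci
n <- 3  # individuals
k <- 2  # intermediate subpopulations

# hypothetical m x k intermediate subpopulation allele frequencies
P <- matrix(runif(m * k), nrow = m, ncol = k)

# hypothetical n x k admixture proportions; each row sums to 1
Q <- rbind(
  c(1.0, 0.0),
  c(0.5, 0.5),
  c(0.0, 1.0)
)

# m x n IAF matrix: pi = P Q^T
iaf <- P %*% t(Q)

# x_ij | pi_ij ~ Binomial(2, pi_ij); rbinom reads `prob` in column-major
# order, which matches how matrix() fills its result
X <- matrix(rbinom(m * n, size = 2, prob = iaf), nrow = m, ncol = n)
```

Each entry of `X` is 0, 1, or 2, and `X` has the same \(m \times n\) shape as the IAF matrix.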

Usage

draw_genotypes_admix(p_ind, admix_proportions = NULL, low_mem = FALSE)

Arguments

p_ind

The \(m \times n\) IAF matrix (if admix_proportions is missing) or the \(m \times k\) intermediate subpopulation allele frequency matrix (if admix_proportions is present)

admix_proportions

The optional \(n \times k\) admixture proportion matrix

low_mem

If TRUE, the low-memory algorithm is used (admix_proportions must be present)

Value

The \(m \times n\) genotype matrix

Details

To reduce memory usage, set low_mem = TRUE to draw genotypes one locus at a time from \(P\) and \(Q\) (both must be present). This low-memory algorithm avoids constructing the entire IAF matrix, but is considerably slower than the standard algorithm.
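The idea of drawing one locus at a time can be sketched in base R as follows. This is only an illustration of the approach described above, not the code bnpsd actually runs; the helper name `draw_by_locus` and the toy inputs are hypothetical.

```r
# Hypothetical helper: draws genotypes locus by locus so that only one
# length-n vector of IAFs exists at any time.
# P: m x k intermediate subpopulation allele frequencies
# Q: n x k admixture proportions (rows sum to 1)
draw_by_locus <- function(P, Q) {
  m <- nrow(P)
  n <- nrow(Q)
  X <- matrix(0L, nrow = m, ncol = n)
  for (i in seq_len(m)) {
    # IAFs for locus i only: row i of P Q^T, as a length-n vector
    iaf_i <- drop(Q %*% P[i, ])
    X[i, ] <- rbinom(n, size = 2, prob = iaf_i)
  }
  X
}

# toy usage with assumed values
set.seed(1)
P <- matrix(runif(4 * 2), nrow = 4, ncol = 2)
Q <- rbind(c(1, 0), c(0.5, 0.5), c(0, 1))
X <- draw_by_locus(P, Q)
```

The explicit loop over loci trades speed for memory: the genotype matrix is still stored in full, but the \(m \times n\) IAF matrix is never formed.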

Examples

# NOT RUN {
# dimensions
# number of loci
m_loci <- 10
# number of individuals
n_ind <- 5
# number of intermediate subpops
k_subpops <- 2

# define population structure
# FST values for k = 2 subpops
inbr_subpops <- c(0.1, 0.3)
# non-trivial admixture proportions
admix_proportions <- admix_prop_1d_linear(n_ind, k_subpops, sigma = 1)

# draw allele frequencies
# vector of ancestral allele frequencies
p_anc <- draw_p_anc(m_loci)

# matrix of intermediate subpop allele freqs
p_subpops <- draw_p_subpops(p_anc, inbr_subpops)

# matrix of individual-specific allele frequencies
p_ind <- make_p_ind_admix(p_subpops, admix_proportions)

# draw genotypes from intermediate subpops (one individual each)
X_subpops <- draw_genotypes_admix(p_subpops)

# and genotypes for admixed individuals
X_ind <- draw_genotypes_admix(p_ind)

# draw genotypes for admixed individuals without an intermediate p_ind
# (p_ind is computed internally and discarded when done)
X_ind <- draw_genotypes_admix(p_subpops, admix_proportions)

# use low-memory version (p_ind is computed by row, never fully in memory)
X_ind <- draw_genotypes_admix(p_subpops, admix_proportions, low_mem = TRUE)

# }
