The function glSim simulates simple SNP data with the
  possibility of contrasted structures between two groups
  as well as background ancestral population structure. 
  Returned objects are instances of the class genlight.
glSim(n.ind, n.snp.nonstruc, n.snp.struc = 0, grp.size = c(0.5, 0.5), k = NULL,
                    pop.freq = NULL, ploidy = 1, alpha = 0, parallel = FALSE,
                    LD = TRUE, block.minsize = 10, block.maxsize = 1000, theta = NULL,
                    sort.pop = FALSE, ...)
A genlight object.
n.ind: an integer indicating the number of individuals to be simulated.
n.snp.nonstruc: an integer indicating the number of non-structured SNPs to be simulated; for these SNPs, all individuals are drawn from the same binomial distribution.
n.snp.struc: an integer indicating the number of structured SNPs to be simulated; for these SNPs, different binomial distributions are used for the two simulated groups; frequencies of the derived alleles in groups A and B are built to differ (see details).
grp.size: a vector of length 2 specifying the proportions of the two phenotypic groups (must sum to 1). By default, both groups have the same size.
k: an integer specifying the number of ancestral populations to be generated.
pop.freq: a vector of length k specifying the proportions of the
  k ancestral populations (must sum to 1). If, as by default, pop.freq
  is NULL and k is non-NULL, pop.freq will be the result of
  random sampling into k population groups.
ploidy: an integer indicating the ploidy of the simulated genotypes.
alpha: asymmetry parameter: a numeric value between 0 and 0.5, used to enforce allelic differences between the groups. Differences between groups are strongest when alpha = 0.5 and weakest when alpha = 0 (see details).
parallel: a logical indicating whether multiple cores should be used to generate the simulated data (TRUE) or not (FALSE, default). This option can reduce the computational time required to simulate the data, but is not supported on Windows.
LD: a logical indicating whether loci should display linkage disequilibrium (TRUE, default) or be generated independently (FALSE). When set to TRUE, data are generated by blocks of correlated SNPs (see details).
block.minsize: an optional integer indicating the minimum number of
    SNPs to be handled at a time during the simulation of linked SNPs (when
    LD=TRUE). Increasing the minimum block size will increase
    the RAM requirement but decrease the amount of computational time
    required to simulate the genotypes.
block.maxsize: an optional integer indicating the maximum number of SNPs to be handled at a time during the simulation of linked SNPs. Note: if LD blocks of equal size are desired, set block.minsize = block.maxsize.
theta: an optional numeric value between 0 and 0.5 specifying the extent to which linkage should be diluted. Linkage is strongest when theta = 0 and weakest when theta = 0.5.
sort.pop: a logical specifying whether individuals should be ordered by
  ancestral population (sort.pop=TRUE) or phenotypic population
  (sort.pop=FALSE).
...: arguments to be passed to the genlight constructor.
Caitlin Collins caitlin.collins12@imperial.ac.uk, Thibaut Jombart t.jombart@imperial.ac.uk
=== Allele frequencies in contrasted groups ===
When n.snp.struc is greater than 0, some SNPs are simulated in
  order to differ between groups (noted 'A' and 'B'). Different patterns 
  between groups are achieved by using different
  frequencies of the second allele for A and B, denoted \(p_A\) and
  \(p_B\). For a given SNP, \(p_A\) is drawn from a uniform
  distribution between 0 and (0.5 - alpha). \(p_B\) is then computed
  as 1 - \(p_A\). Therefore, differences between groups are mild for
  alpha=0, and total for alpha = 0.5.
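The rule above can be illustrated with a few lines of base R. This is only a sketch of the sampling scheme as described here, not the code used inside glSim; the group sizes n.A and n.B and the object names are made up for the illustration.

```r
## Illustrative sketch only: not the glSim source code.
set.seed(1)                      # reproducible draws
n.snp.struc <- 5                 # number of structured SNPs
alpha <- 0.4                     # asymmetry parameter, between 0 and 0.5
ploidy <- 2
n.A <- 10; n.B <- 10             # hypothetical group sizes

## p_A ~ Uniform(0, 0.5 - alpha); p_B = 1 - p_A
p.A <- runif(n.snp.struc, min = 0, max = 0.5 - alpha)
p.B <- 1 - p.A

## one binomial draw per individual and SNP (rows = individuals)
geno.A <- sapply(p.A, function(p) rbinom(n.A, size = ploidy, prob = p))
geno.B <- sapply(p.B, function(p) rbinom(n.B, size = ploidy, prob = p))

## smallest difference in allele frequency between the two groups
min.gap <- min(p.B - p.A)
```

Because p_B - p_A = 1 - 2 p_A and p_A is at most 0.5 - alpha, the difference in frequency is at least 2 * alpha for every structured SNP. With alpha = 0 it can fall anywhere between 0 and 1, and with alpha = 0.5 it is always 1.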
=== Linked or independent loci ===
Independent loci (LD=FALSE) are simulated using the standard
  binomial distribution, with randomly generated allele
  frequencies. Linked loci (LD=TRUE) are trickier, as we need to
  simulate discrete variables with a pre-defined correlation structure.
Here, we first generate deviates from multivariate normal distributions with randomly generated correlation structures. These variables are then discretized using the quantiles of the distribution. Further improvements of the procedure will aim at i) specifying the strength of the correlations between blocks of alleles and ii) enforcing contrasted structures between groups.
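The general idea (correlated normal deviates, then discretization at quantiles) can be sketched in base R for one small block of haploid SNPs. This is an assumption-laden illustration, not the glSim implementation: the way the random correlation matrix is built, the use of theoretical normal quantiles as cutoffs, and all object names are choices made for this example.

```r
## Illustrative sketch only: not the glSim source code.
set.seed(2)
n.ind <- 200; n.snp <- 6         # one small block of linked SNPs

## random correlation matrix: cross-product of a random matrix, rescaled
M <- matrix(rnorm(n.snp * n.snp), n.snp, n.snp)
R <- cov2cor(crossprod(M))

## correlated standard normal deviates (rows = individuals)
Z <- matrix(rnorm(n.ind * n.snp), n.ind, n.snp) %*% chol(R)

## random second-allele frequencies, then threshold at the normal quantile
p <- runif(n.snp, 0.1, 0.9)
cutoff <- qnorm(1 - p)
geno <- sweep(Z, 2, cutoff, FUN = ">") * 1L   # haploid 0/1 genotypes
```

Each column of Z is marginally standard normal, so thresholding at qnorm(1 - p) gives the second allele an expected frequency of p at that SNP, while the correlation in Z carries over (in attenuated form) to the 0/1 genotypes.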
- genlight: class of object for storing massive binary
  SNP data.
- glPlot: plotting genlight objects.
- glPca: PCA for genlight objects.
if (FALSE) {
## no structure
x <- glSim(100, 1e3, ploidy=2)
plot(x)
## 1,000 non-structured SNPs, 100 structured SNPs
x <- glSim(100, 1e3, n.snp.struc=100, ploidy=2)
plot(x)
## 1,000 non-structured SNPs, 100 structured SNPs, ploidy=4
x <- glSim(100, 1e3, n.snp.struc=100, ploidy=4)
plot(x)
## same thing, stronger differences between groups
x <- glSim(100, 1e3, n.snp.struc=100, ploidy=2, alpha=0.4)
plot(x)
## same thing, loci with LD structure (blocks of at least 100 SNPs)
x <- glSim(100, 1e3, n.snp.struc=100, ploidy=2, alpha=0.4, LD=TRUE, block.minsize=100)
plot(x)
}