The allelic counts, sample sizes, geographic distances, ecological distances, and population metadata from the 38 human populations used in example BEDASSLE analyses, subsetted from the Human Genome Diversity Panel (HGDP) dataset.
data(HGDP.bedassle.data)
The format is: List of 7
 int [1:38, 1:1000] 12 16 5 17 4 14 20 5 34 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:38] "Adygei" "Basque" "Italian" "French" ...
  .. ..$ : chr [1:1000] "rs13287637" "rs17792496" "rs1968588" ...
 int [1:38, 1:1000] 34 48 24 56 30 50 56 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:38] "Adygei" "Basque" "Italian" "French" ...
  .. ..$ : chr [1:1000] "rs13287637" "rs17792496" "rs1968588" ...
 num [1:38, 1:38] 0 1.187 0.867 1.101 1.247 ...
 num [1:38, 1:38] 0 0 0 0 0 0 0 0 0 0 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:38] "1" "2" "3" "4" ...
  .. ..$ : chr [1:38] "1" "2" "3" "4" ...
int 38
int 1000
'data.frame': 38 obs. of 3 variables:
A matrix of allelic count data, for which nrow = the number of populations and ncol = the number of bi-allelic loci sampled. Each cell gives the number of times allele '1' is observed in each population. The choice of which allele is allele '1' is arbitrary, but must be consistent across all populations at a locus.
A matrix of sample sizes, for which nrow = the number of populations and ncol = the number of bi-allelic loci sampled (i.e., the dimensions of sample.sizes must match those of counts). Each cell gives the number of chromosomes successfully genotyped at each locus in each population.
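The relationship between the two matrices can be illustrated with a minimal sketch. The toy matrices and population/locus names below are invented for illustration and are not taken from the HGDP data; the sketch only shows how per-population allele frequencies would follow from counts and sample sizes of matching dimensions.

```r
# Toy data (hypothetical): 3 populations (rows) x 4 bi-allelic loci (columns).
# matrix() fills column by column, so each line below is one locus.
counts <- matrix(c(12, 16, 5,
                   3,  0, 8,
                   10, 20, 0,
                   7,  7, 7),
                 nrow = 3,
                 dimnames = list(c("popA", "popB", "popC"),
                                 paste0("locus", 1:4)))
sample.sizes <- matrix(c(34, 48, 24,
                         30, 50, 56,
                         20, 40, 0,
                         14, 14, 14),
                       nrow = 3, dimnames = dimnames(counts))

# The two matrices must have the same shape, and a count of allele '1'
# can never exceed the number of chromosomes genotyped.
stopifnot(identical(dim(counts), dim(sample.sizes)),
          all(counts >= 0),
          all(counts <= sample.sizes))

# Sample frequency of allele '1' in each population at each locus.
freqs <- counts / sample.sizes
# A cell with no genotyped chromosomes has no defined frequency.
freqs[sample.sizes == 0] <- NA
```

Cells with a sample size of zero are set to NA here as one possible convention; how missing data should be handled in a real analysis is a separate decision.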
Pairwise geographic distance (\(D_{i,j}\)). This may be Euclidean, or, if the geographic scale of sampling merits it, great-circle distance. In the case of this dataset, it is great-circle distance.
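For readers who want to build a comparable matrix from latitude and longitude, the following is a minimal sketch of a pairwise great-circle distance using the haversine formula. The function name `great.circle.dist`, the toy coordinates, and the Earth radius of 6371 km are assumptions made for illustration; this is not necessarily how the distances in this dataset were computed, nor the units they are stored in.

```r
# Pairwise great-circle distance via the haversine formula.
# lat, lon: numeric vectors in decimal degrees (one entry per population).
# radius:   sphere radius; 6371 km is a common mean Earth radius (assumption).
great.circle.dist <- function(lat, lon, radius = 6371) {
  to.rad <- pi / 180
  phi <- lat * to.rad
  lambda <- lon * to.rad
  d.phi <- outer(phi, phi, "-")          # pairwise latitude differences
  d.lambda <- outer(lambda, lambda, "-") # pairwise longitude differences
  a <- sin(d.phi / 2)^2 +
    outer(cos(phi), cos(phi)) * sin(d.lambda / 2)^2
  # pmin() guards against values slightly above 1 from rounding error.
  2 * radius * asin(pmin(sqrt(a), 1))
}

# Toy coordinates (hypothetical): two points on the equator and the north pole.
lat <- c(p1 = 0, p2 = 0, p3 = 90)
lon <- c(p1 = 0, p2 = 90, p3 = 0)
geo.dist <- great.circle.dist(lat, lon)
```

Each pair of toy points is a quarter of a great circle apart, so every off-diagonal entry equals `6371 * pi / 2`.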
Pairwise ecological distance(s) (\(E_{i,j}\)), which may be continuous (e.g., difference in elevation) or binary (same or opposite side of some hypothesized barrier to gene flow). In this case, the ecological distance is binary, representing whether a pair of populations occurs on the same side, or on opposite sides, of the Himalayas.
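Both kinds of ecological distance can be sketched from a per-population variable. The population names, side labels, and elevations below are invented for illustration and do not come from the dataset.

```r
# Hypothetical per-population variables.
side <- c(popA = "west", popB = "west", popC = "east")  # side of a barrier
elevation <- c(popA = 100, popB = 850, popC = 2300)     # metres

# Binary distance: 0 if two populations share a side, 1 otherwise.
eco.binary <- outer(side, side, FUN = "!=") * 1

# Continuous distance: absolute difference in elevation.
eco.continuous <- abs(outer(elevation, elevation, FUN = "-"))
```

Both results are symmetric matrices with a zero diagonal, matching the shape of the pairwise geographic distance matrix.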
The number of populations in the analysis. This should be equal to nrow(counts). In this dataset, there are 38 populations sampled.
The number of loci in the analysis. This should be equal to ncol(counts). In this dataset, there are 1000 loci sampled.
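The dimension requirements described above can be checked together before running an analysis. The helper below is a hypothetical sketch (the name `check.input.dims` and its argument names are not part of the package), shown with placeholder matrices of the same shape as this dataset.

```r
# Hypothetical helper: confirm that all inputs agree on the number of
# populations and loci, and return those two numbers.
check.input.dims <- function(counts, sample.sizes, geo.dist, eco.dist) {
  n.pops <- nrow(counts)
  n.loci <- ncol(counts)
  stopifnot(
    identical(dim(sample.sizes), dim(counts)),
    identical(dim(geo.dist), c(n.pops, n.pops)),
    identical(dim(eco.dist), c(n.pops, n.pops))
  )
  list(number.of.populations = n.pops, number.of.loci = n.loci)
}

# Placeholder matrices (all zeros) with this dataset's dimensions.
dims <- check.input.dims(
  counts       = matrix(0L, nrow = 38, ncol = 1000),
  sample.sizes = matrix(0L, nrow = 38, ncol = 1000),
  geo.dist     = matrix(0, nrow = 38, ncol = 38),
  eco.dist     = matrix(0, nrow = 38, ncol = 38)
)
```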
This data frame contains the metadata on the populations included in the analysis, including:
Population name
Latitude
Longitude
Bradburd, G.S., Ralph, P.L., and Coop, G.M. Disentangling the effects of geographic and ecological isolation on genetic differentiation. Evolution 2013.
## see MCMC, MCMC_BB, calculate.pairwise.Fst,
## calculate.all.pairwise.Fst, and Covariance for usage.