The input file should be a space- or tab-delimited ASCII text file. The first line is a 0 / 1 indicator: ‘0’ indicates that the data matrix for each locus is a populations x alleles matrix; ‘1’ indicates that the data matrix for each locus is an alleles x populations matrix. The second line contains the number of populations. The third line contains the number of loci. The data for each locus then consist of the number of alleles at that locus, followed by the data matrix at that locus, with each row corresponding to the same allele (if the indicator variable is 1) or to the same population (if the indicator variable is 0). For dominant data, the data consist of the number of genotypes, not the number of alleles. Note that, for dominant data, the frequency of the individuals homozygous for the recessive allele appears first in either the rows or columns of the data matrix. In the following example, the data consist of 2 populations and 2 loci, with 5 alleles at the first locus and 8 alleles at the second locus:
0
2
2
5
1 0 4 10 5
0 1 13 0 6
8
3 1 1 0 0 0 1 14
6 0 2 1 2 5 2 2
Spaces and blank lines can be included as desired.
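As an illustration of the layout described above, the following Python sketch reads such a file into one populations x alleles matrix per locus. It is not part of the program itself: the function name `parse_input` is hypothetical, and the sketch assumes that all entries are integer counts and that, because spaces and blank lines are free, the file can be read as a flat stream of whitespace-separated tokens.

```python
def parse_input(text):
    """Parse the input format into (n_pops, list of per-locus matrices).

    Hypothetical helper for illustration only; assumes integer counts.
    Each returned matrix is oriented as populations x alleles.
    """
    tokens = iter(text.split())     # any run of spaces, tabs or newlines separates tokens
    indicator = int(next(tokens))   # 0: populations x alleles, 1: alleles x populations
    n_pops = int(next(tokens))
    n_loci = int(next(tokens))
    if indicator not in (0, 1):
        raise ValueError("the first line must be 0 or 1")

    loci = []
    for _ in range(n_loci):
        n_alleles = int(next(tokens))  # number of genotypes for dominant data
        if indicator == 0:
            n_rows, n_cols = n_pops, n_alleles
        else:
            n_rows, n_cols = n_alleles, n_pops
        matrix = [[int(next(tokens)) for _ in range(n_cols)] for _ in range(n_rows)]
        if indicator == 1:
            # Transpose so that every locus is stored as populations x alleles.
            matrix = [list(row) for row in zip(*matrix)]
        loci.append(matrix)
    return n_pops, loci


example = """0
2
2
5
1 0 4 10 5
0 1 13 0 6
8
3 1 1 0 0 0 1 14
6 0 2 1 2 5 2 2
"""
n_pops, loci = parse_input(example)
```

Applied to the example above, `loci[0]` holds the two rows of the first locus (5 alleles) and `loci[1]` the two rows of the second locus (8 alleles).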
The command line read.data
creates a file named ‘infile.dat’, a file named ‘sample_sizes.dat’, and a set of files named ‘plot_i_j.dat’, where \(i\) and \(j\) are population numbers, so that each file ‘plot_i_j.dat’ corresponds to the pairwise analysis of populations \(i\) and \(j\). In the file ‘infile.dat’, each line corresponds to the pairwise analysis of one pair of populations \(i\) and \(j\) and contains (in that order): the name of the output simulation file, the numbers \(i\) and \(j\), the multi-locus estimates of \(F_1\) and \(F_2\), and Weir and Cockerham's (1984) estimate of \(F_{ST}\). The file ‘sample_sizes.dat’ contains sample-size information, for internal use only. In the files ‘plot_i_j.dat’, each line corresponds to one locus observed in the data set and contains (in that order): the locus-specific estimates of \(F_1\) and \(F_2\), Weir and Cockerham's (1984) estimate of \(F_{ST}\), Nei's heterozygosity (\(H_e\)), the number of alleles at that locus in the pooled sample, and the rank of the locus in the data set.
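The column order of these output files can be made explicit with a short Python sketch. The record types and function names below are hypothetical and not part of the program. The sketch assumes that the fields on each line are separated by whitespace, that file names contain no spaces, and that the number of alleles and the rank are written as plain integers; none of these points is stated in the description above.

```python
from typing import NamedTuple


class PairRecord(NamedTuple):
    """One line of 'infile.dat' (hypothetical record type)."""
    sim_file: str   # name of the output simulation file
    pop_i: int
    pop_j: int
    f1: float       # multi-locus estimate of F_1
    f2: float       # multi-locus estimate of F_2
    fst: float      # Weir and Cockerham's (1984) estimate of F_ST


class LocusRecord(NamedTuple):
    """One line of a 'plot_i_j.dat' file (hypothetical record type)."""
    f1: float       # locus-specific estimate of F_1
    f2: float       # locus-specific estimate of F_2
    fst: float      # Weir and Cockerham's (1984) estimate of F_ST
    he: float       # Nei's heterozygosity
    n_alleles: int  # number of alleles in the pooled sample
    rank: int       # rank of the locus in the data set


def parse_pair_line(line):
    # Assumes six whitespace-separated fields, in the documented order.
    name, i, j, f1, f2, fst = line.split()
    return PairRecord(name, int(i), int(j), float(f1), float(f2), float(fst))


def parse_locus_line(line):
    # Assumes six whitespace-separated fields, in the documented order.
    f1, f2, fst, he, n_alleles, rank = line.split()
    return LocusRecord(float(f1), float(f2), float(fst), float(he),
                       int(n_alleles), int(rank))
```

Named fields avoid confusing the two files, since \(F_1\), \(F_2\) and \(F_{ST}\) appear in both but at different column positions.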