convert.snp.text(infile, outfile, bcast = 10000)
Genetic data start on line five. The fifth line holds the data for the SNP that is listed first on the second line. The first column of this line gives the genotype of the person listed first on line 1, the second column gives the genotype of the second person, and so on. The genotypes are coded as 0 (missing), 1 (AA), 2 (AB) and 3 (BB). Here is a small example:
289982 325286 357273 872422 1005389
SNP-1886933 SNP-2264565 SNP-2305014
1 1 1
825852 2137143 2585920
3 3 3 3 2
3 2 3 3 3
2 2 1 1 1
In this example, SNP-2305014 (third on the second line) is located on chromosome 1 at position 2585920. To find the genotype of the person with ID 325286 (second on the first line), take the second column of the third line of the genotypic data. This cell contains 2, so person 325286 has genotype "AB" at SNP-2305014.
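The layout can be checked with a few lines of base R. This is only a sketch of how the example file could be read by hand; it does not use or reproduce the package's own parser, and the helper `split_fields` and all variable names are made up for illustration.

```r
# Example file content, copied from the listing above.
example_lines <- c(
  "289982 325286 357273 872422 1005389",
  "SNP-1886933 SNP-2264565 SNP-2305014",
  "1 1 1",
  "825852 2137143 2585920",
  "3 3 3 3 2",
  "3 2 3 3 3",
  "2 2 1 1 1"
)

# Hypothetical helper: split one whitespace-separated line into its fields.
split_fields <- function(line) strsplit(trimws(line), "[[:space:]]+")[[1]]

ids        <- split_fields(example_lines[1])              # line 1: person IDs
snp_names  <- split_fields(example_lines[2])              # line 2: SNP names
chromosome <- split_fields(example_lines[3])              # line 3: chromosomes
position   <- as.integer(split_fields(example_lines[4]))  # line 4: positions

# Lines 5 onward: one row per SNP, one column per person.
geno <- t(vapply(example_lines[-(1:4)],
                 function(line) as.integer(split_fields(line)),
                 integer(length(ids))))
dimnames(geno) <- list(snp_names, ids)

# Map information for the third SNP.
snp_index <- match("SNP-2305014", snp_names)
chromosome[snp_index]
position[snp_index]

# Genotype of person 325286 at SNP-2305014, decoded from the 0..3 coding.
code_labels <- c("missing", "AA", "AB", "BB")  # codes 0, 1, 2, 3
code <- geno["SNP-2305014", "325286"]
code_labels[code + 1]
```

Writing `example_lines` to disk with `writeLines()` gives a file of the shape described here, which could then serve as `infile`.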
If you do not want to use a map (for example, because the polymorphisms are already ordered in the genotype file), supply a dummy map line that contains the order information.
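One plausible reading of this advice is to fill the position line with the order of the SNPs in the file. The sketch below builds such placeholder lines in base R; putting every SNP on chromosome 1 and using 1, 2, 3, ... as positions are assumptions made for illustration, not values prescribed by the package.

```r
snp_names <- c("SNP-1886933", "SNP-2264565", "SNP-2305014")

# Assumed placeholders: every SNP on chromosome 1, position = order in file.
chromosome_line <- paste(rep(1, length(snp_names)), collapse = " ")
position_line   <- paste(seq_along(snp_names), collapse = " ")

# These two strings would take the place of lines 3 and 4 of the input file.
chromosome_line
position_line
```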
The genotypic data file described above is (more or less) human-readable. To store data efficiently, however, the GWAA package uses an internal format in which four genotypes are stored in a single byte, using the "raw" data type of R.
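To illustrate the idea of four genotypes per byte: each code from 0 to 3 fits in two bits, so four codes fill one byte. The sketch below packs and unpacks codes with base R raw vectors. The bit order (first genotype in the two highest bits), the padding with 0, and the function names `pack_genotypes` and `unpack_genotypes` are assumptions for this illustration; the package's actual internal layout may differ.

```r
# Pack genotype codes (0..3) into raw bytes, four codes per byte.
# Assumption: the first code of each group occupies the two highest bits.
pack_genotypes <- function(codes) {
  stopifnot(all(codes %in% 0:3))
  # Pad with 0 (missing) so that the length is a multiple of four.
  padded <- c(codes, rep(0L, (4 - length(codes) %% 4) %% 4))
  groups <- matrix(as.integer(padded), nrow = 4)
  # Each column holds four codes; weight them by 64, 16, 4 and 1.
  as.raw(colSums(groups * c(64L, 16L, 4L, 1L)))
}

# Recover the first n genotype codes from the packed bytes.
unpack_genotypes <- function(bytes, n) {
  values <- as.integer(bytes)
  codes <- rbind(values %/% 64L,
                 (values %/% 16L) %% 4L,
                 (values %/% 4L) %% 4L,
                 values %% 4L)
  as.integer(codes)[seq_len(n)]
}

codes  <- c(3L, 3L, 3L, 3L, 2L)   # first genotype line of the example
packed <- pack_genotypes(codes)
packed                             # two bytes: ff 80
unpack_genotypes(packed, length(codes))
```

Five genotypes therefore occupy two bytes instead of five, which is where the storage saving comes from.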
load.gwaa.data, convert.snp.illumina, convert.snp.ped, convert.snp.mach, convert.snp.tped
#
# convert.snp.text("genos.dat","genos.raw")
#