Gives the column numbers of SNP identifiers or SNP numbers in a standard ped file, calculated from the SNP's positions in the corresponding map file. The column numbers are sorted in the same order as snp.select
.
These positions may be useful when extracting a selection of SNPs from a ped file.
snpPos(snp.select, map.file, blank.lines.skip = TRUE)
A vector of the column numbers of the SNP identifiers/SNP numbers in the ped file, sorted in the same order as given in snp.select
.
A character vector of the SNP identifiers (RS codes) or a numeric vector of the SNP numbers.
A character string giving the name and path of the standard map file to be used. See Details for a description of the standard map format.
Logical. If "TRUE" (default), snpPos
ignores blank lines in map.file
.
Miriam Gjerdevik,
with Hakon K. Gjessing
Professor of Biostatistics
Division of Epidemiology
Norwegian Institute of Public Health
hakon.gjessing@uib.no
To extract certain SNPs from a standard ped file, one has to know their positions in the ped file.
This can be obtained from the corresponding map file.
The map file should look something like this:
Chromosome SNP-identifier Base-pair-position
1 RS9629043 554636
1 RS12565286 711153
1 RS12138618 740098
Alternatively, the map file could contain four columns. The column values should then be:
Chromosome, SNP-identifier, Genetic-distance, Base-pair-position.
A header must be added to the map file if this does not already exist.
The format of the corresponding ped file should be something like this:
1104 1104-1 1104-2 1104-3 1 2 4 1 3 2
1104 1104-2 0 0 1 1 4 1 2 2
1104 1104-3 0 0 2 1 0 0 0 0
1105 1105-1 1105-2 1105-3 2 2 1 1 2 2
1105 1105-2 0 0 1 1 1 1 2 2
1105 1105-3 0 0 2 1 1 1 3 2
The column values are: Family id, Individual id, Father's id, Mother's id, Sex (1 = male, 2 = female, alternatively: 1 = male, 0 = female), and Case-control status (1 = controls, 2 = cases, alternatively: 0 = controls, 1 = cases).
Column 7 and onwards contain the genotype data, with alleles in separate columns. A ``0'' is used to denote missing data.
Web Site: https://haplin.bitbucket.io
convertPed
, lineByLine
if (FALSE) {
# Find the column numbers of the SNP identifiers "RS9629043" and "RS12565286" in
# a standard ped file
snpPos(snp.select = c("RS9629043", "RS12565286"), map.file = "mygwas.map")
}
Run the code above in your browser using DataLab