
gtx (version 0.0.8)

read.snpdata.impute: Read genotype dosages in the format output by IMPUTE.

Description

Reads sample information and genotype data from paired sample and genotype files, as generated by the IMPUTE and IMPUTE2 genotype imputation programs and as used by the SNPTEST program. Returns the data in a standard format (see snpdata) that can be used by other functions in this package.

Usage

read.snpdata.impute(samplefile, genofile, phenotypes = NULL)

Arguments

samplefile
filename of the sample file, assumed to be in “000Ps” format (see Details).
genofile
filename for genotypes, assumed IMPUTE verbose format.
phenotypes
if not NULL, a data frame of phenotypes to be merged with the genotypes; its first two columns are matched against the first two columns of the sample file.
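
The matching described for the phenotypes argument can be illustrated in base R. This is only a sketch of the idea, not the package's own code: the data frames, their column names, and the choice to keep every sample in sample-file order (with NA for unmatched samples) are all assumptions made for illustration.

```r
# Made-up sample identifiers, standing in for the first two columns
# of a sample file.
samples <- data.frame(ID_1 = c("fam1", "fam2", "fam3"),
                      ID_2 = c("ind1", "ind2", "ind3"),
                      stringsAsFactors = FALSE)

# Made-up phenotypes; the column names differ on purpose, because the
# match is by position (first two columns), not by name.
phenotypes <- data.frame(FID = c("fam3", "fam1"),
                         IID = c("ind3", "ind1"),
                         height = c(170, 182),
                         stringsAsFactors = FALSE)

# Build one key per row from the first two columns of a data frame.
make_key <- function(df) paste(df[[1]], df[[2]], sep = "\t")

# For each sample, find the matching phenotype row (NA if there is none).
rows <- match(make_key(samples), make_key(phenotypes))

# Attach the remaining phenotype columns, keeping sample-file order.
merged <- cbind(samples, phenotypes[rows, -(1:2), drop = FALSE])
rownames(merged) <- NULL
```

Here the second sample has no phenotype record, so its height is NA. How read.snpdata.impute itself treats unmatched samples is not stated on this page.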

Value

Returns a list with snpinfo and data slots; see snpdata.

Details

The sample file is assumed to have two header lines: the first is taken as the column names and the second as the “000Ps” line. The genotype file is assumed to have one line per SNP. The first five columns of each line contain information about the SNP, and the remaining columns are triplets of numeric values giving the probabilities of genotypes 0, 1 and 2 for each sample in turn.
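
The file layout described above can be sketched in base R. The example below is not the package's implementation. The file contents are made up, and the dosage definition (expected genotype, that is P(genotype 1) + 2 * P(genotype 2)) and the use of the second genotype-file column as the SNP identifier are assumptions.

```r
# A tiny, made-up sample file: column names, then the type line,
# then one line per sample.
samplefile <- tempfile(fileext = ".sample")
writeLines(c("ID_1 ID_2 missing pheno1",
             "0 0 0 P",
             "fam1 ind1 0 1.5",
             "fam2 ind2 0 2.5"), samplefile)

# A tiny, made-up genotype file: five SNP columns, then one
# probability triplet per sample.
genofile <- tempfile(fileext = ".gen")
writeLines(c("1 rs1 1000 A G 1 0 0 0 1 0",
             "1 rs2 2000 C G 0 0.5 0.5 0 0 1"), genofile)

# Sample file: take column names from line 1 and skip the type line.
sample_names <- scan(samplefile, what = "", nlines = 1, quiet = TRUE)
samples <- read.table(samplefile, skip = 2, col.names = sample_names,
                      stringsAsFactors = FALSE)

# Genotype file: split SNP information from genotype probabilities.
geno <- read.table(genofile, stringsAsFactors = FALSE)
snp_columns <- geno[, 1:5]
probs <- as.matrix(geno[, -(1:5)])
stopifnot(ncol(probs) == 3 * nrow(samples))

# Assumed dosage: P(genotype 1) + 2 * P(genotype 2) for each sample.
first <- seq(1, ncol(probs), by = 3)  # column of P(genotype 0) per sample
dosage <- probs[, first + 1, drop = FALSE] +
  2 * probs[, first + 2, drop = FALSE]

# Label rows by the (assumed) SNP identifier column and columns by sample.
dimnames(dosage) <- list(geno[[2]], samples[[2]])
```

The result is a SNP-by-sample matrix. The layout of the snpinfo and data slots returned by read.snpdata.impute may differ; see snpdata.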

This function will be slow for large input files. It is best to first extract the relevant lines (SNPs) into a smaller file, using gtool or grep/awk.
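
If it is more convenient to stay within R, the same pre-filtering can be sketched with a chunked line reader. The helper filter_genofile below is hypothetical (it is not part of gtx), and it assumes that the second whitespace-delimited column of the genotype file holds the SNP identifier.

```r
# Hypothetical helper, not part of gtx: copy to `outfile` only those
# lines of `genofile` whose second column is listed in `snps`.
filter_genofile <- function(genofile, outfile, snps, chunk_lines = 10000L) {
  input <- file(genofile, open = "r")
  on.exit(close(input), add = TRUE)
  output <- file(outfile, open = "w")
  on.exit(close(output), add = TRUE)
  kept <- 0L
  repeat {
    # Read the file in chunks so it never has to fit in memory at once.
    lines <- readLines(input, n = chunk_lines)
    if (length(lines) == 0L) break
    # Pull out the second field without splitting the whole line.
    ids <- sub("^\\S+\\s+(\\S+).*$", "\\1", lines, perl = TRUE)
    keep <- ids %in% snps
    writeLines(lines[keep], output)
    kept <- kept + sum(keep)
  }
  invisible(kept)  # number of lines written
}

# Demonstration on a made-up two-SNP genotype file.
genofile <- tempfile(fileext = ".gen")
writeLines(c("1 rs1 1000 A G 1 0 0 0 1 0",
             "1 rs2 2000 C G 0 0.5 0.5 0 0 1"), genofile)
smallfile <- tempfile(fileext = ".gen")
n_kept <- filter_genofile(genofile, smallfile, snps = "rs2")
```

The smaller file can then be passed as genofile together with the unchanged sample file, since filtering removes SNPs and not samples.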