QTLRel (version 1.14)

genoImpute: Impute Genotypic Data

Description

Impute missing genotypic data in advanced intercross lines (AIL).

Usage

genoImpute(gdat, gmap, step, prd = NULL, gr = 2, pos = NULL,
   method = c("Haldane", "Kosambi"), na.str = "NA", msg = FALSE)

Value

A matrix with the same number of rows as gdat. The number of columns depends on the SNP set in both gdat and gmap and on the step length.

Arguments

gdat

Genotype data. Should be a matrix or a data frame, with each row representing an observation and each column a marker locus. The column names should be marker names. Genotypes can be 1, 2 and 3, or "AA", "AB" and "BB". Optional if an object prd from genoProb is used as an argument.
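The two genotype codings can be related with a small base R sketch. The toy matrix g_chr, the marker names m1 to m3, and the mapping of "AA", "AB", "BB" to 1, 2, 3 in that order are assumptions made for illustration; they are not taken from the package.

```r
# Toy genotype matrix: rows are observations, columns are marker loci.
# Marker names (m1..m3), row names and values are hypothetical.
g_chr <- matrix(
  c("AA", "AB", "BB",
    "AB", NA,   "AA"),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("id1", "id2"), c("m1", "m2", "m3"))
)

# Assumed mapping: "AA" -> 1, "AB" -> 2, "BB" -> 3; missing values stay NA.
g_num <- matrix(
  match(g_chr, c("AA", "AB", "BB")),
  nrow = nrow(g_chr),
  dimnames = dimnames(g_chr)
)

g_num
```

Keeping the marker names as column names in both forms is what lets the columns be looked up in a genetic map later.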

gmap

A genetic map. Should be a data frame (snp, chr, dist,...), where "snp" is the SNP (marker) name, "chr" is the chromosome on which the SNP lies, and "dist" is the genetic distance in centimorgans (cM) from the left end of the chromosome.
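As a rough illustration of this layout, the sketch below builds a toy map and runs two sanity checks one might apply before imputation. The object names, marker names and distances are hypothetical, and whether the package itself requires these checks is not stated here.

```r
# Hypothetical genetic map with the three columns described above.
gmap_toy <- data.frame(
  snp  = c("m1", "m2", "m3", "m4"),
  chr  = c(1, 1, 1, 2),
  dist = c(0, 12.5, 30, 4),   # cM from the left end of each chromosome
  stringsAsFactors = FALSE
)

# Hypothetical marker names, standing in for colnames(gdat).
markers_in_gdat <- c("m1", "m3", "m5")

# Markers present in the genotype data but absent from the map.
missing_from_map <- setdiff(markers_in_gdat, gmap_toy$snp)

# Are distances non-decreasing within each chromosome?
sorted_within_chr <- all(tapply(gmap_toy$dist, gmap_toy$chr,
                                function(d) !is.unsorted(d)))
```

Here missing_from_map flags marker names that cannot be placed on the map, which is worth checking by hand before relying on the output.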

step

Optional. If specified, it is the maximum distance (in cM) between two adjacent loci for which the probabilities are calculated. The distance corresponds to the "cumulative" recombination rate at gr-th generation. If missing, only

prd

An object from genoProb if not NULL. See "details" for more information.

gr

The generation under consideration.

pos

Data frame (chr, dist, snp, ...). If given, step will be ignored.

method

Whether the "Haldane" or the "Kosambi" mapping function should be used.
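For orientation, the two mapping functions convert a map distance into a recombination fraction as sketched below. These are the standard textbook formulas for a single meiosis; the function names haldane_r and kosambi_r are hypothetical, and the sketch does not reproduce how the package accumulates recombination over gr generations.

```r
# Recombination fraction from map distance (in cM) under each mapping function.
haldane_r <- function(d_cM) {
  d <- d_cM / 100            # convert cM to Morgans
  0.5 * (1 - exp(-2 * d))    # Haldane: no crossover interference
}

kosambi_r <- function(d_cM) {
  d <- d_cM / 100            # convert cM to Morgans
  0.5 * tanh(2 * d)          # Kosambi: allows for crossover interference
}

d <- c(1, 10, 50)
data.frame(d_cM = d, haldane = haldane_r(d), kosambi = kosambi_r(d))
```

The two functions nearly agree at short distances and diverge as the distance grows, so the choice matters most for sparse maps.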

na.str

String for missing values.

msg

A logical variable. If TRUE, certain information will be printed out during calculation.

Details

A missing genotypic value is assigned at random, with probability conditional on the genotypes of the flanking SNPs (markers).
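The random assignment described above can be pictured with a minimal base R sketch. The probability matrix p is invented for illustration; in practice the conditional probabilities would come from the flanking markers, and this is not the package's own implementation.

```r
set.seed(1)  # only so the sketch is reproducible

# Hypothetical conditional probabilities of genotypes 1, 2, 3 (AA, AB, BB)
# at one missing locus, for three individuals (one row each).
p <- rbind(
  c(0.90, 0.09, 0.01),
  c(0.25, 0.50, 0.25),
  c(0.00, 0.00, 1.00)
)

# Draw one genotype per individual according to its row of probabilities.
imputed <- apply(p, 1, function(pr) sample(1:3, size = 1, prob = pr))
imputed
```

Because the draw is random, repeated runs without a fixed seed can give different imputed values wherever the probabilities are not degenerate.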

An object, prd, from genoProb alone can be used for the purpose of imputation. Then, the output (especially the putative loci) will be determined by prd. Optionally, it can be used together with gdat so that missing values in gdat will be imputed if possible, depending on whether loci in the columns of gdat can be identified in the third dimension of prd; this won't change the original genotypic data. See examples.
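The matching of gdat columns against loci can be sketched as follows. Only the idea that loci sit in the third dimension comes from the text above; the array pr, its dimension names and the marker names are hypothetical stand-ins, not the actual structure of a genoProb object.

```r
# Hypothetical probability array: individuals x genotypes x loci.
# Names and sizes are invented for illustration.
pr <- array(
  1 / 3,
  dim = c(2, 3, 4),
  dimnames = list(c("id1", "id2"), c("1", "2", "3"),
                  c("m1", "c1.loc5", "m2", "m3"))
)

# Hypothetical marker names, standing in for colnames(gdat).
gdat_markers <- c("m1", "m2", "m9")

# Columns of gdat that can be located among the loci of the array,
# and those that cannot (these could not be imputed from it).
matched   <- intersect(gdat_markers, dimnames(pr)[[3]])
unmatched <- setdiff(gdat_markers, dimnames(pr)[[3]])
```

In this toy case m9 has no counterpart among the loci, so any missing values in that column would be left as they are.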

See Also

genoProb

Examples

data(miscEx)

# briefly look at genotype data
sum(is.na(gdatF8))
gdatF8[1:5,1:5]

if (FALSE) {
  # run 'genoProb'
  gdtmp <- gdatF8
  gdtmp <- replace(gdtmp, is.na(gdtmp), 0)
  prDat <- genoProb(gdat = gdtmp, gmap = gmapF8, gr = 8, method = "Haldane", msg = TRUE)

  # imputation based on 'genoProb' object
  tmp <- genoImpute(prd = prDat)
  sum(is.na(tmp))
  tmp[1:5, 1:5]

  # imputation based on both genotype data and 'genoProb' object
  tmp <- genoImpute(gdatF8, prd = prDat)
  sum(is.na(tmp))
  tmp[1:5, 1:5]

  # imputation based on genotype data
  tmp <- genoImpute(gdatF8, gmap = gmapF8, gr = 8, na.str = NA)
  sum(is.na(tmp))
  tmp[1:5, 1:5]

  # set "msg=TRUE" for more information
  tmp <- genoImpute(gdatF8, gmap = gmapF8, gr = 8, na.str = NA, msg = TRUE)
  sum(is.na(tmp))
  tmp[1:5, 1:5]
}