synbreed (version 0.12-9)

pairwiseLD: Pairwise LD between markers

Description

Estimate pairwise linkage disequilibrium (LD) between markers, measured as \(r^2\), using an object of class gpData. For the general case, a gateway to the software PLINK (Purcell et al. 2007) is established to estimate the LD. A within-R solution is available only for marker data with exactly 2 genotypes, i.e. homozygous inbred lines. The return value is an object of class LDdf, which is a data.frame with one row per marker pair, or an object of class LDmat, which is a matrix with all marker pairs. Additionally, the Euclidean distance between marker positions is computed and returned.

Usage

pairwiseLD(gpData, chr = NULL, type = c("data.frame", "matrix"), use.plink = FALSE,
           ld.threshold = 0, ld.window = 99999, rm.unmapped = TRUE, cores = 1)

Arguments

gpData

object of class gpData with elements geno and map

chr

numeric scalar or vector. Return value is a list with pairwise LD of all markers for each chromosome in chr.

type

character. Specifies the type of return value (see 'Value').

use.plink

logical. Should the software PLINK be used for the computation?

ld.threshold

numeric. Threshold for the LD to thin the output. Only pairwise LD>ld.threshold is reported when PLINK is used. This argument can only be used for type="data.frame".

ld.window

numeric. Window size for the pairwise distances reported by PLINK, used to thin the output (only for use.plink=TRUE; argument --ld-window-kb in PLINK). Only SNP pairs with a distance < ld.window are reported (default = 99999).

rm.unmapped

logical. Remove markers with unknown position in map before using PLINK?

cores

numeric. Number of cores to use for the computation.
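The thinning described for ld.threshold and ld.window can be pictured as a row filter on a table of marker pairs. The following is a minimal sketch in base R on made-up data; the data frame `ld_df` and its values are hypothetical, and the sketch does not reproduce how PLINK or pairwiseLD perform the filtering internally.

```r
# Hypothetical table of marker pairs (made-up values), with the same
# column names as described for type = "data.frame"
ld_df <- data.frame(
  marker1  = c("M1", "M1", "M2"),
  marker2  = c("M2", "M3", "M3"),
  r2       = c(0.80, 0.05, 0.40),
  distance = c(1.5, 300, 20)
)

# Illustrative settings, not the function defaults
ld.threshold <- 0.1
ld.window    <- 100

# Keep only pairs with LD above the threshold and distance inside the window
thinned <- ld_df[ld_df$r2 > ld.threshold & ld_df$distance < ld.window, ]
print(thinned)
```

With the defaults (ld.threshold = 0, ld.window = 99999) such a filter removes very little, which is why the full set of pairs is normally reported.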

Value

For type="data.frame" an object of class LDdf with one element for each chromosome is returned. Each element is a data.frame with columns marker1, marker2, r2 and distance for all \(p(p-1)/2\) marker pairs (or thinned, see 'Details').

For type="matrix" an object of class LDmat with one element for each chromosome is returned. Each element is a list of 2: a \(p \times p\) matrix with pairwise LD and the corresponding \(p \times p\) matrix with pairwise distances.
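The two return shapes carry the same information. As an illustration only, the sketch below converts a small \(p \times p\) LD matrix and its distance matrix into the long format with \(p(p-1)/2\) rows. All values and the names `ld`, `pos`, `dist_mat` and `ld_df` are made up, and the distance is assumed to be the absolute difference of one-dimensional map positions; the actual objects returned by pairwiseLD may be organised differently.

```r
# Hypothetical 4 x 4 symmetric LD matrix and marker positions (made-up values)
markers <- c("M1", "M2", "M3", "M4")
pos <- c(M1 = 1.0, M2 = 2.5, M3 = 4.0, M4 = 9.0)
ld <- matrix(c(1.0, 0.8, 0.3, 0.1,
               0.8, 1.0, 0.5, 0.2,
               0.3, 0.5, 1.0, 0.4,
               0.1, 0.2, 0.4, 1.0),
             nrow = 4, dimnames = list(markers, markers))

# Pairwise distances; for positions on a single axis the Euclidean
# distance reduces to the absolute difference
dist_mat <- abs(outer(pos, pos, "-"))

# Row/column indices of the upper triangle: each marker pair exactly once
idx <- which(upper.tri(ld), arr.ind = TRUE)

ld_df <- data.frame(
  marker1  = rownames(ld)[idx[, "row"]],
  marker2  = colnames(ld)[idx[, "col"]],
  r2       = ld[idx],
  distance = dist_mat[idx]
)
print(ld_df)
```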

Details

The function write.plink is called to prepare the input files and the script for PLINK. The PLINK executable plink.exe must be available (e.g. in the working directory or through path variables). The function pairwiseLD calls PLINK and reads the results. The evaluation is performed separately for every chromosome.

The measure for LD is \(r^2\). It is defined through $$D = p_{AB} - p_A p_B$$ and $$r^2=\frac{D^2}{p_A p_B p_a p_b}$$ where \(p_{AB}\) is the observed frequency of haplotype \(AB\), and \(p_A=1-p_a\) and \(p_B=1-p_b\) are the observed frequencies of alleles \(A\) and \(B\).

If the number of markers is high, a threshold for the LD can be used to thin the output. In this case, only pairwise LD above the threshold is reported (argument --ld-window-r2 in PLINK).
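To make the definition concrete, the following sketch evaluates \(D\) and \(r^2\) by hand for two markers in fully homozygous inbred lines. The genotype vectors `m1` and `m2` are made-up data coded 0/2, and the code only illustrates the formula; it is not taken from the package source and need not match how pairwiseLD computes the value internally.

```r
# Hypothetical genotypes of 8 homozygous inbred lines at two markers,
# coded 2 for alleles A / B and 0 for alleles a / b (made-up data)
m1 <- c(2, 2, 2, 2, 0, 0, 0, 2)
m2 <- c(2, 2, 2, 0, 0, 0, 0, 0)

# In inbred lines each line carries one haplotype, so allele and
# haplotype frequencies can be counted directly
p_A  <- mean(m1 == 2)            # frequency of allele A
p_B  <- mean(m2 == 2)            # frequency of allele B
p_AB <- mean(m1 == 2 & m2 == 2)  # frequency of haplotype AB

D  <- p_AB - p_A * p_B
r2 <- D^2 / (p_A * (1 - p_A) * p_B * (1 - p_B))

# For two-valued data this equals the squared Pearson correlation
r2_cor <- cor(m1, m2)^2

print(c(D = D, r2 = r2, r2_cor = r2_cor))
```

Here \(p_A = 5/8\), \(p_B = 3/8\) and \(p_{AB} = 3/8\), so \(D = 9/64\) and \(r^2 = 81/225 = 0.36\).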

Default PLINK options used: --no-parents --no-sex --no-pheno --allow-no-sex --ld-window p --ld-window-kb 99999

References

Hill WG, Robertson A (1968). Linkage Disequilibrium in Finite Populations. Theoretical and Applied Genetics, 38(6), 226-231.

Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, Maller J, Sklar P, de Bakker PIW, Daly MJ & Sham PC (2007) PLINK: a toolset for whole-genome association and population-based linkage analysis. American Journal of Human Genetics, 81.

See Also

LDDist, LDMap

Examples

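# NOT RUN {
library(synbreed)      # provides codeGeno() and pairwiseLD()
library(synbreedData)  # provides the maize example data
data(maize)
# recode the marker genotypes before estimating LD
maizeC <- codeGeno(maize)
# pairwise LD for chromosome 1, one row per marker pair
maizeLD <- pairwiseLD(maizeC, chr = 1, type = "data.frame")
# }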