kernels: Kernel functions useful for genetic associations

Description

These functions compute pairwise inter-individual similarities from single nucleotide polymorphism (SNP) genotypes.

Usage

am(x)
AM(x)
ibs(x)
IBS(x)
lin0(x)
Lin0(x)
quad1(x)
Quad1(x)
Minkowski(x, p = 1)
minkowski(x, p = 1)
polyk(x, c = 0, d = 1)
Polyk(x, c = 0, d = 1)

Arguments

x

A numeric matrix encoding genotypes. Each row corresponds to an individual and each column corresponds to a genetic marker. Allele-counting coding is typical, but other codings are allowed.

p

The exponent defining the Minkowski distance. Same as the p argument of stats::dist.

c

The constant added to cross-products before raising to the power of d.

d

The exponent defining the polynomial kernel. When c=0 and d=1, this is equivalent to lin0. When c=1 and d=2, this is equivalent to quad1.

Value

The functions starting with an upper-case letter return an n-by-n symmetric similarity matrix, where n equals nrow(x). The corresponding functions starting with a lower-case letter return a matrix L such that tcrossprod(L) equals the value from their upper-case counterparts. L has n rows, and its number of columns is the rank of the similarity matrix.
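The relationship between the two families of functions can be illustrated with a minimal base-R sketch. The helper psd_factor below is hypothetical and is not part of the package; it assumes the similarity matrix is symmetric positive semi-definite and uses an eigendecomposition, which may differ from how the lower-case functions actually obtain L.

```r
# Hypothetical helper (not from the package): factor a symmetric positive
# semi-definite matrix K as K = L %*% t(L), keeping rank(K) columns.
psd_factor <- function(K, tol = 1e-8) {
  e <- eigen(K, symmetric = TRUE)
  keep <- e$values > tol * max(e$values)      # drop numerically zero eigenvalues
  V <- e$vectors[, keep, drop = FALSE]
  sweep(V, 2, sqrt(e$values[keep]), "*")      # scale each column by sqrt(eigenvalue)
}

set.seed(1)
x <- matrix(sample(0:2, 50, replace = TRUE), nrow = 10)  # toy genotypes
K <- tcrossprod(x)            # a simple similarity matrix used only for illustration
L <- psd_factor(K)
dim(L)                        # n rows; the column count is the numerical rank of K
range(tcrossprod(L) - K)      # differences should be at rounding-error level
```

Any L with tcrossprod(L) equal to K would serve the same purpose; the eigen-based factor is just one convenient choice.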

Details

These functions compute the pairwise similarities among rows of x. Lower-case versions are more useful in the formula interface to specify random genetic effects. Upper-case versions can be used to directly compute the genetic similarity matrix.

am and AM calculate the allele-matching kernel, and AM is based on SPA3G:::KERNEL.

ibs and IBS compute the identity-by-state (IBS) kernel. IBS is computed as 1 - as.matrix(dist(x, method='manhattan') * .5 /max(1, ncol(x)) ).
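As a rough illustration, the IBS formula above can be reproduced with base R alone. The object names ibs_sketch and manual are made up for this sketch, and the toy data assume allele-count coding (0, 1, 2).

```r
set.seed(1)
# Toy data: 10 individuals (rows) by 5 markers (columns), allele counts 0/1/2
x <- matrix(sample(0:2, 50, replace = TRUE), nrow = 10)

# The formula from the text, using only stats::dist
ibs_sketch <- 1 - as.matrix(dist(x, method = "manhattan")) * 0.5 / max(1, ncol(x))

# The same quantity for a single pair of individuals, written out by hand:
# one minus the summed absolute genotype difference over twice the marker count
i <- 1; j <- 2
manual <- 1 - sum(abs(x[i, ] - x[j, ])) / (2 * ncol(x))
all.equal(unname(ibs_sketch[i, j]), manual)
```

With allele-count coding, each marker contributes at most 2 to the Manhattan distance, so the resulting similarities lie between 0 and 1, and each individual has similarity 1 with itself.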

lin0 and Lin0 compute the linear kernel with zero intercept. Lin0 is computed as normalizeTrace(tcrossprod(x)/max(1,ncol(x))).

quad1 and Quad1 compute the quadratic kernel with offset 1. Quad1 is computed as normalizeTrace((base::tcrossprod(x)+1)^2).

minkowski and Minkowski compute the similarity based on the Minkowski distance. Minkowski is computed as 1-as.matrix(dist(x, method='minkowski', p=p)) * .5 / max(1, ncol(x))^(1/p).
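The Minkowski formula above can also be sketched in base R. The function name mink_sketch is hypothetical, and the toy data assume allele-count coding (0, 1, 2). The sketch also checks that with p = 1 the result matches the Manhattan-based similarity.

```r
set.seed(1)
x <- matrix(sample(0:2, 50, replace = TRUE), nrow = 10)  # toy genotypes

# Hypothetical helper mirroring the formula in the text
mink_sketch <- function(x, p = 1) {
  d <- as.matrix(dist(x, method = "minkowski", p = p))
  1 - d * 0.5 / max(1, ncol(x))^(1 / p)
}

# With p = 1 the Minkowski distance is the Manhattan distance
manhattan_sim <- 1 - as.matrix(dist(x, method = "manhattan")) * 0.5 / max(1, ncol(x))
all.equal(mink_sketch(x, p = 1), manhattan_sim)

# For p = 2 the distance is Euclidean, rescaled by sqrt(ncol(x))
range(mink_sketch(x, p = 2))
```

Under 0/1/2 coding the largest possible Minkowski distance between two rows is 2 * ncol(x)^(1/p), which is why dividing by ncol(x)^(1/p) and halving keeps the similarity within 0 and 1.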

Examples


# NOT RUN {
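set.seed(2345432L)
# Toy genotype matrix: 10 individuals (rows) by 5 markers (columns), coded 1/2
x <- matrix(sample(2, 50L, replace = TRUE), 10L)

# In each block, range() checks that tcrossprod() of the lower-case result
# reproduces the upper-case similarity matrix (differences close to zero)
IBS(x)
range(tcrossprod(ibs(x)) - IBS(x))

AM(x)
range(tcrossprod(am(x)) - AM(x))

Lin0(x)
range(tcrossprod(lin0(x)) - Lin0(x))
# Polynomial kernel with c = 0, d = 1 is equivalent to the linear kernel
range(Lin0(x) - Polyk(x, 0, 1))

Quad1(x)
range(tcrossprod(quad1(x)) - Quad1(x))
# Polynomial kernel with c = 1, d = 2 is equivalent to the quadratic kernel
range(Quad1(x) - Polyk(x, 1, 2))

Minkowski(x)
range(tcrossprod(minkowski(x)) - Minkowski(x))
# With the default p = 1, the Minkowski similarity coincides with IBS
range(tcrossprod(minkowski(x)) - IBS(x))

## Use in formulas
model.matrix(~ 0 + ibs(x))
range(tcrossprod(model.matrix(~ 0 + ibs(x))) - IBS(x))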

# }
