synbreed (version 0.12-9)

kin: Relatedness based on pedigree or marker data

Description

This function implements different measures of relatedness between individuals in an object of class gpData: (1) expected relatedness based on pedigree and (2) realized relatedness based on marker data. See 'Details'. The first argument is an object of class gpData; the argument ret selects the type of relatedness coefficient.

Usage

kin(gpData, ret=c("add","kin","dom","gam","realized","realizedAB",
                  "sm","sm-smin","gaussian"),
            DH=NULL, maf=NULL, selfing=NULL, lambda=1, P=NULL, cores=1)

Arguments

gpData

object of class gpData

ret

character. The type of relationship matrix to be returned. See 'Details'.

DH

logical vector of length \(n\). TRUE or 1 if the individual is a doubled-haploid (DH) line and FALSE or 0 otherwise. This option is only used if ret is "add" or "kin".

maf

numeric vector with one entry per marker. Supplies the values \(p_i\) that are used to correct the allele counts for ret="realized" and ret="realizedAB". If not specified, \(p_i\) equals the minor allele frequency of each locus.

selfing

numeric vector of length \(n\) giving the number of selfing generations of each recombinant inbred line. Be aware that this should only be used for single seed descendants. This option is only used if ret is "add" or "kin".

lambda

numeric bandwidth parameter for the gaussian kernel. Only used for calculating the gaussian kernel.

P

numeric matrix of the same dimension as geno of the gpData object. This option can be used to supply user-defined allele frequencies for different groups of genotypes.

cores

numeric. The number of cores to use.

Value

An object of class "relationshipMatrix".

Details

Pedigree based relatedness (return arguments "add", "kin", "dom", and "gam")

Function kin provides different types of measures for pedigree based relatedness. An element pedigree must be available in the object of class gpData. In all cases, the first step is to build the gametic relationship. The gametic relationship is of order 2\(n\) as each individual has two alleles (e.g. individual \(A\) has alleles \(A1\) and \(A2\)). The gametic relationship is defined as the matrix of probabilities that two alleles are identical by descent (IBD). Note that the diagonal elements of the gametic relationship matrix are 1. The off-diagonals of individuals with unknown or unrelated parents in the pedigree are 0. If ret="gam" is specified, the gametic relationship matrix constructed by pedigree is returned.

The gametic relationship matrix can be used to construct other types of relationship matrices. If ret="add", the additive numerator relationship matrix is returned. The additive relationship of individuals A (alleles \(A1,A2\)) and B (alleles \(B1,B2\)) is given by the entries of the gametic relationship matrix $$0.5\cdot \left[(A1,B1) + (A1,B2) + (A2,B1) + (A2,B2)\right],$$ where \((A1,B1)\) denotes the element [A1,B1] in the gametic relationship matrix. If ret="kin", the kinship matrix is returned which is half of the additive relationship matrix.
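The step from the gametic to the additive relationship can be sketched in base R. This is an illustration only, not the code used inside kin: the three individuals (founders P and Q, offspring O), the hand-filled gametic matrix G and the helper matrix S are all assumptions made for the example.

```r
# Hypothetical gametic relationship matrix for founders P, Q and offspring O.
# Gametes are ordered individual by individual; O1 is inherited from P, O2 from Q.
ind     <- c("P", "Q", "O")
gametes <- c("P1", "P2", "Q1", "Q2", "O1", "O2")
G <- diag(6)
dimnames(G) <- list(gametes, gametes)
G["O1", c("P1", "P2")] <- 0.5
G[c("P1", "P2"), "O1"] <- 0.5
G["O2", c("Q1", "Q2")] <- 0.5
G[c("Q1", "Q2"), "O2"] <- 0.5

# S maps each gamete to its individual (2n x n), so t(S) %*% G %*% S
# sums the four gametic entries (A1,B1), (A1,B2), (A2,B1), (A2,B2).
S <- kronecker(diag(3), matrix(1, 2, 1))
dimnames(S) <- list(gametes, ind)

A <- 0.5 * t(S) %*% G %*% S   # additive numerator relationship
K <- A / 2                    # kinship is half of the additive relationship
```

In this toy pedigree the parent-offspring entries of A are 0.5 and the diagonal is 1, as expected for non-inbred individuals.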

If ret="dom", the dominance relationship matrix is returned. The dominance relationship matrix between individuals A (\(A1,A2\)) and B (\(B1,B2\)) in case of no inbreeding is given by $$\left[(A1,B1) \cdot (A2,B2) + (A1,B2) \cdot (A2,B1)\right],$$ where \((A1,B1)\) denotes the element [A1,B1] in the gametic relationship matrix.
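The dominance formula can be checked on a small example in base R. The sketch below is not the implementation in kin; the pedigree (founders P and Q with full sibs O and R), the hand-filled gametic matrix G and the helper set_ibd are assumptions made for illustration.

```r
# Hypothetical gametic relationship matrix: O and R are full sibs of P x Q.
ind     <- c("P", "Q", "O", "R")
gametes <- paste0(rep(ind, each = 2), 1:2)   # P1, P2, Q1, Q2, O1, O2, R1, R2
G <- diag(8)
dimnames(G) <- list(gametes, gametes)

# Helper (assumed): set a symmetric IBD probability between two gametes.
set_ibd <- function(G, a, b, value) {
  G[a, b] <- value
  G[b, a] <- value
  G
}
# First gamete of each sib comes from P, second from Q.
for (g in c("O1", "R1")) for (p in c("P1", "P2")) G <- set_ibd(G, g, p, 0.5)
for (g in c("O2", "R2")) for (p in c("Q1", "Q2")) G <- set_ibd(G, g, p, 0.5)
G <- set_ibd(G, "O1", "R1", 0.5)
G <- set_ibd(G, "O2", "R2", 0.5)

# Dominance relationship: (A1,B1)*(A2,B2) + (A1,B2)*(A2,B1), no inbreeding.
n <- length(ind)
D <- matrix(0, n, n, dimnames = list(ind, ind))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    a <- c(2 * i - 1, 2 * i)   # gamete rows of individual i
    b <- c(2 * j - 1, 2 * j)   # gamete rows of individual j
    D[i, j] <- G[a[1], b[1]] * G[a[2], b[2]] + G[a[1], b[2]] * G[a[2], b[1]]
  }
}
```

For this pedigree the full sibs O and R obtain a dominance relationship of 0.25, while parent and offspring obtain 0.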

Marker based relatedness (return arguments "realized", "realizedAB", "sm", "sm-smin", and "gaussian")

Function kin provides different types of measures for marker based relatedness. An element geno must be available in the object of class gpData. Furthermore, genotypes must be coded by the number of copies of the minor allele, i.e. function codeGeno must be applied in advance.

If ret="realized", the realized relatedness between individuals is computed according to the formulas in Habier et al. (2007) or VanRaden (2008) $$U = \frac{ZZ'}{2\sum p_i(1-p_i)}$$ where \(Z=W-P\), \(W\) is the marker matrix, \(P\) contains the allele frequencies multiplied by 2, \(p_i\) is the allele frequency of marker \(i\), and the sum is over all loci.
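A minimal base R sketch of this formula follows. It is not the code used inside kin; the marker matrix W (4 individuals, 3 markers, coded as copies of the minor allele) is made up, and the allele frequencies are estimated here from W itself.

```r
# Hypothetical marker matrix W: rows are individuals, columns are markers.
W <- matrix(c(0, 1, 2,
              1, 1, 0,
              2, 0, 1,
              0, 2, 1),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("ind", 1:4), paste0("M", 1:3)))

p <- colMeans(W) / 2          # allele frequency p_i of each marker
Z <- sweep(W, 2, 2 * p)       # Z = W - P, with P holding 2 * p_i per column
U <- tcrossprod(Z) / (2 * sum(p * (1 - p)))   # U = ZZ' / (2 * sum p_i (1 - p_i))
```

Because each column of Z is centered, the rows of U sum to zero when the frequencies are estimated from the same sample.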

If ret="realizedAB", the realized relatedness between individuals is computed according to the formula in Astle and Balding (2009) $$U = \frac{1}{M} \sum \frac{(w_i-2p_i)(w_i-2p_i)'}{2p_i(1-p_i)}$$ where \(w_i\) is the marker genotype, \(p_i\) is the allele frequency at marker locus \(i\), and \(M\) is the number of marker loci, and the sum is over all loci.
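The Astle and Balding formula differs from the previous one in that each locus is standardized by its own variance before averaging. A base R sketch under the same assumptions (made-up marker matrix W, frequencies estimated from W, not the code used inside kin):

```r
# Hypothetical marker matrix W: rows are individuals, columns are markers.
W <- matrix(c(0, 1, 2,
              1, 1, 0,
              2, 0, 1,
              0, 2, 1),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("ind", 1:4), paste0("M", 1:3)))
p <- colMeans(W) / 2          # allele frequency p_i of each marker
M <- ncol(W)                  # number of marker loci

# Direct form: sum over loci of (w_i - 2 p_i)(w_i - 2 p_i)' / (2 p_i (1 - p_i)).
U_loop <- matrix(0, nrow(W), nrow(W))
for (i in seq_len(M)) {
  z <- W[, i] - 2 * p[i]
  U_loop <- U_loop + tcrossprod(z) / (2 * p[i] * (1 - p[i]))
}
U_loop <- U_loop / M

# Equivalent vectorized form: scale each centered column, then cross-multiply.
Zs   <- sweep(sweep(W, 2, 2 * p), 2, sqrt(2 * p * (1 - p)), "/")
U_AB <- tcrossprod(Zs) / M
```

Both forms give the same matrix; the vectorized one avoids the explicit loop over loci.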

If ret="sm", the realized relatedness between individuals is computed according to the simple matching coefficient (Reif et al. 2005). The simple matching coefficient counts the number of shared alleles across loci. It can only be applied to homozygous inbred lines, i.e. only genotypes 0 and 2. To account for loci that are alike in state but not identical by descent (IBD), Hayes and Goddard (2008) correct the simple matching coefficient by the minimum of observed simple matching coefficients $$\frac{s-s_{min}}{1-s_{min}}$$ where \(s\) is the matrix of simple matching coefficients. This formula is used with argument ret="sm-smin".
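The two coefficients can be illustrated in base R as follows. This is a sketch, not the code used inside kin: the three inbred lines are invented, and the simple matching coefficient is taken here to be the proportion of loci with identical genotypes, which is an assumption about the scaling.

```r
# Hypothetical homozygous inbred lines, genotypes coded 0 or 2.
W <- matrix(c(0, 0, 2, 2,
              0, 2, 2, 2,
              2, 2, 0, 2),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("L1", "L2", "L3"), paste0("M", 1:4)))

M <- ncol(W)
X <- W - 1                           # recode to -1 / +1
# X X' counts matches minus mismatches, so (X X' + M) / (2 M) is the
# proportion of matching loci for each pair of lines.
s <- (tcrossprod(X) + M) / (2 * M)

# Correction by the minimum observed coefficient ("sm-smin").
s_min  <- min(s)
s_corr <- (s - s_min) / (1 - s_min)
```

After the correction, the least similar pair of lines has coefficient 0 and the diagonal remains 1.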

If ret="gaussian", the Euclidean distances distEuk between all individuals are calculated. The values of distEuk are then used to calculate similarity coefficients between the individuals with exp(-distEuk^2/numMarker). Be aware that this relationship matrix theoretically scales between 0 and 1!
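A base R sketch of a Gaussian kernel of this kind is given below. It is not the code used inside kin and rests on two assumptions: the exponent is negative, as the stated range between 0 and 1 implies, and the bandwidth lambda multiplies the scaled squared distance. The marker matrix W is made up, and any scaling of the markers that kin may apply beforehand is not reproduced.

```r
# Hypothetical marker matrix W: rows are individuals, columns are markers.
W <- matrix(c(0, 1, 2,
              1, 1, 0,
              2, 0, 1,
              0, 2, 1),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("ind", 1:4), paste0("M", 1:3)))

lambda   <- 1                                          # bandwidth (assumed role)
dist_euk <- as.matrix(dist(W, method = "euclidean"))   # pairwise Euclidean distances
K_gauss  <- exp(-lambda * dist_euk^2 / ncol(W))        # similarity in (0, 1]
```

Identical genotypes give a similarity of 1, and the similarity decays towards 0 as the distance between two individuals grows; a larger lambda makes the decay faster.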

References

Habier, D., R. Fernando, and J. Dekkers (2007). The Impact of Genetic Relationship Information on Genome-Assisted Breeding Values. Genetics, 177, 2389 -- 2397.

VanRaden, P. (2008). Efficient methods to compute genomic predictions. Journal of Dairy Science, 91, 4414 -- 4423.

Astle, W., and D.J. Balding (2009). Population Structure and Cryptic Relatedness in Genetic Association Studies. Statistical Science, 24(4), 451 -- 471.

Reif, J.C., A.E. Melchinger, and M. Frisch (2005). Genetical and mathematical properties of similarity and dissimilarity coefficients applied in plant breeding and seed bank management. Crop Science, 45(1), 1 -- 7.

Rogers, J. (1972). Measures of genetic similarity and genetic distance. In Studies in Genetics VII, volume 7213. Univ. of Texas, Austin.

Hayes, B.J., and M.E. Goddard (2008). Technical note: Prediction of breeding values using marker derived relationship matrices. J. Anim. Sci. 86

See Also

plot.relationshipMatrix

Examples

library(synbreed)
library(synbreedData)

#=========================
# (1) pedigree based relatedness
#=========================
data(maize)
# kinship matrix from the pedigree element of the gpData object
K <- kin(maize, ret="kin")
plot(K)

#=========================
# (2) marker based relatedness
#=========================
data(maize)
# genotypes must be recoded with codeGeno() before calling kin()
U <- kin(codeGeno(maize), ret="realized")
plot(U)

### Example from Legarra et al. (2009), J. Dairy Sci. 92: p. 4660
id <- 1:17
par1 <- c(0,0,0,0,0,0,0,0,1,3,5,7,9,11,4,13,13)
par2 <- c(0,0,0,0,0,0,0,0,2,4,6,8,10,12,11,15,14)
ped <- create.pedigree(id, par1, par2)
gp <- create.gpData(pedigree=ped)

# additive relationship
A <- kin(gp, ret="add")
# dominance relationship
D <- kin(gp, ret="dom")
