bWGR (version 2.2.10)

WGR2 (EM): Expectation-Maximization WGR

Description

Univariate whole-genome regression models for estimating breeding values, fitted via expectation-maximization and implemented in C++.

Usage

emRR(y, gen, df = 10, R2 = 0.5)
emBA(y, gen, df = 10, R2 = 0.5)
emBB(y, gen, df = 10, R2 = 0.5, Pi = 0.75)
emBC(y, gen, df = 10, R2 = 0.5, Pi = 0.75)
emBCpi(y, gen, df = 10, R2 = 0.5, Pi = 0.75)
emBL(y, gen, R2 = 0.5, alpha = 0.02)
emEN(y, gen, R2 = 0.5, alpha = 0.02)
emDE(y, gen, R2 = 0.5)
emML(y, gen, D = NULL)
lasso(y, gen)

emCV(y, gen, k = 5, n = 5, Pi = 0.75, alpha = 0.02, df = 10, R2 = 0.5, avg = TRUE, llo = NULL, tbv = NULL, ReturnGebv = FALSE)

Value

The EM functions return a list with the intercept (\(mu\)), the regression coefficients (\(b\)), the fitted values (\(hat\)), and the estimated intraclass correlation (\(h2\)).

The function emCV returns the predictive ability of each model, that is, the correlation between the predicted and observed values from \(k\)-fold cross-validations repeated \(n\) times.
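What "predictive ability" means can be illustrated with a minimal base-R sketch. This is not the package's implementation: `ridge_fit` is a hypothetical stand-in solver with a fixed penalty `lambda`, the data are simulated, and pooling the predictions from all folds before correlating is an assumption made here for simplicity.

```r
set.seed(1)
n_obs <- 120; n_mrk <- 40                      # simulated sizes (assumption)
gen <- matrix(sample(c(0, 1, 2), n_obs * n_mrk, replace = TRUE), n_obs, n_mrk)
y   <- drop(gen %*% rnorm(n_mrk, sd = 0.3)) + rnorm(n_obs)

# Hypothetical stand-in for an EM solver: ridge regression with fixed lambda
ridge_fit <- function(y, X, lambda = 10) {
  mu <- mean(y)                                # intercept from training data
  xm <- colMeans(X)                            # training marker means
  Xc <- sweep(X, 2, xm)                        # centered markers
  b  <- solve(crossprod(Xc) + diag(lambda, ncol(X)), crossprod(Xc, y - mu))
  list(mu = mu, b = drop(b), xm = xm)
}

# Predictive ability: cor(predicted, observed) from k folds, repeated n times
cv_ability <- function(y, X, k = 5, n = 5) {
  replicate(n, {
    fold <- sample(rep(seq_len(k), length.out = length(y)))
    pred <- numeric(length(y))
    for (f in seq_len(k)) {
      test <- fold == f
      fit  <- ridge_fit(y[!test], X[!test, , drop = FALSE])
      Xt   <- sweep(X[test, , drop = FALSE], 2, fit$xm)
      pred[test] <- fit$mu + drop(Xt %*% fit$b)
    }
    cor(pred, y)
  })
}

pa <- cv_ability(y, gen, k = 3, n = 3)
mean(pa)   # average across repetitions, analogous to avg = TRUE
```

The vector `pa` holds one correlation per repetition, which is the kind of per-CV output one would expect with avg = FALSE.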

Arguments

y

Numeric vector of the response variable (length \(n\)). NAs are not allowed.

gen

Numeric matrix containing the genotypic data. A matrix with \(n\) rows of observations and \(m\) columns of molecular markers.

df

Hyperprior degrees of freedom of variance components.

R2

Expected R2, used to calculate the prior shape (de los Campos et al. 2013).

Pi

Value between 0 and 1. Expected probability pi of having null effect (or 1-Pi if Pi>0.5).

alpha

Value between 0 and 1. Intensity of L1 variable selection.

D

NULL or numeric vector of length \(m\). Vector of weights for markers, one per column of gen.

k

Integer. Number of folds in the k-fold cross-validation.

n

Integer. Number of times the cross-validation is repeated.

avg

Logical. If TRUE, return the average across cross-validations; if FALSE, return the correlations within each cross-validation.

llo

NULL or a vector (numeric or factor) with the same length as y. If provided, the cross-validations are performed as Leave a Level Out (LLO). This argument allows the user to predefine the splits. This argument overrides k and n.

tbv

NULL or numeric vector of 'true breeding values' (length \(n\)) against which the cross-validation predictions are compared. If NULL, the phenotypes are used as the prediction target.

ReturnGebv

Logical. If TRUE, it returns a list with the average marker values and fitted values across all cross-validations, in addition to the regular output.
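The difference between random folds (k, n) and predefined Leave a Level Out splits (llo) can be sketched in base R. The grouping factor `fam` and the toy phenotypes below are hypothetical; the code only shows how such a vector partitions the observations, not how emCV processes it internally.

```r
set.seed(2)
y   <- rnorm(12)                                    # toy phenotypes (assumption)
fam <- factor(rep(c("F1", "F2", "F3"), each = 4))   # hypothetical grouping, e.g. family

# Random k-fold: fold labels are shuffled across observations
kfold <- sample(rep(1:3, length.out = length(y)))

# Leave a Level Out: every level of the grouping vector is held out once
splits <- lapply(levels(fam), function(lv) which(fam == lv))
names(splits) <- levels(fam)
splits$F2   # observations 5 to 8 are predicted when level "F2" is left out
```

A vector like `fam` is the kind of object that would be passed as llo, in which case k and n are ignored as described above.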

Author

Alencar Xavier

Details

The model for the whole-genome regression is as follows:

$$y = mu + Xb + e$$

where \(y\) is the response variable, \(mu\) is the intercept, \(X\) is the genotypic matrix, \(b\) is the effect of an allele substitution (or regression coefficient) and \(e\) is the residual term. A k-fold cross-validation for model evaluation is provided by \(emCV\).
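As a rough illustration of this model, the sketch below simulates data from y = mu + Xb + e in base R. All sizes, effect variances and the 0/1/2 marker coding are assumptions chosen for the example; the residuals are rescaled so that Xb explains half of the variance, mirroring the default R2 = 0.5. The final call to emRR is optional and only runs if bWGR is installed.

```r
set.seed(3)
n_obs <- 200; n_mrk <- 50                      # simulated sizes (assumption)
X  <- matrix(sample(c(0, 1, 2), n_obs * n_mrk, replace = TRUE), n_obs, n_mrk)
mu <- 10                                       # intercept
b  <- rnorm(n_mrk, sd = 0.2)                   # allele substitution effects
g  <- drop(X %*% b)                            # genetic values Xb
e  <- rnorm(n_obs)
e  <- e * sd(g) / sd(e)                        # rescale so var(e) equals var(Xb)
y  <- mu + g + e                               # y = mu + Xb + e

r2 <- var(g) / (var(g) + var(e))               # 0.5 by construction
r2

# Optional: fit the simulated data with the package, if available
if (requireNamespace("bWGR", quietly = TRUE)) {
  fit <- bWGR::emRR(y, X)
  print(cor(fit$hat, y))                       # agreement of fitted and observed values
}
```

Comparing `fit$b` with the simulated `b` gives a quick check of how well the marker effects are recovered under these assumptions.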

Examples

if (FALSE) {
  library(bWGR)
  data(tpod)
  # 3-fold cross-validation repeated 3 times
  emCV(y, gen, k = 3, n = 3)
}
