kspm: Fitting Kernel Semi Parametric model

Description

kspm is used to fit kernel semi parametric models.

Usage

kspm(response, linear = NULL, kernel = NULL, data = NULL,
  level = 1, control = kspmControl())

Arguments

response

a character string giving the name of the response variable, a vector containing the outcome, or a matrix with the outcome in the first column.

linear

an optional object of class "formula": a symbolic description of the linear part of the model to be fitted, or a vector or a matrix containing the covariates included in the linear part of the model. Default is intercept only. The details of model specification are given under 'Details'.

kernel

an object of class "formula": a symbolic description of the kernel part of the model to be fitted. If missing, a linear model is fitted using the lm function. The details of model specification are given under 'Details'.

data

an optional data frame containing the variables in the model. If NULL (default), data are taken from the workspace.

level

amount of information printed about the model (0: no information, 1: information about the kernels included in the model (default))

control

see kspmControl.

Value

kspm returns an object of class kspm.

An object of class kspm is a list containing the following components:

linear.coefficients

matrix of coefficients associated with the linear part; the number of coefficients is the number of terms included in the linear part

kernel.coefficients

matrix of coefficients associated with the kernel part; the number of rows is the sample size included in the analysis and the number of columns is the number of kernels included in the model

lambda

penalization parameter(s)

fitted.values

the fitted mean values

residuals

the residuals, that is response minus the fitted values

sigma

standard deviation of residuals

Y

vector of responses

X

design matrix for linear part

K

kernel matrices computed by the model

n.total

total sample size

n

sample size of the model (model is performed on complete data only)

edf

effective degrees of freedom

linear.formula

formula corresponding to the linear part of the model

kernel.info

information about kernels included in the model such as matrices of covariates (Z), kernel function (type), values of hyperparameters (rho, gamma, d). A boolean indicates if covariates were scaled (kernel.scale) and if TRUE, kernel.mean, kernel.sd and Z.scale give information about scaling. kernel.formula indicates the formula of the kernel and free.parameters indicates the hyperparameters that were estimated by the model.

Hat

The hat matrix \(H\) such that \(\hat{Y} = HY\)

L

A matrix corresponding to \(I - \sum\limits_{\ell = 1}^L K_{\ell} G_{\ell}^{-1} M_{\ell}\) according to our notations

XLX_inv

A matrix corresponding to \((XLX)^{-1}\)

GinvM

A list of matrices, each corresponding to a kernel and equal to \(G_{\ell}^{-1}M_{\ell}\) according to our notations

control

List of control parameters
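
To make the relationship between several of these components concrete, here is a minimal base-R sketch of a single-kernel fit with a fixed penalty. It is not the code used inside kspm: the closed-form solution follows Liu et al. (2007), the penalty value lambda = 1 is hypothetical (kspm estimates it), and the polynomial kernel form \((\rho\, z_i^\top z_j + \gamma)^d\) as well as the scaling of the kernel covariates are assumptions made for illustration.

```r
# Base-R sketch of a single-kernel semiparametric fit with a FIXED penalty.
# This illustrates the algebra only; it is not the KSPM implementation.
set.seed(42)
n  <- 15
x  <- 1:n
z1 <- runif(n, 1, 6)
z2 <- rnorm(n, 1, 2)
y  <- 3 * x + (z1 + z2)^2 + rnorm(n, 0, 2)

X <- cbind(1, x)              # design matrix of the linear part (with intercept)
Z <- scale(cbind(z1, z2))     # kernel covariates, scaled (assumption)
K <- tcrossprod(Z)^2          # assumed polynomial kernel with rho = 1, gamma = 0, d = 2
lambda <- 1                   # hypothetical fixed penalization parameter

# Closed-form penalized least-squares solution (Liu et al., 2007)
V_inv <- solve(K + lambda * diag(n))
B     <- solve(t(X) %*% V_inv %*% X, t(X) %*% V_inv)  # maps y to the linear coefficients
A     <- V_inv %*% (diag(n) - X %*% B)                # maps y to the kernel coefficients
beta  <- B %*% y              # analogous to linear.coefficients
alpha <- A %*% y              # analogous to kernel.coefficients (one column per kernel)

Hat    <- X %*% B + K %*% A   # hat matrix H such that fitted = H y
fitted <- as.vector(Hat %*% y)
res    <- y - fitted          # residuals: response minus fitted values
```

In this sketch the fitted values equal \(X\hat\beta + K\hat\alpha\), and the residuals equal \(\lambda\hat\alpha\), which is a convenient sanity check on the algebra.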

Details

The kernel semi parametric model refers to the equation \(Y_i = X_i\beta + h(Z_i) + e_i\) for \(i = 1, \ldots, n\), where \(n\) is the sample size, \(Y\) is the univariate response, \(X\beta\) is the linear part, \(h(Z)\) is the kernel part and \(e\) are the residuals.

The linear part is defined through the linear argument by specifying the covariates \(X\). It can be a formula, a vector of length \(n\) if only one variable is included in the linear part, or an \(n \times p\) design matrix containing the values of the \(p\) covariates (columns) for each individual (rows). By default, an intercept is included. To remove the intercept term, use the formula specification and add the term -1, as usual.

The kernel part is defined through the kernel argument. It should be a formula of Kernel object(s). For a multiple kernel semi parametric model, Kernel objects are separated by the usual signs "+", "*" and ":" to specify addition and interaction between kernels. The specification format may differ from one Kernel object to another. See Kernel for more information about their specification.
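
As an illustration of how the accepted forms of the linear part correspond to a design matrix, the following base-R sketch uses model.matrix on a small hypothetical data frame. Whether kspm builds its design matrix this way internally is an assumption; the sketch only shows the usual R formula conventions referred to above, including the effect of the -1 term.

```r
# Design matrices implied by different specifications of a linear part.
# The data frame d and its columns x1, x2 are hypothetical.
d <- data.frame(x1 = c(1, 2, 3, 4), x2 = c(0, 1, 0, 1))

X_formula <- model.matrix(~ x1 + x2, data = d)      # formula: intercept included by default
X_no_int  <- model.matrix(~ x1 + x2 - 1, data = d)  # "-1" removes the intercept
X_vector  <- cbind(1, d$x1)                         # single covariate given as a vector of length n
X_matrix  <- cbind(1, as.matrix(d))                 # n x p covariate matrix plus an intercept column

colnames(X_formula)  # "(Intercept)" "x1" "x2"
colnames(X_no_int)   # "x1" "x2"
```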

References

Liu, D., Lin, X., and Ghosh, D. (2007). Semiparametric regression of multidimensional genetic pathway data: least squares kernel machines and linear mixed models. Biometrics, 63(4), 1079-1088.

Kim, Choongrak, Byeong U. Park, and Woochul Kim. "Influence diagnostics in semiparametric regression models." Statistics and probability letters 60.1 (2002): 49-58.

Oualkacha, Karim, et al. "Adjusted sequence kernel association test for rare variants controlling for cryptic and family relatedness." Genetic epidemiology 37.4 (2013): 366-376.

Examples


# NOT RUN {
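library(KSPM)
set.seed(1)  # make the simulated data reproducible

# Simulated data: one linear covariate (x) and two kernel covariates (z1, z2)
x <- 1:15
z1 <- runif(15, 1, 6)
z2 <- rnorm(15, 1, 2)
y <- 3*x + (z1 + z2)^2 + rnorm(15, 0, 2)

# Linear part in x, polynomial kernel of degree 2 on (z1, z2)
fit <- kspm(y, linear = ~ x, kernel = ~ Kernel(~ z1 + z2,
  kernel.function = "polynomial", d = 2, rho = 1, gamma = 0))
summary(fit)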

# }
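
A multiple kernel model, as described under Details, can be sketched in the same way. The example below is a hedged illustration rather than an official one: the simulated data are hypothetical, the choice of a linear kernel for z1 and a gaussian kernel with rho = 1 for z2 is arbitrary, and the call is guarded so that it only runs when the KSPM package is installed.

```r
# Sketch of a two-kernel model; data and kernel choices are hypothetical.
set.seed(2)
n  <- 30
x  <- rnorm(n)
z1 <- runif(n, 1, 6)
z2 <- rnorm(n, 1, 2)
y  <- 2 * x + z1 + sin(z2) + rnorm(n, 0, 1)

# Two Kernel objects joined by "+" (additive kernel effects, no interaction)
kernel_formula <- ~ Kernel(~ z1, kernel.function = "linear") +
  Kernel(~ z2, kernel.function = "gaussian", rho = 1)

if (requireNamespace("KSPM", quietly = TRUE)) {
  library(KSPM)
  fit2 <- kspm(y, linear = ~ x, kernel = kernel_formula)
  summary(fit2)
}
```

Replacing "+" with "*" in the kernel formula would additionally request the interaction between the two kernels.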
