kerndwd: solve Linear DWD and Kernel DWD

Description

Fits the linear generalized distance weighted discrimination (DWD) model and the generalized DWD in a reproducing kernel Hilbert space. The solution path is computed over a grid of values of the tuning parameter lambda.

Usage

kerndwd(x, y, kern, lambda, qval=1, wt, eps=1e-05, maxit=1e+05)

Arguments

x

A numerical matrix with $N$ rows and $p$ columns for predictors.

y

A vector of length $N$ for binary responses. Each element of y is either -1 or 1.

kern

A kernel function; see dots.

lambda

A user supplied lambda sequence.

qval

The exponent index of the generalized DWD. Default value is 1.

wt

A vector of length $N$ for weight factors. When wt is missing or wt=NULL, an unweighted DWD is fitted.

eps

The algorithm stops when $\sum_j(\beta_j^{new}-\beta_j^{old})^2$ is less than eps, where $j=0,\ldots, p$. Default value is 1e-5.

maxit

The maximum number of iterations allowed. Default is 1e5.

Value

An object with S3 class kerndwd.

alpha

A matrix of DWD coefficients at each lambda value. The dimension is (p+1)*length(lambda) in the linear case and (N+1)*length(lambda) in the kernel case.

lambda

The lambda sequence.

npass

Total number of MM iterations for all lambda values.

jerr

Warnings and errors; 0 if none.

info

A list including parameters of the loss function, eps, maxit, kern, and wt if a weight vector was used.

call

The call that produced this object.

Details

Suppose that the generalized DWD loss is $V_q(u)=1-u$ if $u \le q/(q+1)$ and $\frac{1}{u^q}\frac{q^q}{(q+1)^{(q+1)}}$ if $u > q/(q+1)$. The value of $\lambda$, i.e., lambda, is user-specified.
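The piecewise definition above can be written as a short base R function. This is only an illustrative sketch: `dwd_loss` is a hypothetical helper name and is not part of the kerndwd package.

```r
# Generalized DWD loss V_q(u), transcribed from the piecewise definition.
# `dwd_loss` is a hypothetical helper, not a function exported by kerndwd.
dwd_loss <- function(u, qval = 1) {
  threshold <- qval / (qval + 1)
  tail_const <- qval^qval / (qval + 1)^(qval + 1)
  # Start with the linear branch, then overwrite entries where u > threshold.
  # The tail branch is only evaluated for positive u, so 1 / u^q is safe.
  out <- 1 - u
  in_tail <- u > threshold
  out[in_tail] <- tail_const / u[in_tail]^qval
  out
}

# With qval = 1 the threshold is 0.5 and the two branches meet there.
loss_values <- dwd_loss(c(-1, 0, 0.5, 1, 2), qval = 1)
loss_values  # values: 2, 1, 0.5, 0.25, 0.125
```

The loss is linear for small margins and decays polynomially for large ones, so correctly classified points far from the boundary still contribute a small positive loss.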

In the linear case (kern is the inner product and N > p), kerndwd fits a linear DWD by minimizing the L2 penalized DWD loss function, $$\frac{1}{N}\sum_{i=1}^N V_q(y_i(\beta_0 + X_i'\beta)) + \lambda \beta' \beta.$$

If a linear DWD is fitted when N < p, a kernel DWD with the linear kernel is actually solved. In this case, the coefficient vector $\beta$ can be recovered from $\beta = X'\alpha.$
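The relation $\beta = X'\alpha$ can be checked numerically on a toy example. The sketch below uses made-up values for `X` and `alpha` (they are not output from kerndwd) and assumes `alpha` holds only the non-intercept coefficients, one per observation.

```r
# Toy data with N = 2 observations and p = 3 predictors (N < p).
# X and alpha are made-up illustrative values, not a kerndwd fit.
X <- matrix(c(1, 0,
              0, 1,
              2, -1), nrow = 2)
alpha <- c(0.5, -0.5)

K <- tcrossprod(X)                  # linear kernel matrix, K = X X'
beta <- drop(crossprod(X, alpha))   # beta = X' alpha

# Both parameterizations give the same decision values and the same penalty.
linear_scores <- drop(X %*% beta)
kernel_scores <- drop(K %*% alpha)
all.equal(linear_scores, kernel_scores)
all.equal(sum(beta^2), drop(t(alpha) %*% K %*% alpha))
```

Since $K\alpha = XX'\alpha = X\beta$ and $\alpha' K \alpha = \beta'\beta$, the kernel objective with the linear kernel coincides with the linear objective at $\beta = X'\alpha$.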

In the kernel case, kerndwd fits a kernel DWD by minimizing $$\frac{1}{N}\sum_{i=1}^N V_q(y_i(\beta_0 + K_i' \alpha)) + \lambda \alpha' K \alpha,$$ where $K$ is the kernel matrix and $K_i$ is its ith row.

The weighted linear DWD and the weighted kernel DWD are formulated as follows, $$\frac{1}{N}\sum_{i=1}^N w_i \cdot V_q(y_i(\beta_0 + X_i'\beta)) + \lambda \beta' \beta,$$ $$\frac{1}{N}\sum_{i=1}^N w_i \cdot V_q(y_i(\beta_0 + K_i' \alpha)) + \lambda \alpha' K \alpha,$$ where $w_i$ is the ith element of wt. The choice of weight factors is discussed in Qiao et al. (2010), listed in the references below.
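To make the objectives above concrete, the following base R sketch evaluates the (optionally weighted) linear and kernel objectives at given coefficients. It does not reproduce the fitting algorithm used by kerndwd; the function names and the toy data are hypothetical and chosen only for illustration.

```r
# Hypothetical helpers for illustration; none of these names come from kerndwd.
dwd_loss <- function(u, qval = 1) {
  thr <- qval / (qval + 1)
  # pmax() keeps the unused tail branch finite when u <= thr
  ifelse(u <= thr, 1 - u,
         (qval^qval / (qval + 1)^(qval + 1)) / pmax(u, thr)^qval)
}

linear_dwd_objective <- function(X, y, beta0, beta, lambda, qval = 1, wt = NULL) {
  if (is.null(wt)) wt <- rep(1, nrow(X))          # unweighted DWD
  margins <- y * (beta0 + drop(X %*% beta))
  mean(wt * dwd_loss(margins, qval)) + lambda * sum(beta^2)
}

kernel_dwd_objective <- function(K, y, beta0, alpha, lambda, qval = 1, wt = NULL) {
  if (is.null(wt)) wt <- rep(1, nrow(K))          # unweighted DWD
  margins <- y * (beta0 + drop(K %*% alpha))
  mean(wt * dwd_loss(margins, qval)) + lambda * drop(t(alpha) %*% K %*% alpha)
}

# Toy data (made up): N = 4 observations, p = 2 predictors
X <- matrix(c(1, 0, -1, 0,
              0, 1, 0, -1), ncol = 2)
y <- c(1, 1, -1, -1)
wt <- c(1, 2)[factor(y)]   # weight 1 for class -1, weight 2 for class 1

obj_linear   <- linear_dwd_objective(X, y, beta0 = 0, beta = c(1, 1), lambda = 0.5)
obj_weighted <- linear_dwd_objective(X, y, beta0 = 0, beta = c(1, 1), lambda = 0.5, wt = wt)
# Linear kernel with alpha chosen so that X' alpha = (1, 1)
obj_kernel   <- kernel_dwd_objective(tcrossprod(X), y, beta0 = 0,
                                     alpha = c(0.5, 0.5, -0.5, -0.5), lambda = 0.5)
c(obj_linear, obj_weighted, obj_kernel)
```

Here every margin equals 1, so each unweighted loss term is 0.25; the weights only rescale the loss part of the objective and leave the penalty unchanged.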

References

Wang, B. and Zou, H. (2018) "Another Look at Distance Weighted Discrimination," Journal of the Royal Statistical Society, Series B, 80(1), 177--198. https://rss.onlinelibrary.wiley.com/doi/10.1111/rssb.12244

Karatzoglou, A., Smola, A., Hornik, K., and Zeileis, A. (2004) "kernlab -- An S4 Package for Kernel Methods in R," Journal of Statistical Software, 11(9), 1--20. https://www.jstatsoft.org/v11/i09/paper

Friedman, J., Hastie, T., and Tibshirani, R. (2010) "Regularization paths for generalized linear models via coordinate descent," Journal of Statistical Software, 33(1), 1--22. https://www.jstatsoft.org/v33/i01/paper

Marron, J.S., Todd, M.J., and Ahn, J. (2007) "Distance-Weighted Discrimination," Journal of the American Statistical Association, 102(408), 1267--1271. https://www.tandfonline.com/doi/abs/10.1198/016214507000001120

Qiao, X., Zhang, H., Liu, Y., Todd, M., and Marron, J.S. (2010) "Weighted distance weighted discrimination and its asymptotic properties," Journal of the American Statistical Association, 105(489), 401--414. https://www.tandfonline.com/doi/abs/10.1198/jasa.2010.tm08487

Examples

# NOT RUN {
library(kerndwd)  # provides kerndwd(), the kernel functions, and the BUPA data
data(BUPA)
# standardize the predictors
BUPA$X = scale(BUPA$X, center=TRUE, scale=TRUE)

# a grid of tuning parameters
lambda = 10^(seq(3, -3, length.out=10))

# fit a linear DWD
kern = vanilladot()
DWD_linear = kerndwd(BUPA$X, BUPA$y, kern,
  qval=1, lambda=lambda, eps=1e-5, maxit=1e5)

# fit a DWD using Gaussian kernel
kern = rbfdot(sigma=1)
DWD_Gaussian = kerndwd(BUPA$X, BUPA$y, kern,
  qval=1, lambda=lambda, eps=1e-5, maxit=1e5)

# fit a weighted kernel DWD
kern = rbfdot(sigma=1)
weights = c(1, 2)[factor(BUPA$y)]
DWD_wtGaussian = kerndwd(BUPA$X, BUPA$y, kern,
  qval=1, lambda=lambda, wt = weights, eps=1e-5, maxit=1e5)
# }
