Logistic Regression with Lasso-Like Penalties
RLR(X, Y, D, lambda, ...)
A list with components:
- a vector of coefficients
- the log-likelihood value at the solution
- the return status from the Mosek optimizer
a design matrix for the unconstrained logistic regression model
a response vector of Boolean values, or an n by 2 matrix of binomial counts as in glm
a matrix specifying the penalty; D = diag(ncol(X)) gives the conventional
lasso penalty
a scalar specifying the intensity of one's belief in the prior. No provision for automatic selection has been made (yet).
other parameters passed to control optimization. These may include
rtol, the relative tolerance for the dual gap convergence criterion,
and verb, which controls the verbosity desired from Mosek: verb = 0
is quiet, while verb = 5 produces a fairly detailed iteration log. See the
documentation for KWDual for further details.
Roger Koenker with crucial help from Michal Adamaszek of Mosek ApS
In some logistic regression problems, especially those with a large number of fixed effects
like the Bradley-Terry rating model, it may be plausible to consider groups of effects that
would be considered equivalence classes. One way to implement such prior information is to
impose some form of regularization penalty. In the general formulation we are trying to
solve the problem:
$$ \min_\theta \; -\ell (\theta \mid X, y) + \lambda \| D \theta \|_1. $$
For example, in the Bradley-Terry rating model, we may consider penalties of the form
$$ \| D \theta \|_1 = \sum_{i < j} |\theta_i - \theta_j |, $$
so differences in all pairs of ratings are pulled together. This form of the penalty
has been used by Hocking et al. (2011) for clustering, by Masarotto and Varin (2012)
for estimation of the Bradley-Terry model, and by Gu and Volgushev (2019) for grouping
fixed effects in panel data models. The optimization is carried out by
Mosek, so the Rmosek package and Mosek itself must be available at run time.
demo(RLR1) illustrates use with the conventional lasso penalty and produces a
lasso shrinkage plot. demo(RLR2) illustrates use with the ranking/grouping
lasso penalty and produces a plot showing how the number of groups shrinks as lambda rises.
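As a minimal sketch, the pairwise-difference penalty matrix D used by the ranking/grouping lasso can be constructed as below; the final fitting call is illustrative only (X and y are assumed stand-ins, and Rmosek plus Mosek must be installed for RLR to run):

```r
p <- 4                         # number of ratings (illustrative)
pairs <- t(combn(p, 2))        # all pairs with i < j
D <- matrix(0, nrow(pairs), p)
D[cbind(seq_len(nrow(pairs)), pairs[, 1])] <-  1
D[cbind(seq_len(nrow(pairs)), pairs[, 2])] <- -1
## Each row of D %*% theta is theta_i - theta_j for one pair (i, j),
## so ||D theta||_1 is the ranking/grouping lasso penalty above.
## fit <- RLR(X, y, D, lambda = 1)   # not run: requires Rmosek and Mosek
```

With D = diag(ncol(X)) instead, the same call reduces to the conventional lasso penalty.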
Gu, J. and Volgushev, S. (2019), `Panel data quantile regression with grouped fixed effects', Journal of Econometrics, 213, 68--91.
Hocking, T. D., Joulin, A., Bach, F. and Vert, J.-P. (2011), `Clusterpath: an algorithm for clustering using convex fusion penalties', Proceedings of the 28th International Conference on International Conference on Machine Learning, 745--752.
Masarotto, G. and Varin, C. (2012), `The ranking lasso and its application to sport tournaments', The Annals of Applied Statistics, 6, 1949--1970.