irglm: fit a robust generalized linear model

Description

Fit a robust GLM where the loss function is the composite function cfun o dfun, that is, the concave function cfun applied to the convex component dfun.

Usage

# S3 method for formula
irglm(formula, data, weights, offset=NULL, contrasts=NULL,
 cfun="ccave", dfun=gaussian(), s=NULL, delta=0.1, fk=NULL, init.family=NULL,
 iter=10, reltol=1e-5, theta, x.keep=FALSE, y.keep=TRUE, trace=FALSE, ...)

Value

An object with S3 classes "irglm" and "glm". Its components include:

call: the call that produced the model fit
weights: original weights used in the model
weights_update: weights in the final iteration of the IRGLM algorithm
cfun, s, dfun: original input arguments
is.offset: logical, whether an offset was used

Arguments

formula

symbolic description of the model, see details.

data

argument controlling formula processing via model.frame.

weights

optional numeric vector of weights.

x

input matrix, of dimension nobs x nvars; each row is an observation vector

y

response variable. Quantitative for dfun=1 and -1/1 for classification.

contrasts

the contrasts corresponding to levels from the respective models

offset

this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be NULL or a numeric vector of length equal to the number of cases. Currently only one offset term can be included in the formula.

cfun

character, type of convex cap (concave) function.
Valid options are:

"hcave"
"acave"
"bcave"
"ccave"
"dcave"
"ecave"
"gcave"
"tcave"

dfun

character, type of convex component.
Valid options are:

init.family

character value for the initial family, one of "clossR", "closs", "gloss", "qloss". It is used to derive an initial estimator when a selection other than the default is wanted.

s

tuning parameter of cfun. s > 0 in general; s can be equal to 0 for cfun="tcave". If s is too close to 0 for cfun="acave", "bcave" or "ccave", the calculated weights can become 0 for all observations, which crashes the program.

delta

a small positive number, supplied by the user only if cfun="gcave" and 0 < s < 1

fk

predicted values at an iteration in the IRGLM algorithm

iter

number of iterations in the IRGLM algorithm

reltol

convergence criterion in the IRGLM algorithm

theta

an overdispersion scaling parameter for family=negbin()

x.keep, y.keep

logical values indicating whether the model matrix and the response vector used in the fitting process should be returned as components of the returned value. x is a design matrix of dimension n * p, and y is a vector of observations of length n.

trace

if TRUE, fitting progress is reported

...

other arguments passed to irglm

Author

Zhu Wang <zwang145@uthsc.edu>

Details

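A robust linear, logistic or Poisson regression model is fitted by the iteratively reweighted GLM (IRGLM) algorithm. The output weights_update is a useful diagnostic for the outlier status of the observations: observations that receive small weights in the final iteration are the ones the fit treats as outlying.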
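
The reweighting idea can be illustrated in base R without the package. The following is only a minimal sketch under stated assumptions, not the IRGLM implementation: the concave function g(z) = s * (1 - exp(-z / s)) and the tuning value s = 1 are chosen here for illustration and are not claimed to match any particular cfun option. Each pass fits a weighted Gaussian GLM, evaluates the convex loss per observation, and sets the new weights to the derivative of the assumed concave function at that loss.

```r
# Illustrative reweighting loop in base R (not the package's code).
# Assumed concave function: g(z) = s * (1 - exp(-z / s)), so g'(z) = exp(-z / s).
set.seed(1)
n <- 100
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n, sd = 0.5)
y[1:5] <- y[1:5] + 15            # contaminate the first five responses

s <- 1                            # hypothetical tuning parameter
w <- rep(1, n)                    # start from equal weights
dat <- data.frame(x = x, y = y, w = w)
for (k in 1:20) {
  dat$w <- w
  fit <- glm(y ~ x, family = gaussian(), data = dat, weights = w)
  z <- (y - fitted(fit))^2 / 2    # convex component: Gaussian loss per observation
  w_new <- exp(-z / s)            # derivative of the assumed concave function
  converged <- max(abs(w_new - w)) < 1e-5
  w <- w_new
  if (converged) break
}
coef(fit)                         # coefficients from the last weighted fit
order(w)[1:5]                     # indices of the five smallest final weights
```

In this sketch the contaminated observations end up with weights close to 0, which mirrors how weights_update is meant to be read.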

References

Zhu Wang (2024) Unified Robust Estimation, Australian & New Zealand Journal of Statistics. 66(1):77-102.

Examples

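# simulate 100 observations with 20 predictors and a -1/1 response
set.seed(1)
x <- matrix(rnorm(100*20), 100, 20)
g2 <- sample(c(-1, 1), 100, replace=TRUE)
fit <- irglm(g2~x, data=data.frame(cbind(x, g2)), s=1, cfun="ccave", dfun=gaussian())
# weights from the final IRGLM iteration
fit$weights_update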
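
As a further hedged sketch, weights_update can be used to look for outliers in contaminated regression data. The simulated data, the variable names x1, x2 and y, and the size of the contamination are made up for illustration. The call assumes that irglm is provided by the mpath package, which this page does not state. It uses only arguments listed under Usage.

```r
# Assumes irglm() is provided by the mpath package (an assumption, not stated above).
library(mpath)

set.seed(123)
n <- 100
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- 1 + 2 * dat$x1 - dat$x2 + rnorm(n)
dat$y[1:5] <- dat$y[1:5] + 20    # shift five responses to act as outliers

fit <- irglm(y ~ x1 + x2, data = dat, s = 1, cfun = "ccave", dfun = gaussian())

w <- fit$weights_update          # weights from the final IRGLM iteration
head(order(w), 5)                # observations with the smallest final weights
```

Observations listed by the last line are candidates for closer inspection; how small a weight must be to flag an outlier is a judgment call that depends on cfun and s.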
