Learn R Programming

robustgam (version 0.1.7)

robustgam: Pseudo Data Algorithm for Robust Estmation of GAMs

Description

This function implements a fast and stable algorithm developed in Wong, Yao and Lee (2013) for robust estimation of generalized additive models. Currently, this implementation only covers binomial and poisson distributions. This function does not choose smoothing parameter by itself and the user has to specify it manunally. For implementation with automatic smoothing parameter selection, see robustgam.GIC, robustgam.GIC.optim, robustgam.CV. For prediction, see pred.robustgam.

Usage

robustgam(X, y, family, p=3, K=30, c=1.345, sp=-1, show.msg=FALSE, count.lim=200, w.count.lim=50, smooth.basis="tp", wx=FALSE)

Arguments

X
a vector or a matrix (each covariate form a column) of covariates
y
a vector of responses
family
A family object specifying the distribution and the link function. See glm and family.
p
order of the basis. It depends on the option of smooth.basis.
K
number of knots of the basis; dependent on the option of smooth.basis.
c
tunning parameter for Huber function; a smaller value of c corresponds to a more robust fit. It is recommended to set as 1.2 and 1.6 for binomial and poisson distribution respectively.
sp
a vector of smoothing parameter. If only one value is specified, it will be used for all smoothing parameters.
show.msg
If show.msg=T, progress is displayed.
count.lim
maximum number of iterations of the whole algorithm
w.count.lim
maximum number of updates on the weight. It corresponds to zeta in Wong, Yao and Lee (2013)
smooth.basis
the specification of basis. Four choices are available: "tp" = thin plate regression spline, "cr" = cubic regression spline, "ps" = P-splines, "tr" = truncated power spline. For more details, see smooth.construct.
wx
If wx=T, robust weight on the covariates are applied. For details, see Real Data Example in Wong, Yao and Lee (2013)

Value

fitted.values
fitted values
initial.fitted
the starting values of the algorithm
beta
estimated coefficients (corresponding to the basis)
B
the basis: fitted linear estimator = B%\*%beta
sD
for internal use
basis
the smooth construct object. For more details, see smooth.construct
converge
If converge=T, the algorithm converged.
w
for internal use
family
the family object
wx
Indicate whether robust weight on covariates is applied
beta.fit
for internal use

References

Raymond K. W. Wong, Fang Yao and Thomas C. M. Lee (2013) Robust Estimation for Generalized Additive Models. Journal of Graphical and Computational Statistics, to appear.

See Also

robustgam.GIC, robustgam.GIC.optim, robustgam.CV, pred.robustgam

Examples

Run this code
# load library
library(robustgam)

# test function
test.fun <- function(x, ...) {
  return(2*sin(2*pi*(1-x)^2))
}

# some setting
set.seed(1234)
true.family <- poisson()
out.prop <- 0.05
n <- 100

# generating dataset for poisson case
x <- runif(n)
x <- x[order(x)]
true.eta <- test.fun(x)
true.mu <- true.family$linkinv(test.fun(x))
y <- rpois(n, true.mu) # for poisson case

# create outlier for poisson case
out.n <- trunc(n*out.prop)
out.list <- sample(1:n, out.n, replace=FALSE)
y[out.list] <- round(y[out.list]*runif(out.n,min=3,max=5)^(sample(c(-1,1),out.n,TRUE)))

# robust GAM fit
robustfit <- robustgam(x, y, family=true.family, p=3, c=1.6, sp=0.000143514, show.msg=FALSE,
 smooth.basis='tp')
## the smoothing parameter is selected by the RBIC, the command is:
# robustfit.gic <- robustgam.GIC.optim(x, y, family=true.family, p=3, c=1.6, show.msg=FALSE,
#   count.lim=400, smooth.basis='tp', lsp.initial=log(1e-2) ,lsp.min=-15, lsp.max=10,
#   gic.constant=log(n), method="L-BFGS-B"); robustfit <- robustfit.gic$optim.fit


# ordinary GAM fit
nonrobustfit <- gam(y~s(x, bs="tp", m=3),family=true.family) # m = p for 'tp'

# prediction
x.new <- seq(range(x)[1], range(x)[2], len=1000)
robustfit.new <- pred.robustgam(robustfit, data.frame(X=x.new))$predict.values
nonrobustfit.new <- as.vector(predict.gam(nonrobustfit,data.frame(x=x.new),type="response"))

# plot
plot(x, y)
lines(x.new, true.family$linkinv(test.fun(x.new)), col="blue")
lines(x.new, robustfit.new, col="red")
lines(x.new, nonrobustfit.new, col="green")
legend(0.6, 23, c("true mu", "robust fit", "nonrobust fit"), col=c("blue","red","green"),
  lty=c(1,1,1))

Run the code above in your browser using DataLab