gam.lo: Specify a loess fit in a GAM formula

Description

A symbolic wrapper to indicate a smooth term in a formula argument to gam.

Usage

gam.lo(
  x,
  y,
  w = rep(1, length(y)),
  span = 0.5,
  degree = 1,
  ncols = p,
  xeval = x
)
lo(..., span = 0.5, degree = 1)

Value

lo returns a numeric matrix. The simplest case is when there is a single argument to lo and degree=1; a one-column matrix is returned, consisting of a normalized version of the vector. If degree=2 in this case, a two-column matrix is returned, consisting of a degree-2 polynomial basis. Similarly, if there are two arguments, or the single argument is a two-column matrix, the result is either a two-column matrix if degree=1, or a five-column matrix consisting of powers and products up to degree 2 if degree=2. Arguments of any dimension are allowed, but in practice one or two vectors are typically used.
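The column counts described above can be checked by calling lo directly, outside a formula. This is a minimal sketch that assumes the gam package is installed; the variables x and z are made up for illustration.

```r
library(gam)  # assumes the gam package is installed

set.seed(1)
x <- rnorm(50)  # hypothetical smoothing variables
z <- rnorm(50)

# Number of basis columns for each combination of arguments and degree
n1 <- ncol(lo(x))                 # one argument, degree = 1
n2 <- ncol(lo(x, degree = 2))     # one argument, degree = 2
n3 <- ncol(lo(x, z))              # two arguments, degree = 1
n4 <- ncol(lo(x, z, degree = 2))  # two arguments, degree = 2

print(c(n1, n2, n3, n4))
```

According to the description above, these should come out as 1, 2, 2 and 5 columns respectively.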

The matrix is endowed with a number of attributes; the matrix itself is used in the construction of the model matrix, while the attributes are needed for the backfitting algorithms general.wam (weighted additive model) or lo.wam (currently not implemented). Local-linear curve or surface fits reproduce linear responses, while local-quadratic fits reproduce quadratic curves or surfaces. These parts of the loess fit are computed exactly, together with the other parametric linear parts of the model.

When two or more smoothing variables are given, the user should make sure they are in a commensurable scale; lo() does no normalization. This can make a difference, since lo() uses a spherical (isotropic) neighborhood when establishing the nearest neighbors.
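Since lo() does not rescale its inputs, one way to put two smoothing variables on a comparable scale is to standardize them before fitting. The sketch below assumes the gam package is installed and uses the built-in airquality data; the column names Wind.s and Temp.s are introduced here only for illustration, and standardizing is one possible choice of scaling, not a recommendation from the package.

```r
library(gam)  # assumes the gam package is installed

# Keep complete cases of the variables used in the fit
aq <- na.omit(airquality[, c("Ozone", "Wind", "Temp")])

# Standardize each smoothing variable to mean 0 and standard deviation 1
aq$Wind.s <- as.numeric(scale(aq$Wind))
aq$Temp.s <- as.numeric(scale(aq$Temp))

# Two-dimensional loess term on the rescaled variables
fit <- gam(Ozone ~ lo(Wind.s, Temp.s), data = aq)
print(fit)
```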

Note that lo itself does no smoothing; it simply sets things up for gam. gam.lo does the actual smoothing.

One important attribute is named call. For example, lo(x) has a call component gam.lo(data[["lo(x)"]], z, w, span = 0.5, degree = 1, ncols = 1). This is an expression that gets evaluated repeatedly in general.wam (the backfitting algorithm).
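The stored call can be inspected directly on the matrix returned by lo. A minimal sketch, assuming the gam package is installed; x is a made-up vector, and the exact printed form of the call may differ from the example above.

```r
library(gam)  # assumes the gam package is installed

x <- seq(0, 1, length.out = 25)  # hypothetical smoothing variable
basis <- lo(x, span = 0.3)

# The expression that the backfitting algorithm evaluates for this term
term_call <- attr(basis, "call")
print(term_call)

# Names of all attributes attached to the basis matrix
print(names(attributes(basis)))
```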

gam.lo returns an object with components

residuals: The residuals from the smooth fit. Note that the smoother removes the parametric part of the fit (using a linear fit with the columns in x), so these residuals represent the nonlinear part of the fit.
nl.df: the nonlinear degrees of freedom
var: the pointwise variance for the nonlinear fit

When gam.lo is evaluated with an xeval argument, it returns a matrix of predictions.

Arguments

x: for gam.lo, the appropriate basis of polynomials generated from the arguments to lo. These are also the variables that receive linear coefficients in the GAM fit.
y: a response variable passed to gam.lo during backfitting
w: weights
span: the fraction of observations in a neighborhood. This is the smoothing parameter for a loess fit. If specified, the full argument name span must be written.
degree: the degree of local polynomial to be fit; currently restricted to be 1 or 2. If specified, the full argument name degree must be written.
ncols: for gam.lo, the number of columns in x used as the smoothing inputs to local regression. For example, if degree=2, then x has two columns defining a degree-2 polynomial basis. Both are needed for the parametric part of the fit, but ncols=1 tells the local regression routine that the first column is the actual smoothing variable.
xeval: If this argument is present, then gam.lo produces a prediction at xeval.
...: the unspecified ... can be a comma-separated list of numeric vectors, a numeric matrix, or expressions that evaluate to either of these. If it is a list of vectors, they must all have the same length.

Author

Written by Trevor Hastie, following closely the design in the "Generalized Additive Models" chapter (Hastie, 1992) in Chambers and Hastie (1992).

Details

A smoother in gam separates out the parametric part of the fit from the non-parametric part. For local regression, the parametric part of the fit is specified by the particular polynomial being fit locally. The workhorse function gam.lo fits the local polynomial, then strips off this parametric part. All the parametric pieces from all the terms in the additive model are fit simultaneously in one operation for each loop of the backfitting algorithm.
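The separation into parametric and nonlinear parts can be illustrated with base R alone. The sketch below is not gam's internal code: it uses stats::loess and lm on simulated data to mimic the idea of fitting a local-linear smooth and then stripping off its linear (parametric) part. All variable names are hypothetical.

```r
set.seed(42)
x <- sort(runif(100, 0, 10))
y <- sin(x) + 0.5 * x + rnorm(100, sd = 0.3)  # simulated response

# Local-linear loess smooth of y on x (stats::loess, not gam.lo)
lo_fit <- loess(y ~ x, span = 0.5, degree = 1)
smooth <- fitted(lo_fit)

# Parametric part: projection of the smooth onto an intercept and x
lin_fit <- lm(smooth ~ x)
parametric <- fitted(lin_fit)

# Nonlinear part: what remains after removing the linear component
nonlinear <- smooth - parametric

# By construction the nonlinear part is orthogonal to 1 and x
print(c(sum(nonlinear), sum(nonlinear * x)))
```

In an additive model the linear pieces of every term can then be estimated jointly, while each smoother only has to account for its own nonlinear remainder.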

References

Hastie, T. J. (1992) Generalized additive models. Chapter 7 of Statistical Models in S eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.

Hastie, T. and Tibshirani, R. (1990) Generalized Additive Models. London: Chapman and Hall.

Examples

y ~ Age + lo(Start)
     # fit Start using a loess smooth with a (default) span of 0.5.
y ~ lo(Age) + lo(Start, Number) 
y ~ lo(Age, span=0.3) # the argument name span cannot be abbreviated.
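
The formulas above are fragments and need a call to gam and a data set to run. The following sketch assumes the gam package is installed and that it ships the kyphosis data (with columns Kyphosis, Age, Number and Start), which matches the variable names used in the formulas; the binomial family is an assumption made here because Kyphosis is a two-level factor.

```r
library(gam)  # assumes the gam package is installed
data(kyphosis, package = "gam")  # assumed to be bundled with gam

# Linear term in Age plus a loess smooth of Start (default span of 0.5)
fit1 <- gam(Kyphosis ~ Age + lo(Start), family = binomial, data = kyphosis)

# Narrower neighborhood for Age; span must be written out in full
fit2 <- gam(Kyphosis ~ lo(Age, span = 0.3), family = binomial, data = kyphosis)

print(summary(fit1))
```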
