crossbasis: Generate a Cross-Basis Matrix for a DLNM

Description

The function generates the basis matrices for the two dimensions of predictor and lags, choosing among a set of possible basis functions. Then, these functions are combined in order to create the related cross-basis matrix, which can be included in a model formula to fit a distributed lag non-linear model (DLNM).

Usage

crossbasis(x, lag=c(0,0), argvar=list(), arglag=list(), group=NULL, ...)

## S3 method for class 'crossbasis':
summary(object, ...)

Arguments

x

the predictor variable, defined as a numeric vector representing a complete series of ordered observations.

lag

either an integer scalar or a vector of length 2, defining the maximum lag or the lag range, respectively. If a scalar, the minimum is automatically set to 0.

argvar, arglag

lists of arguments to be passed to the function onebasis for generating the two basis matrices for predictor and lags, respectively. See Details below.

group

a factor defining groups of observations, representing multiple series. Each series must be consecutive, complete and ordered.

object

an object of class "crossbasis".

...

additional arguments. See Details below.
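The rules stated above for lag and group can be illustrated with a minimal base R sketch. The helpers lag_range and groups_consecutive are hypothetical names introduced only for illustration; they are not part of the package and do not reproduce its internal checks.

```r
# Hypothetical helper (not part of dlnm): expand 'lag' into a lag range,
# following the rule that a scalar means c(0, lag).
lag_range <- function(lag) {
  stopifnot(is.numeric(lag), length(lag) %in% 1:2)
  if (length(lag) == 1L) lag <- c(0, lag)
  stopifnot(lag[1] <= lag[2])
  lag
}

# Hypothetical helper: TRUE if every level of 'group' occupies a single
# consecutive block of observations.
groups_consecutive <- function(group) {
  runs <- rle(as.character(group))$values
  !anyDuplicated(runs)
}

lag_range(15)        # 0 15
lag_range(c(1, 3))   # 1 3

groups_consecutive(factor(c("a", "a", "b", "b")))  # TRUE
groups_consecutive(factor(c("a", "b", "a", "b")))  # FALSE
```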

Value

A matrix object of class "crossbasis" which can be included in a model formula in order to fit a DLNM. It contains the attributes range (range of the original vector of observations), lag (lag range), argvar and arglag (lists of arguments defining the basis functions in each space, which may differ from the arguments supplied in the call). The function summary.crossbasis returns a summary of the cross-basis matrix and the related attributes, and can be used to check the options for the basis functions chosen for the two dimensions.

Warnings

Meaningless combinations of arguments (for example the inclusion of knots lying outside the range for type equal to "strata" or thr-type) could lead to collinear variables, with identifiability problems in the model and the exclusion of some of them. It is strongly recommended to avoid including an intercept in the basis for x (int in argvar should be FALSE, the default); otherwise a rank-deficient cross-basis matrix will be specified, causing some of the cross-variables to be excluded from the regression model. Conversely, an intercept is included by default in the basis for the space of lags.
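The rank deficiency caused by an intercept in the basis for x can be reproduced with a small base R sketch. The helper cross_by_hand is hypothetical and is not the package's implementation: it assumes the cross-basis is formed by summing lagged copies of each predictor basis column, weighted by each lag basis column. The polynomial bases and the simulated series are likewise chosen only for illustration.

```r
# Hypothetical helper (not the dlnm implementation): build a cross-basis
# from a predictor basis 'bvar' (one row per observation) and a lag basis
# 'blag' (one row per lag 0..maxlag).
cross_by_hand <- function(bvar, blag) {
  n <- nrow(bvar)
  maxlag <- nrow(blag) - 1
  out <- matrix(NA_real_, n, ncol(bvar) * ncol(blag))
  for (v in seq_len(ncol(bvar))) {
    # n x (maxlag + 1) matrix of lagged copies of column v (NA where unavailable)
    lagged <- sapply(0:maxlag, function(l) c(rep(NA_real_, l), bvar[seq_len(n - l), v]))
    for (l in seq_len(ncol(blag))) {
      out[, ncol(blag) * (v - 1) + l] <- lagged %*% blag[, l]
    }
  }
  out
}

set.seed(1)
x <- rnorm(200)                    # simulated predictor series
lags <- 0:5
blag <- cbind(1, lags, lags^2)     # lag basis with an intercept

cb_ok  <- cross_by_hand(cbind(x, x^2), blag)      # no intercept for x
cb_bad <- cross_by_hand(cbind(1, x, x^2), blag)   # intercept for x

# Rank computed on the complete rows only (the first rows contain NA)
rank_of <- function(m) qr(m[complete.cases(m), , drop = FALSE])$rank

c(columns = ncol(cb_ok),  rank = rank_of(cb_ok))   # full column rank
c(columns = ncol(cb_bad), rank = rank_of(cb_bad))  # rank below the number of columns
```

With the intercept in the basis for x, every lag basis column produces a constant cross-variable, so these columns are collinear with each other and with the model intercept.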

Details

Until version 1.5.0, the function had a completely different usage, with different arguments. Compatibility with old code is retained through the additional arguments passed via .... Users are, however, advised to adopt the current usage.

The arguments in argvar and arglag (optionally including type, df, degree, knots, bound, int, cen) define two sets of basis functions, one for each dimension. The function onebasis is called internally to build the related basis matrices. The argvar list is applied to x, in order to generate the matrix for the space of the predictor. The arglag list is applied to a new vector given by the sequence of lags defined by lag, in order to generate the matrix for the space of lags. The two sets of basis matrices are then combined to create the related cross-basis matrix. See onebasis for additional information on how to specify each basis.

Results from a DLNM are interpreted relative to a reference value of the predictor, determined automatically or through a centering point. See onebasis for further details.

The basis functions for lags are defined with different default arguments than in onebasis: specifically, the knots are placed at equally spaced values on the log scale, an intercept is always included (see Warnings), and the basis is never centered. Some arguments can be changed automatically if they form combinations that are not sensible, or set to NULL if not required. Use summary.crossbasis to check the result.

The argument group defines groups of observations representing independent series. Each series must be consecutive, complete and ordered. crossbasis is run on each of them applying the same cross-basis functions: default choices (knots position, range, etc.) are taken considering the pooled distribution.

For a detailed illustration of the use of the function, see: vignette("dlnmOverview")
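The combination of the two basis matrices can be sketched by hand, as below. This is an illustration under stated assumptions rather than the package's code: the cross-basis is assumed to be obtained by summing lagged copies of each predictor basis column, weighted by each lag basis column. The simulated series, the 3 df natural spline and the quadratic lag basis are arbitrary choices, and the log-scale knot placement used by default for lags is not reproduced.

```r
library(splines)  # ships with R; used here for the predictor basis

set.seed(42)
x <- rnorm(100, mean = 20, sd = 5)   # illustrative predictor series
maxlag <- 3

# Step 1: basis for the space of the predictor
# (3 df natural cubic spline, no intercept)
bvar <- ns(x, df = 3)

# Step 2: basis for the space of lags, evaluated on the sequence 0..maxlag
# (quadratic polynomial with an intercept)
lags <- 0:maxlag
blag <- cbind(1, lags, lags^2)

# Step 3: for each pair of columns, sum the lagged predictor basis over
# the lags, weighted by the lag basis
n <- length(x)
cb <- matrix(NA_real_, n, ncol(bvar) * ncol(blag))
for (v in seq_len(ncol(bvar))) {
  lagged <- sapply(lags, function(l) c(rep(NA_real_, l), bvar[seq_len(n - l), v]))
  for (l in seq_len(ncol(blag))) {
    cb[, ncol(blag) * (v - 1) + l] <- lagged %*% blag[, l]
  }
}

dim(cb)              # 100 rows, 3 * 3 = 9 columns
sum(is.na(cb[, 1]))  # the first 'maxlag' rows are incomplete
```

The number of cross-variables is the product of the numbers of columns in the two bases, and the first maxlag observations are missing because their lagged values are not available.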

References

Gasparrini A. Distributed lag linear and non-linear models in R: the package dlnm. Journal of Statistical Software. 2011; 43(8):1-20. [freely available at http://www.ag-myresearch.com/jss2011]

Gasparrini A., Armstrong B., Kenward M. G. Distributed lag non-linear models. Statistics in Medicine. 2010; 29(21):2224-2234. [freely available at http://www.ag-myresearch.com/statmed2010]

Examples

### simple DLM
### space of predictor: linear relationship for PM10
### space of predictor: 5df natural cubic spline for temperature
### lag function: 4th degree polynomial for PM10 up to lag15
### lag function: strata intervals at lag 0 and 1-3 for temperature

# LOAD THE PACKAGE (PROVIDES crossbasis AND THE chicagoNMMAPS DATA)
library(dlnm)

# CREATE THE CROSS-BASIS FOR EACH PREDICTOR AND CHECK WITH SUMMARY
cb1.pm <- crossbasis(chicagoNMMAPS$pm10, lag=15, argvar=list(type="lin",cen=0),
  arglag=list(type="poly",degree=4))
cb1.temp <- crossbasis(chicagoNMMAPS$temp, lag=3, argvar=list(df=5,cen=21),
  arglag=list(type="strata",knots=1))
summary(cb1.pm)
summary(cb1.temp)

# RUN THE MODEL AND GET THE PREDICTION FOR PM10
library(splines)
model1 <- glm(death ~ cb1.pm + cb1.temp + ns(time, 7*14) + dow,
	family=quasipoisson(), chicagoNMMAPS)
pred1.pm <- crosspred(cb1.pm, model1, at=0:20, bylag=0.2, cumul=TRUE)

# PLOT THE LINEAR ASSOCIATION OF PM10 ALONG LAGS
plot(pred1.pm, "slices", var=10, col=3, ylab="RR", ci.arg=list(density=15,lwd=2),
	main="Association with a 10-unit increase in PM10")
plot(pred1.pm, "slices", var=10, cumul=TRUE, ylab="Cumulative RR",
	main="Cumulative association with a 10-unit increase in PM10")

# GET THE FIGURES FOR THE OVERALL CUMULATIVE ASSOCIATION, WITH CI
pred1.pm$allRRfit["10"]
cbind(pred1.pm$allRRlow, pred1.pm$allRRhigh)["10",]

### See the vignette 'dlnmOverview' for a detailed explanation of this example
