The package dlnm contains functions to specify and interpret distributed lag linear (DLMs) and non-linear (DLNMs) models. These functions are used to build basis and cross-basis matrices and then to predict and plot the results of a fitted model.
Distributed lag non-linear models (DLNMs) represent a modelling framework to describe simultaneously non-linear and delayed dependencies, termed exposure-lag-response associations. These include models for linear exposure-responses (DLMs) as special cases. The methodology of DLMs and DLNMs was originally developed for time series data and has recently been extended to other study designs and data structures, compatible with cohort, case-control or longitudinal analyses, amongst others. A thorough methodological overview is given in the references and the package vignettes detailed below.
The modelling framework is based on the definition of a cross-basis, a bi-dimensional space of functions specifying the dependency along the space of the predictor and along lags. The cross-basis functions are built combining the basis functions for the two dimensions, produced by applying existing or user-defined functions such as splines, polynomials, linear threshold or indicators.
The application of DLMs and DLNMs requires the availability of predictor values measured at equally-spaced time points. In the original development in time series analysis, these are represented by the ordered series of observations. More generally, the data can be stored in a matrix of exposure histories, where each row represents the lagged values of the predictor for each observation.
The cross-basis matrix of transformed variables is included in the model formula of a regression model to estimate the associated parameters. The estimation can be carried out with the default regression functions, such as lm, glm, gam (package mgcv), clogit and coxph (package survival), lme (package nlme), lmer and glmer (package lme4). Estimates are then extracted to obtain predictions and graphical representations which facilitate the interpretation of the results.
In the standard usage, crossbasis creates two sets of basis functions from a time series vector or a matrix of exposure histories to define the relationship in the two dimensions of predictor and lags. This step is performed through a call to the function onebasis, which in turn internally calls existing or user-defined functions and produces a basis matrix of class "onebasis" with specific attributes. Standard choices for the functions in the two dimensions are ns or bs from package splines, or the internal functions poly, strata, thr, integer and lin in dlnm. Other existing or user-defined functions can also be chosen. The functions equalknots and logknots can be used for knot placement. The two basis matrices are then combined in a matrix object of class "crossbasis", containing the transformed variables to be included in the model formula.
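The two steps described above (building the cross-basis, then including it in a regression formula) can be sketched as follows. This is a minimal illustration rather than a recommended specification: the lag period of 30 days, the degrees of freedom, the knot placement and the time trend are all assumptions made for the example, and the object names cb_temp and model are arbitrary.

```r
library(dlnm)     # also provides the chicagoNMMAPS data set
library(splines)  # ns() for natural cubic splines

# Cross-basis for temperature over lags 0-30 (assumed choices):
# natural cubic splines in both dimensions, with the lag knots
# placed at equally spaced values on the log scale
cb_temp <- crossbasis(chicagoNMMAPS$temp, lag = 30,
  argvar = list(fun = "ns", df = 5),
  arglag = list(fun = "ns", knots = logknots(30, nk = 3)))

# Summary method for "crossbasis" objects
summary(cb_temp)

# Quasi-Poisson time series regression including the cross-basis,
# a smooth function of time (assumed 7 df per year) and day of the week
model <- glm(death ~ cb_temp + ns(time, 7 * 14) + dow,
  family = quasipoisson(), data = chicagoNMMAPS)
```

The cross-basis has one row per observation of the original series, with the first rows set to missing because their lagged values are not available.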
In a more recent development, a penalized version of DLMs and DLNMs can be fitted using two alternative approaches. In the external method, the functions ps or cr are called in crossbasis to derive the spline transformations, and the function cbPen is used to form the list of bi-dimensional penalty matrices. In the internal method, the cross-basis parameterization and matrix penalization are obtained directly using smooth.construct.cb.smooth.spec, a specific smooth constructor of class "cb". This is used within the function s in the model formula. In both cases, the model is fitted using the regression function gam in mgcv.
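A rough sketch of the two penalized approaches is given below. The lag period of 25 days, the basis dimensions, the REML smoothing criterion and the time trend are assumptions chosen for illustration, and the lag matrix Q is built here with a small base R helper (lag_matrix, a hypothetical name that is not part of dlnm).

```r
library(dlnm)
library(mgcv)
library(splines)

# External method: P-spline bases in crossbasis, penalties from cbPen
cb_ps <- crossbasis(chicagoNMMAPS$temp, lag = 25,
  argvar = list(fun = "ps", df = 9), arglag = list(fun = "ps"))
pen <- cbPen(cb_ps)
gam_ext <- gam(death ~ cb_ps + ns(time, 7 * 14) + dow,
  family = quasipoisson(), data = chicagoNMMAPS,
  paraPen = list(cb_ps = pen), method = "REML")

# Internal method: matrix of exposure histories Q and matching lag matrix L
# lag_matrix() is a hypothetical helper written for this example
lag_matrix <- function(x, maxlag) {
  n <- length(x)
  sapply(0:maxlag, function(l) c(rep(NA, l), x[seq_len(n - l)]))
}
Q <- lag_matrix(chicagoNMMAPS$temp, 25)
L <- matrix(0:25, nrow(Q), ncol(Q), byrow = TRUE)
gam_int <- gam(death ~ s(Q, L, bs = "cb", k = 10) + ns(time, 7 * 14) + dow,
  family = quasipoisson(), data = chicagoNMMAPS, method = "REML")
```

In the external method the name used in paraPen must match the name of the cross-basis term in the formula.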
After the model fitting, crosspred generates predictions for a set of suitable values of the original predictor and lag period, and stores them in a "crosspred" object. The function exphist can be used to generate exposure histories for predictions. The fit of a DLM or DLNM can be reduced and re-expressed as the chosen function of one of the two dimensions through the function crossreduce. It returns a "crossreduce" object storing the new parameters and predictions.
Method functions are available for objects of class "onebasis", "crossbasis", "crosspred" and "crossreduce". Specific summary methods summarize the content of each object. The plotting functions plot, lines and points offer a set of choices to plot the results, while coef and vcov return the coefficients and associated (co)variance matrix for an (optionally reduced) model.
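The prediction, reduction and method functions described above can be illustrated with the short sketch below. The model specification and the centering value of 21 degrees are assumptions made for the example only, as are the object names.

```r
library(dlnm)
library(splines)

# Cross-basis and model (assumed specification, for illustration only)
cb_temp <- crossbasis(chicagoNMMAPS$temp, lag = 30,
  argvar = list(fun = "ns", df = 5),
  arglag = list(fun = "ns", knots = logknots(30, nk = 3)))
model <- glm(death ~ cb_temp + ns(time, 7 * 14) + dow,
  family = quasipoisson(), data = chicagoNMMAPS)

# Predictions over a grid of temperature values, centered at 21 (assumed)
pred <- crosspred(cb_temp, model, cen = 21, by = 1)

# Reduce the fit to the overall cumulative exposure-response ...
red_all <- crossreduce(cb_temp, model, cen = 21)
# ... and to the lag-response at a temperature of 30
red_lag <- crossreduce(cb_temp, model, type = "var", value = 30, cen = 21)

# Coefficients and (co)variance matrix of the reduced model
coef(red_all)
vcov(red_all)

# A few of the available plot types
plot(pred, "overall", xlab = "Temperature", ylab = "RR")
plot(pred, "slices", var = 30, xlab = "Lag", ylab = "RR")
plot(pred, "contour", xlab = "Temperature", ylab = "Lag")
plot(red_lag, xlab = "Lag", ylab = "RR")
```

The reduced fit is expressed through as many coefficients as there are basis functions in the retained dimension, which makes it convenient for storing and pooling results.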
The data set chicagoNMMAPS is provided to perform examples of use of dlnm in time series analysis. It includes time series data of daily mortality counts, weather and pollution variables for Chicago in the period 1987-2000. The data sets nested and drug include simulated data to illustrate the extension of dlnm to other study designs, specifically nested case-control studies and randomized controlled trials. The former contains information on 300 risk sets, each with one cancer case and one matched control, and an occupational exposure collected in 5-year periods. The latter contains information on 200 subjects who are randomly allocated a different dose of a drug for two out of four weeks, with their outcome measured after 28 days.
Additional details on the package dlnm are available in the vignettes included in the installation. These documents offer a detailed description of the capabilities of the package, and some examples of application to real data, with an extensive illustration of the use of the functions.
The vignette dlnmOverview offers a general illustration of the DLM/DLNM methodology and the functions included in the package. The vignette dlnmTS illustrates specific examples on the use of the functions for time series analysis. The vignette dlnmExtended provides some examples on the extension of the methodology and package in other study designs and on the use of user-written functions. The vignette dlnmPenalized describes the definition of DLMs and DLNMs through penalized splines.
A vignette is available by typing:
vignette("dlnmOverview")
A list of changes included in the current and previous versions can be found by typing:
news(package="dlnm")
The dlnm package is available on the Comprehensive R Archive Network (CRAN), with info at the related web page (CRAN.R-project.org/package=dlnm). A development website is available on GitHub (github.com/gasparrini/dlnm). General information on the development and applications of the DLM/DLNM modelling framework, together with an updated version of the R scripts for running the examples in published papers, can be found on GitHub (github.com/gasparrini) or at the personal web page of the package maintainer (www.ag-myresearch.com).
Please use citation("dlnm") to cite this package.
Gasparrini A. Distributed lag linear and non-linear models in R: the package dlnm. Journal of Statistical Software. 2011;43(8):1-20.
Gasparrini A, Scheipl F, Armstrong B, Kenward MG. A penalized framework for distributed lag non-linear models. Biometrics. 2017;73(3):938-948.
Gasparrini A. Modelling lagged associations in environmental time series data: a simulation study. Epidemiology. 2016;27(6):835-842.
Gasparrini A. Modeling exposure-lag-response associations with distributed lag non-linear models. Statistics in Medicine. 2014;33(5):881-899.
Gasparrini A, Armstrong B, Kenward MG. Distributed lag non-linear models. Statistics in Medicine. 2010;29(21):2224-2234.
Gasparrini A, Armstrong B, Kenward MG. Reducing and meta-analyzing estimates from distributed lag non-linear models. BMC Medical Research Methodology. 2013;13(1):1.
Armstrong B. Models for the relationship between ambient temperature and daily mortality. Epidemiology. 2006;17(6):624-631.
onebasis to generate simple basis matrices. crossbasis to generate cross-basis matrices. cb smooth constructor for a penalized version. crosspred to obtain predictions after model fitting. crossreduce to reduce the fit to one dimension. The methods plot.crosspred and plot.crossreduce to plot several types of graphs.
Type vignette("dlnmOverview") for a detailed description.