The case-crossover method compares “case” days when events occurred (e.g., deaths) with control days to look for differences in exposure that might explain differences in the number of cases. Control days are selected to be close to case days, which means that only recent changes in the independent variable(s) are compared. Comparing only nearby values can remove long-term and seasonal variation in the dependent and independent variable(s). How completely this variation is removed depends on the definition of nearby and on the seasonal and long-term patterns in the independent variable(s).
Control and case days are only compared if they are in the same stratum. The length of each stratum is set by stratalength; the default is 28 days, so that cases and controls are compared within four-week sections.
Shorter strata give closer control for season, but reduce the number of available controls.
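As a rough illustration of how fixed-length strata divide a daily series, the sketch below numbers the days from the start of the series and cuts them into consecutive blocks. The variable names and the choice to start counting at the first date are assumptions made for this example, not necessarily how the function itself forms its strata.

```r
# Hypothetical daily series of 84 days; names are illustrative only.
dates <- seq(as.Date("2020-01-01"), by = "day", length.out = 84)
stratalength <- 28

# Number each day from 0 (assumed origin: the first date in the series),
# then cut the series into consecutive blocks of 'stratalength' days.
day_index <- as.numeric(dates - min(dates))
stratum <- day_index %/% stratalength + 1

# 84 days with a 28-day stratum length give 3 strata of 28 days each.
strata_sizes <- table(stratum)
print(strata_sizes)
```

Halving stratalength to 14 in this sketch would double the number of strata and halve the pool of candidate control days for each case day.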
Control days that are close to the case day may have similar levels of the independent variable(s). To reduce this correlation it is possible to place an exclusion period around each case day.
The default exclusion is 2 days, which means that the smallest gap between a case and control will be 3 days.
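The arithmetic behind the 3-day minimum gap can be checked with a small sketch. It assumes a single 28-day stratum and a hypothetical variable named exclusion; days within that many days of the case day (and the case day itself) are dropped from the candidate controls.

```r
# One stratum of 28 days, indexed 0..27; the case falls on day 10.
# 'exclusion' is a hypothetical name used for this illustration.
day_index <- 0:27
case_day <- 10
exclusion <- 2

# Keep only days further from the case day than the exclusion period.
gap <- abs(day_index - case_day)
controls <- day_index[gap > exclusion]

smallest_gap <- min(abs(controls - case_day))  # 3
n_controls <- length(controls)                 # 28 - 5 excluded days = 23
print(c(smallest_gap = smallest_gap, n_controls = n_controls))
```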
To remove any confounding by day of the week it is possible to additionally match by day of the week (matchdow), although this usually reduces the number of available controls. This matching is in addition to the strata matching.
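The cost of day-of-the-week matching in terms of available controls can be seen in a minimal sketch. It assumes one 28-day stratum, an exclusion period of 2 days, and illustrative variable names; it is not the function's own code.

```r
# One 28-day stratum starting on Wednesday 2020-01-01 (illustrative data).
dates <- seq(as.Date("2020-01-01"), by = "day", length.out = 28)
case_date <- as.Date("2020-01-15")
exclusion <- 2

# Candidate controls: same stratum, outside the exclusion period.
gap <- abs(as.numeric(dates - case_date))
eligible <- dates[gap > exclusion]

# Additionally require the same weekday as the case day
# (wday runs from 0 = Sunday to 6 = Saturday).
same_dow <- eligible[as.POSIXlt(eligible)$wday == as.POSIXlt(case_date)$wday]

n_eligible <- length(eligible)  # 23 without weekday matching
n_same_dow <- length(same_dow)  # 3 with weekday matching
print(same_dow)
```

In this example weekday matching cuts the controls for one case day from 23 to 3, which is why it is offered as an option rather than applied by default.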
It is possible to additionally match case and control days by an important confounder (matchconf) in order to remove its effect. Control days are matched to case days if they: i) are in the same stratum, ii) have the same day of the week if matchdow=TRUE, iii) have a value of matchconf that is within plus/minus confrange of the value of matchconf on the case day. If the range is set too narrow then the number of available controls will become too small, which in turn reduces the number of case days with at least one control day.
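The effect of the confounder range on the number of controls can be sketched as follows. The data, the temp column standing in for the matched confounder, and the helper count_controls are all hypothetical, and weekday matching is left out to keep the example short.

```r
# Ten days from one stratum with a made-up confounder (temperature).
stratum_days <- data.frame(
  day  = 1:10,
  temp = c(20, 21, 25, 19, 22, 30, 20.5, 18, 21.5, 23)
)
case_row <- 5   # case day is day 5, temp = 22
exclusion <- 2

# Hypothetical helper: count control days that are outside the exclusion
# period and whose confounder value is within +/- confrange of the case day.
count_controls <- function(confrange) {
  gap <- abs(stratum_days$day - stratum_days$day[case_row])
  conf_diff <- abs(stratum_days$temp - stratum_days$temp[case_row])
  sum(gap > exclusion & conf_diff <= confrange)
}

wide <- count_controls(1.5)     # days 2, 9 and 10 qualify
narrow <- count_controls(0.25)  # no day qualifies
print(c(wide = wide, narrow = narrow))
```

With the narrow range this case day has no controls at all, so it would contribute nothing to the comparison.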
The method uses conditional logistic regression (see coxph), and so the exponentiated parameter estimates are odds ratios.
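To show how an odds ratio comes out of a conditional logistic fit on matched sets, here is a minimal sketch on simulated data. It uses clogit from the survival package, which fits the model through coxph; this is a stand-in for illustration and may differ from the call the function makes internally. The data, the set and exposure columns, and the true effect of 0.5 are all assumptions.

```r
library(survival)

set.seed(1)
n_sets <- 200   # each set = one case day plus its control days
per_set <- 4

# Simulated matched sets with a continuous exposure (illustrative only).
d <- data.frame(
  set = rep(seq_len(n_sets), each = per_set),
  exposure = rnorm(n_sets * per_set)
)

# Within each set, pick the case day with probability proportional to
# exp(0.5 * exposure), i.e. a true log odds ratio of 0.5 per unit exposure.
d$case <- 0
for (s in seq_len(n_sets)) {
  rows <- which(d$set == s)
  pick <- sample(rows, 1, prob = exp(0.5 * d$exposure[rows]))
  d$case[pick] <- 1
}

# Conditional logistic regression, conditioning on the matched set.
fit <- clogit(case ~ exposure + strata(set), data = d)

log_or <- unname(coef(fit)["exposure"])  # coefficient on the log odds scale
odds_ratio <- exp(log_or)                # odds ratio per unit of exposure
print(c(log_or = log_or, odds_ratio = odds_ratio))
```

The raw coefficient is a log odds ratio, so it needs exponentiating before being read as an odds ratio.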
The code assumes that the data frame contains a date variable (in Date format) called ‘date’.
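If dates arrive as text, they can be converted before use. The sketch below assumes a hypothetical data frame with ISO-formatted date strings and a made-up deaths column, and checks that the column has the expected name and class.

```r
# Hypothetical input where dates were read in as character strings.
df <- data.frame(
  date = c("2020-01-01", "2020-01-02", "2020-01-03"),
  deaths = c(3, 5, 4),
  stringsAsFactors = FALSE
)

# Convert to Date class; the format string is an assumption about the input.
df$date <- as.Date(df$date, format = "%Y-%m-%d")

# Check the two requirements: a column called 'date', stored as Date.
has_date_column <- "date" %in% names(df)
is_date_class <- inherits(df$date, "Date")
print(c(has_date_column = has_date_column, is_date_class = is_date_class))
```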