Given a generalized linear model, the Davies test can be employed to test for a non-constant regression parameter in the linear predictor.
davies.test(obj, seg.Z, k = 10, alternative = c("two.sided", "less", "greater"),
            type = c("lrt", "wald"), values = NULL, dispersion = NULL)
A list with class 'htest' containing the following components:

method: title (character)
data.name: the regression model and the segmented variable being tested
statistic: the point within the range of the covariate in seg.Z at which the maximum (or the minimum if alternative="less") occurs
parameter: number of evaluation points
p.value: the adjusted p-value
process: a two-column matrix including the evaluation points and corresponding values of the test statistic
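For instance, a minimal sketch of inspecting these components, assuming a linear model fit o and a segmented variable z as in the Examples section below:

dt <- davies.test(o, ~z)
dt$statistic      # 'best' evaluation point within the range of z
dt$parameter      # number of evaluation points (k)
dt$p.value        # adjusted p-value
head(dt$process)  # evaluation points and corresponding test-statistic values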
obj: a fitted model typically returned by glm or lm. Even an object returned by segmented can be set (e.g. if interest lies in testing for an additional breakpoint).
seg.Z: a formula with no response variable, such as seg.Z=~x1, indicating the (continuous) segmented variable being tested. Only a single variable may be tested and an error is printed when seg.Z includes two or more terms. seg.Z can be omitted if (i) obj is a segmented fit with a single segmented covariate (that variable is taken), or (ii) obj is a "lm" or "glm" fit with a single covariate (that variable is taken).
k: number of points where the test should be evaluated. See Details.
alternative: a character string specifying the alternative hypothesis (relevant to the slope-difference parameter).
type: the test statistic to be used (only relevant for GLM fits; defaults to "lrt"). Ignored if obj is a simple linear model.
values: optional. The evaluation points where the Davies approximation is computed. See Details for the default values.
dispersion: the dispersion parameter for the family, used to compute the test statistic. When NULL (the default), it is inferred from obj: it is taken as 1 for the Binomial and Poisson families, and otherwise estimated by the residual Chi-squared statistic (calculated from cases with non-zero weights) divided by the residual degrees of freedom.
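As an illustration, a hedged sketch of a call with the arguments spelled out (the fit o and variables y and z are assumed to come from the Examples section below; the values shown are the defaults):

davies.test(o, seg.Z = ~z, k = 10,
            alternative = "two.sided",
            type = "lrt",        # only relevant for GLM fits
            values = NULL,       # evaluation points chosen automatically
            dispersion = NULL)   # inferred from 'o'
davies.test(lm(y ~ z))           # single-covariate fit: seg.Z can be omitted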
Vito M.R. Muggeo
The Davies test is not aimed at estimating the breakpoint. It is based on k evaluation points, thus the value returned in the statistic component (and printed as "'best' at") is the best among the k points, and it will typically differ from the maximum likelihood estimate returned by segmented. Use segmented if you are interested in the point estimate.
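A minimal sketch of this distinction, assuming the lm fit o and variable z from the Examples below (psi = 0.5 is just an illustrative starting value for segmented):

dt <- davies.test(o, ~z)
os <- segmented(o, seg.Z = ~z, psi = 0.5)  # ML estimation of the breakpoint
dt$statistic   # 'best' point among the k evaluation points
os$psi         # estimated breakpoint (initial value, estimate, st.err)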
To test for a breakpoint in linear models with small samples, it is suggested to use davies.test() with objects of class "lm". If obj is a "glm" object with gaussian family, davies.test() will use an approximate test, resulting in smaller p-values when the sample is small. However, if the sample size is large (n>300), the exact Davies (2002) upper bound cannot be computed (as it relies on the gamma() function) and the approximate upper bound of Davies (1987) is returned.
davies.test tests for a non-zero difference-in-slope parameter of a segmented relationship. Namely, the null hypothesis is \(H_0:\beta=0\), where \(\beta\) is the difference-in-slopes, i.e. the coefficient of the segmented function \(\beta(x-\psi)_+\). The hypothesis of interest \(\beta=0\) means no breakpoint.
Roughly speaking, the procedure computes k `naive' test statistics for the difference-in-slope (i.e. assuming the breakpoint fixed and known), seeks the `best' value and the corresponding naive p-value (according to the alternative hypothesis), and then corrects the selected (minimum) p-value by means of the k values of the test statistic.
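As an illustration (not the internal code of davies.test), a single `naive' test with the breakpoint held fixed at an arbitrary point psi0 amounts to testing the coefficient of (z - psi0)_+ in an ordinary fit; the procedure above repeats this at k points and then adjusts the smallest p-value. The variables y, z and x are assumed to come from the Examples below:

psi0 <- 0.5
fit0 <- lm(y ~ z + x + pmax(z - psi0, 0))  # (z - psi0)_+ enters as an extra covariate
tail(coef(summary(fit0)), 1)               # naive Wald test for the difference-in-slope at psi0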
If obj is an LM, the Davies (2002) test is implemented; this approach works even for small samples. If obj represents a GLM fit, the relevant methods are described in Davies (1987), and either the Wald or the likelihood ratio test statistic can be used, see argument type. This is an asymptotic test.
The k evaluation points are k equally spaced values between the second and the second-last values of the variable reported in seg.Z. k should not be small; I find no important difference for k larger than 10, so the default is k=10.
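A sketch consistent with this description, assuming o and z from the Examples below (whether it reproduces the internal default exactly is an assumption):

k   <- 10
zs  <- sort(z)
pts <- seq(zs[2], zs[length(zs) - 1], length.out = k)  # second to second-last ordered values
davies.test(o, ~z, values = pts)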
Davies, R.B. (1987) Hypothesis testing when a nuisance parameter is present only under the alternative. Biometrika 74, 33--43.
Davies, R.B. (2002) Hypothesis testing when a nuisance parameter is present only under the alternative: linear model case. Biometrika 89, 484--489.
See also pscore.test, which is more powerful, especially when the signal-to-noise ratio is low.
if (FALSE) {
library(segmented)
set.seed(20)
z <- runif(100)
x <- rnorm(100, 2)
y <- 2 + 10 * pmax(z - .5, 0) + rnorm(100, 0, 3)  # segmented relationship in z (breakpoint at 0.5)
o <- lm(y ~ z + x)
davies.test(o, ~z)  # tests the difference-in-slope for z
davies.test(o, ~x)  # x has no breakpoint
o <- glm(y ~ z + x)  # gaussian GLM fit
davies.test(o, ~z)  # it works, but the p-value is too small..
}
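For comparison with pscore.test mentioned above, a hedged sketch on the same data (assuming the objects from the examples have been created):

o <- lm(y ~ z + x)
davies.test(o, ~z)
pscore.test(o, ~z)   # score-based test; typically more powerful with a low signal-to-noise ratio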