calculate_f_test: Function to compute F-statistic and p-value from deviances

Description

Alternative to likelihood ratio tests in normal / Gaussian error models.

Usage

calculate_f_test(deviances, dfs_resid, n_obs, d1 = NULL)

Value

A list with three entries giving the test statistic, the p-value, and the tested deviance difference for the F-test comparing deviances[1] to deviances[2].

statistic: test statistic.
pvalue: p-value.
dev_diff: difference in deviances tested.

Arguments

deviances: a numeric vector of length 2 with deviances. Typically ordered by increasing model complexity (i.e. null model first, then full model) and used to test the difference deviances[1] - deviances[2].
dfs_resid: a numeric vector with the residual degrees of freedom of the two models, in the same order as deviances, or a single value if d1 is given.
n_obs: a numeric value with the number of observations.
d1: a numeric value giving d1 in the formula below directly, i.e. the number of additional degrees of freedom in model 2 compared to model 1. In this case dfs_resid must be a single numeric value giving the residual df for model 2. This interface is sometimes more convenient than specifying both residual dfs.

Details

Uses the formula on page 23 of the Stata manual (https://www.stata.com/manuals/rfp.pdf): $$F = \frac{d_2}{d_1} \left(\exp\left(\frac{D_1 - D_2}{n}\right) - 1\right),$$ where $D_1$ and $D_2$ are the deviances of models 1 and 2, and $n$ is the number of observations. $d_1$ is the number of additional parameters used in model 2 as compared to model 1, i.e. dfs_resid[1] - dfs_resid[2]. $d_2$ is the number of residual degrees of freedom minus the number of estimated powers for model 2, i.e. dfs_resid[2]. The p-value is then obtained from an F-distribution with $(d_1, d_2)$ degrees of freedom.
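
As a minimal sketch of this computation, the R code below evaluates the statistic and p-value from two deviances, using the difference deviances[1] - deviances[2] described under Value and Arguments. The function name `f_test_sketch` and the example numbers are hypothetical and chosen only for illustration; this is not necessarily how `calculate_f_test` is implemented internally.

```r
# Hypothetical re-implementation for illustration only.
f_test_sketch <- function(deviances, dfs_resid, n_obs, d1 = NULL) {
  # Difference tested: deviance of model 1 minus deviance of model 2.
  dev_diff <- deviances[1] - deviances[2]
  if (is.null(d1)) {
    # Both residual dfs supplied: derive d1 and d2 from them.
    d1 <- dfs_resid[1] - dfs_resid[2]
    d2 <- dfs_resid[2]
  } else {
    # d1 supplied directly: dfs_resid is the residual df of model 2.
    d2 <- dfs_resid
  }
  statistic <- d2 / d1 * (exp(dev_diff / n_obs) - 1)
  # Upper tail of the F-distribution with (d1, d2) degrees of freedom.
  pvalue <- stats::pf(statistic, df1 = d1, df2 = d2, lower.tail = FALSE)
  list(statistic = statistic, pvalue = pvalue, dev_diff = dev_diff)
}

# Made-up numbers: 100 observations, model 2 uses 2 more parameters.
res_a <- f_test_sketch(c(310, 300), dfs_resid = c(98, 96), n_obs = 100)
res_b <- f_test_sketch(c(310, 300), dfs_resid = 96, n_obs = 100, d1 = 2)
```

Both calls describe the same comparison, once through the two residual dfs and once through d1, so they should return the same list.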

Note that this computation is equivalent to the computation of an F-test using sums of squared errors, as in e.g. Kutner et al. (2004), p. 263. The formula there is given as $$F = \frac{SSE(R) - SSE(F)}{df_R - df_F} / \frac{SSE(F)}{df_F},$$ where the $df$ terms refer to residual degrees of freedom, and $R$ and $F$ to the reduced (model 1) and full model (model 2), respectively.
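
The equivalence can be checked numerically. Assuming the deviance is minus twice the maximized Gaussian log-likelihood, it equals $n \log(2 \pi \, SSE / n) + n$, so the exponential of the scaled deviance difference reduces to the ratio $SSE(R) / SSE(F)$. The sketch below uses simulated data and compares both forms with the result of `anova()`; the data and variable names are illustrative assumptions. It uses `-2 * logLik()` as the deviance, because `deviance()` applied to an `lm` fit returns the residual sum of squares instead.

```r
set.seed(1)
n <- 50
x1 <- rnorm(n)
x2 <- rnorm(n)
y <- 1 + 0.5 * x1 + 0.3 * x2 + rnorm(n)

fit_r <- lm(y ~ x1)        # reduced model (model 1)
fit_f <- lm(y ~ x1 + x2)   # full model (model 2)

# Deviance taken as -2 * maximized log-likelihood (an assumption of this sketch).
dev_r <- -2 * as.numeric(logLik(fit_r))
dev_f <- -2 * as.numeric(logLik(fit_f))

d1 <- df.residual(fit_r) - df.residual(fit_f)
d2 <- df.residual(fit_f)

# Deviance-based form of the F-statistic.
f_dev <- d2 / d1 * (exp((dev_r - dev_f) / n) - 1)

# SSE-based form of the F-statistic.
sse_r <- sum(residuals(fit_r)^2)
sse_f <- sum(residuals(fit_f)^2)
f_sse <- ((sse_r - sse_f) / d1) / (sse_f / d2)

# Reference value from R's built-in comparison of nested linear models.
f_anova <- anova(fit_r, fit_f)$F[2]

all.equal(f_dev, f_sse)
all.equal(f_dev, f_anova)
```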

References

Kutner, M.H., et al., 2004. Applied linear statistical models. McGraw-Hill Irwin.