Alternative to likelihood ratio tests in normal / Gaussian error models.
calculate_f_test(deviances, dfs_resid, n_obs, d1 = NULL)
A list with three entries giving the test statistic, the p-value and the
deviance difference for the F-test comparing deviances[1] to deviances[2].
statistic: test statistic.
pvalue: p-value.
dev_diff: difference in deviances tested.
deviances: a numeric vector of length 2 with deviances. Typically
ordered by increasing model complexity (i.e. null model first, then full model) and
used to test the difference deviances[1] - deviances[2].
dfs_resid: a numeric vector with the residual degrees of freedom of the two models, in the same order as deviances.
n_obs: a numeric value with the number of observations.
d1: a numeric value giving \(d_1\) in the formula below directly as
the number of additional degrees of freedom in model 2 compared to model 1.
In this case dfs_resid must be a single numeric value giving the residual
df for model 2. This interface is sometimes more convenient than specifying
both residual dfs.
Uses the formula on page 23 of https://www.stata.com/manuals/rfp.pdf:
$$F = \frac{d_2}{d_1} \left(\exp\left(\frac{D_1 - D_2}{n}\right) - 1\right),$$
where \(D_1\) and \(D_2\) are the deviances of models 1 and 2, and \(n\) is the number of observations.
\(d_1\) is the number of additional parameters used in model 2 as
compared to model 1, i.e. dfs_resid[1] - dfs_resid[2].
\(d_2\) is the number of residual degrees of freedom minus the number of
estimated powers for model 2, i.e. dfs_resid[2].
The p-value then results from an F-distribution with
\((d_1, d_2)\) degrees of freedom.
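The computation described above can be sketched in a few lines of R. This is a minimal illustration under stated assumptions, not the package's own implementation: the function name f_test_sketch is hypothetical, the deviance difference is taken as deviances[1] - deviances[2] as in the argument description, and the input numbers are made up for illustration.

```r
# Hypothetical helper illustrating the formula; not the package's own code.
f_test_sketch <- function(deviances, dfs_resid, n_obs, d1 = NULL) {
  # Difference in deviances: model 1 (reduced) minus model 2 (full).
  dev_diff <- deviances[1] - deviances[2]

  if (is.null(d1)) {
    # Both residual dfs given: derive d1 and d2 from them.
    d1 <- dfs_resid[1] - dfs_resid[2]
    d2 <- dfs_resid[2]
  } else {
    # d1 given directly: dfs_resid is the residual df of model 2.
    d2 <- dfs_resid
  }

  statistic <- (d2 / d1) * (exp(dev_diff / n_obs) - 1)
  # Upper tail of the F-distribution with (d1, d2) degrees of freedom.
  pvalue <- stats::pf(statistic, df1 = d1, df2 = d2, lower.tail = FALSE)

  list(statistic = statistic, pvalue = pvalue, dev_diff = dev_diff)
}

# Made-up inputs: 100 observations, model 2 uses 2 more parameters.
res_a <- f_test_sketch(c(520, 500), dfs_resid = c(98, 96), n_obs = 100)
res_b <- f_test_sketch(c(520, 500), dfs_resid = 96, n_obs = 100, d1 = 2)
```

Both calls describe the same comparison, once through the two residual dfs and once through d1 plus the residual df of model 2, so they should return identical lists.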
Note that this computation is equivalent to the computation of an F-test using sums of squared errors as in e.g. Kutner et al. (2004), p. 263. The formula there is given as $$F = \frac{SSE(R) - SSE(F)}{df_R - df_F} / \frac{SSE(F)}{df_F},$$ where the \(df\) terms refer to residual degrees of freedom, and \(R\) and \(F\) to the reduced model (model 1) and the full model (model 2), respectively.
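The equivalence can be checked numerically. The sketch below assumes that the deviance of a Gaussian linear model is taken as minus twice its maximized log-likelihood, in which case the deviance difference between the reduced and the full model equals \(n \log(SSE(R) / SSE(F))\) and the exponential term reduces to a ratio of sums of squared errors. The dataset (mtcars) and the two models are arbitrary choices for illustration, and all variable names are hypothetical.

```r
# Nested Gaussian linear models on a built-in dataset.
fit_r <- lm(mpg ~ wt, data = mtcars)              # reduced model (model 1)
fit_f <- lm(mpg ~ wt + hp + qsec, data = mtcars)  # full model (model 2)
n <- nrow(mtcars)

# Deviance taken as -2 * maximized log-likelihood (assumption, see above).
D1 <- -2 * as.numeric(logLik(fit_r))
D2 <- -2 * as.numeric(logLik(fit_f))

# Degrees of freedom as defined for the deviance-based formula.
d1 <- df.residual(fit_r) - df.residual(fit_f)
d2 <- df.residual(fit_f)
f_dev <- (d2 / d1) * (exp((D1 - D2) / n) - 1)

# Classical F-test from sums of squared errors.
sse_r <- sum(residuals(fit_r)^2)
sse_f <- sum(residuals(fit_f)^2)
f_sse <- ((sse_r - sse_f) / d1) / (sse_f / d2)

# F statistic reported by anova() for the same nested comparison.
f_anova <- anova(fit_r, fit_f)$F[2]

all.equal(f_dev, f_sse)
all.equal(f_sse, f_anova)
```

Under the stated assumption, the three statistics should agree up to floating point error.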
Kutner, M.H., et al., 2004. Applied linear statistical models. McGraw-Hill Irwin.