Gives the sample size necessary to demonstrate that the coefficient
for the given predictor in the model
is equal to its given value
rather than equal to zero (or, if OR
is supplied,
the sample size needed to check for such a change in probability).
Also gives the number of events per predictor,
where the number of events is the smaller of the counts of
outcome \(y=0\) and outcome \(y=1\).
For a continuous coefficient, the calculation uses
\(\hat{\beta}\), the estimated coefficient from the model,
together with \(\delta\):
$$\delta = \frac{1 + (1 + \hat{\beta}^2) \exp(1.25 \hat{\beta}^2)}{
1 + \exp(-0.25 \hat{\beta}^2)}$$
and \(P_0\), the probability calculated from the intercept term
\(\beta_0\) of the logistic model
glm(x$y ~ coeff, family=binomial)
as
\(P_0 = \frac{\exp(\beta_0)}{1 + \exp(\beta_0)}\).
For a model with one predictor, the calculation is:
$$n = (1 + 2P_0 \delta)
\frac{\left(z_{1-\alpha} + z_{\texttt{beta}} \exp(0.25 \hat{\beta}^2)\right)^2}{
P_0 \hat{\beta}^2}$$
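The single-predictor calculation can be sketched in R as below.
This is a minimal transcription of the formulas as written here, not the
package's own code: the function name `ss_continuous` and its arguments are
hypothetical, the square is read as applying to the whole bracketed sum,
and `beta` is assumed to be the quantile level used for \(z_{\texttt{beta}}\)
(for example 0.8).

```r
# Hypothetical helper: names and defaults are illustrative assumptions.
# b_hat: estimated coefficient; b0: intercept of the logistic model.
ss_continuous <- function(b_hat, b0, alpha = 0.05, beta = 0.8) {
  # delta, as defined above
  delta <- (1 + (1 + b_hat^2) * exp(1.25 * b_hat^2)) /
    (1 + exp(-0.25 * b_hat^2))
  # P0: inverse logit of the intercept
  P0 <- exp(b0) / (1 + exp(b0))
  # standard normal quantiles
  z_alpha <- qnorm(1 - alpha)
  z_beta <- qnorm(beta)  # assumption: beta is the quantile level
  # sample size, transcribed from the formula as written
  n <- (1 + 2 * P0 * delta) *
    (z_alpha + z_beta * exp(0.25 * b_hat^2))^2 / (P0 * b_hat^2)
  list(delta = delta, P0 = P0, n = n)
}

res <- ss_continuous(b_hat = 0.5, b0 = -1)
```

In practice `b0` and `b_hat` would be taken from `coef()` of the fitted
logistic model rather than typed in by hand.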
For a multivariable model, the value is adjusted by \(R^2\), the squared
correlation of coeff
with the other predictors in the model:
$$n_m = \frac{n}{1 - R^2}$$
For a binomial coefficient, the calculation uses
\(P_0\), the probability given the null hypothesis,
\(P_a\), the probability given the alternative hypothesis,
and the average probability \(\bar{P} = \frac{P_0 + P_a}{2}\).
The calculation is:
$$n = \frac{\left(z_{1-\alpha} \sqrt{2 \bar{P} (1 - \bar{P})} +
z_{\texttt{beta}} \sqrt{P_0(1 - P_0) + P_a(1 - P_a)}\right)^2}{
(P_a - P_0)^2}$$
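A minimal R sketch of this two-probability calculation follows. The
function name `ss_binary` and its arguments are hypothetical, `beta` is
again assumed to be the quantile level for \(z_{\texttt{beta}}\), and the
denominator is taken to be the squared difference \((P_a - P_0)^2\), as in
the usual sample size formula for comparing two proportions.

```r
# Hypothetical helper: names and defaults are illustrative assumptions.
# p0: probability under the null; pa: probability under the alternative.
ss_binary <- function(p0, pa, alpha = 0.05, beta = 0.8) {
  p_bar <- (p0 + pa) / 2           # average probability
  z_alpha <- qnorm(1 - alpha)
  z_beta <- qnorm(beta)            # assumption: beta is the quantile level
  num <- (z_alpha * sqrt(2 * p_bar * (1 - p_bar)) +
            z_beta * sqrt(p0 * (1 - p0) + pa * (1 - pa)))^2
  num / (pa - p0)^2                # squared difference in probabilities
}

n_small_diff <- ss_binary(p0 = 0.2, pa = 0.3)
n_large_diff <- ss_binary(p0 = 0.2, pa = 0.5)
```

A larger difference between the two probabilities should need fewer
observations, so `n_large_diff` comes out smaller than `n_small_diff`.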
An alternative given by Whittemore uses \(\hat{P} = P(x=0)\).
The lead term in the equation below is used to correct for
large values of \(\hat{P}\):
$$n = (1 + 2P_0) \frac{\left(z_{1-\alpha} \sqrt{\frac{1}{1-\hat{P}} + \frac{1}{\hat{P}}} +
z_{\texttt{beta}} \sqrt{\frac{1}{1-\hat{P}} +
\frac{1}{\hat{P} \exp(\hat{\beta})}}\right)^2}{
(P_0 \hat{\beta})^2}$$
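The alternative formula can be transcribed in the same way. The sketch
below follows the equation exactly as written here, including the
\((P_0 \hat{\beta})^2\) denominator; the function name `ss_whittemore` and
its arguments are hypothetical, and `beta` is assumed to be the quantile
level for \(z_{\texttt{beta}}\).

```r
# Hypothetical helper: names and defaults are illustrative assumptions.
# b_hat: estimated coefficient; p0: P_0; p_hat: P(x = 0).
ss_whittemore <- function(b_hat, p0, p_hat, alpha = 0.05, beta = 0.8) {
  z_alpha <- qnorm(1 - alpha)
  z_beta <- qnorm(beta)            # assumption: beta is the quantile level
  term_alpha <- sqrt(1 / (1 - p_hat) + 1 / p_hat)
  term_beta <- sqrt(1 / (1 - p_hat) + 1 / (p_hat * exp(b_hat)))
  lead <- 1 + 2 * p0               # lead (correction) term
  lead * (z_alpha * term_alpha + z_beta * term_beta)^2 / (p0 * b_hat)^2
}

n_w <- ss_whittemore(b_hat = 0.7, p0 = 0.1, p_hat = 0.5)
```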
As above, these can be adjusted in the multivariable case:
$$n_m = \frac{n}{1 - R^2}$$
In this case, \(R^2\) is the squared Pearson correlation between the
observed values of coeff, \(y_i\), and the fitted values \(P_i\)
from a logistic regression with coeff
as the response
and the other predictors as covariates.
The calculation uses \(\bar{P}\), the mean probability (mean of the
fitted values from the model):
$$R^2 = \frac{\left(\sum_{i=1}^n (y_i - \bar{P})(P_i - \bar{P})\right)^2}{
\sum_{i=1}^n (y_i - \bar{P})^2 \sum_{i=1}^n (P_i - \bar{P})^2}$$
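The \(R^2\) calculation and the multivariable adjustment can be
illustrated on simulated data. Everything in this sketch is an assumption
made for illustration: the variable names, the simulated predictors `x2`
and `x3`, and the unadjusted sample size `n = 200` are hypothetical.

```r
set.seed(42)
n_obs <- 500
# Hypothetical "other predictors" in the model
x2 <- rnorm(n_obs)
x3 <- rnorm(n_obs)
# Hypothetical binary predictor of interest, correlated with x2 and x3
coeff <- rbinom(n_obs, 1, plogis(0.8 * x2 - 0.5 * x3))

# Logistic regression with coeff as the response
fit <- glm(coeff ~ x2 + x3, family = binomial)
P <- fitted(fit)       # fitted probabilities P_i
P_bar <- mean(P)       # mean probability

# R^2 as in the formula above
R2 <- sum((coeff - P_bar) * (P - P_bar))^2 /
  (sum((coeff - P_bar)^2) * sum((P - P_bar)^2))

n <- 200               # hypothetical unadjusted sample size
n_m <- n / (1 - R2)    # adjusted for the other predictors
```

Because a logistic model with an intercept has fitted values whose mean
equals the mean of the response, `R2` here should agree with
`cor(coeff, P)^2` up to the convergence tolerance of `glm`.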