This is an implementation of the sample size calculation formula
derived by Hsieh and Lavori (2000)
for the following Cox proportional hazards regression in epidemiological
studies:
$$h(t|x_1, \boldsymbol{x}_2)=h_0(t)\exp\left(\beta_1 x_1+\boldsymbol{\beta}_2
\boldsymbol{x}_2\right),$$
where the covariate \(X_1\) is a nonbinary variable and
\(\boldsymbol{X}_2\) is a vector of other covariates.
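Suppose we want to test whether
the hazard ratio of the main effect \(X_1=1\) to \(X_1=0\), that is, the hazard ratio for a one-unit increase in \(X_1\), is equal to
\(1\) (the null hypothesis) or is equal to \(\exp(\beta_1)=\theta\) (the alternative hypothesis).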
Given the type I error rate \(\alpha\) for a two-sided test, the total
number of subjects required to achieve a power of \(1-\beta\) (where \(\beta\) without a subscript denotes the type II error rate) is
$$n=\frac{\left(z_{1-\alpha/2}+z_{1-\beta}\right)^2}{
[\log(\theta)]^2 \sigma^2 \psi (1-\rho^2)
},$$
where \(z_{a}\) is the \(100 a\)-th percentile of the standard normal distribution, \(\sigma^2=\mathrm{Var}(X_1)\), \(\psi\) is the proportion of subjects who died of
the disease of interest, and \(\rho\) is the multiple correlation coefficient
of the following linear regression:
$$x_1=b_0+\boldsymbol{b}^T\boldsymbol{x}_2.$$
That is, \(\rho^2=R^2\), where \(R^2\) is the proportion of variance
explained by the regression of \(X_1\) on the vector of covariates
\(\boldsymbol{X}_2\).
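
As an illustration, the formula for \(n\) can be evaluated directly. The following is a minimal Python sketch, not the package's own implementation: the function name `sample_size_cox_continuous`, its argument names, and the choice to round up to the next whole subject are assumptions made for this example only. It uses only the standard library, taking normal percentiles from `statistics.NormalDist`.

```python
import math
from statistics import NormalDist


def sample_size_cox_continuous(theta, sigma2, psi, rho2, alpha=0.05, power=0.80):
    """Total number of subjects from the formula above (illustrative sketch).

    theta  : postulated hazard ratio exp(beta_1) for a one-unit increase in X_1
    sigma2 : variance of X_1
    psi    : proportion of subjects who die of the disease of interest
    rho2   : R^2 from regressing X_1 on the other covariates X_2
    alpha  : two-sided type I error rate
    power  : desired power, i.e. 1 - (type II error rate)
    """
    if theta <= 0 or theta == 1:
        raise ValueError("theta must be positive and different from 1")
    if sigma2 <= 0:
        raise ValueError("sigma2 must be positive")
    if not 0 < psi <= 1:
        raise ValueError("psi must lie in (0, 1]")
    if not 0 <= rho2 < 1:
        raise ValueError("rho2 must lie in [0, 1)")
    if not 0 < alpha < 1 or not 0 < power < 1:
        raise ValueError("alpha and power must lie in (0, 1)")

    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # z_{1 - alpha/2}
    z_beta = std_normal.inv_cdf(power)           # z_{1 - beta}

    numerator = (z_alpha + z_beta) ** 2
    denominator = math.log(theta) ** 2 * sigma2 * psi * (1 - rho2)

    # Rounding up is an assumption of this sketch, so that the
    # returned count of subjects is never below the formula's value.
    return math.ceil(numerator / denominator)


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to show the call.
    n = sample_size_cox_continuous(theta=1.5, sigma2=0.25, psi=0.5, rho2=0.2)
    print(n)
```

Because \(\log(\theta)\) enters the formula squared, hazard ratios \(\theta\) and \(1/\theta\) give the same \(n\). A larger \(\rho^2\), that is, stronger correlation between \(X_1\) and the other covariates, increases the required number of subjects through the factor \(1-\rho^2\) in the denominator.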