Sample size calculation for testing an interaction effect in Cox proportional hazards regression with two covariates for epidemiological studies. Both covariates should be binary variables. The formula takes into account the correlation between the two covariates.
ssizeEpiInt(X1,
            X2,
            failureFlag,
            power,
            theta,
            alpha = 0.05)
numeric. An nPilot by 1 vector, where nPilot is the number of subjects in the pilot data set. This vector records the values of the covariate of interest for the nPilot subjects in the pilot study. X1 should be binary and take only two possible values: zero and one.
numeric. An nPilot by 1 vector, where nPilot is the number of subjects in the pilot study. This vector records the values of the second covariate for the nPilot subjects in the pilot study. X2 should be binary and take only two possible values: zero and one.
numeric. An nPilot by 1 vector of indicators showing whether a subject failed (failureFlag=1) or is alive (failureFlag=0).
numeric. postulated power.
numeric. postulated hazard ratio.
numeric. type I error rate.
the total number of subjects required.
estimated \(Pr(X_1=1)\)
estimated \(Pr(X_2=1)\)
estimated \(Pr(X_1=1 | X_2=0)\)
estimated \(Pr(X_1=1 | X_2=1)\)
square of the estimated \(corr(X_1, X_2)\)
a factor adjusting the sample size. The sample size needed to detect an effect of a prognostic factor with given error probabilities has to be multiplied by the factor G when an interaction of the same magnitude is to be detected.
estimated number of subjects taking values \(X_1=0\) and \(X_2=0\).
estimated number of subjects taking values \(X_1=0\) and \(X_2=1\).
estimated number of subjects taking values \(X_1=1\) and \(X_2=0\).
estimated number of subjects taking values \(X_1=1\) and \(X_2=1\).
proportion of subjects who died of the disease of interest.
This is an implementation of the sample size calculation formula derived by Schmoor et al. (2000) for the following Cox proportional hazards regression in epidemiological studies: $$h(t|x_1, x_2)=h_0(t)\exp(\beta_1 x_1+\beta_2 x_2 + \gamma (x_1 x_2)),$$ where both covariates \(X_1\) and \(X_2\) are binary variables.
Suppose we want to test whether the hazard ratio of the interaction effect \(X_1 X_2=1\) to \(X_1 X_2=0\) is equal to \(1\) or is equal to \(\exp(\gamma)=\theta\). Given the type I error rate \(\alpha\) for a two-sided test, the total number of subjects required to achieve the desired power \(1-\beta\) is: $$n=\frac{\left(z_{1-\alpha/2}+z_{1-\beta}\right)^2 G}{ [\log(\theta)]^2 \psi (1-p) p (1-\rho^2) },$$ where \(z_{a}\) is the \(100 a\)-th percentile of the standard normal distribution, \(\psi\) is the proportion of subjects who died of the disease of interest, $$\rho=corr(X_1, X_2)=(p_1-p_0)\times\sqrt{\frac{q(1-q)}{p(1-p)}},$$ \(p=Pr(X_1=1)\), \(q=Pr(X_2=1)\), \(p_0=Pr(X_1=1|X_2=0)\), \(p_1=Pr(X_1=1 | X_2=1)\), and $$G=\frac{[(1-q)(1-p_0)p_0+q(1-p_1)p_1]^2}{(1-q)q (1-p_0)p_0 (1-p_1) p_1}.$$ In terms of the counts mya, myb, myc, and myd, \(p_0=Pr(X_1=1 | X_2=0)=myc/(mya+myc)\), \(p_1=Pr(X_1=1 | X_2=1)=myd/(myb+myd)\), \(p=Pr(X_1=1)=(myc+myd)/n\), and \(q=Pr(X_2=1)=(myb+myd)/n\), where \(n=mya+myb+myc+myd\).
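The displayed formula can be transcribed into R as a rough sketch. The helper name `ssize_int_sketch` and its argument names are hypothetical, chosen only for illustration. This is not the package's own code: it returns the unrounded value of \(n\) and leaves any rounding to the caller.

```r
# Hypothetical helper (not part of the package): a direct transcription of
# the displayed formulas for rho, G, and n.
ssize_int_sketch <- function(p, q, p0, p1, psi, theta, power, alpha = 0.05) {
  # correlation between the two binary covariates
  rho <- (p1 - p0) * sqrt(q * (1 - q) / (p * (1 - p)))
  # adjustment factor G
  G <- ((1 - q) * (1 - p0) * p0 + q * (1 - p1) * p1)^2 /
    ((1 - q) * q * (1 - p0) * p0 * (1 - p1) * p1)
  # standard normal quantiles z_{1 - alpha/2} and z_{1 - beta}
  z_alpha <- qnorm(1 - alpha / 2)
  z_beta <- qnorm(power)
  # total number of subjects (unrounded)
  n <- (z_alpha + z_beta)^2 * G /
    (log(theta)^2 * psi * (1 - p) * p * (1 - rho^2))
  list(n = n, rho2 = rho^2, G = G)
}

# Uncorrelated, balanced covariates: p0 = p1 = p = 0.5 and q = 0.5,
# so rho = 0 and G reduces to 1 / (q * (1 - q)) = 4.
res <- ssize_int_sketch(p = 0.5, q = 0.5, p0 = 0.5, p1 = 0.5,
                        psi = 0.75, theta = 3, power = 0.88)
res$G
res$rho2
```

When \(p_0=p_1=p\), the numerator of \(G\) becomes \([(1-p)p]^2\) and the denominator becomes \((1-q)q[(1-p)p]^2\), so \(G=1/[(1-q)q]\). The inputs in the call above are illustrative values, not results from any study.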
\(p_{00}=Pr(X_1=0 \mbox{ and } X_2=0)\), \(p_{01}=Pr(X_1=0 \mbox{ and } X_2=1)\), \(p_{10}=Pr(X_1=1 \mbox{ and } X_2=0)\), \(p_{11}=Pr(X_1=1 \mbox{ and } X_2=1)\).
\(p_{00}\), \(p_{01}\), \(p_{10}\), \(p_{11}\), and \(\psi\) will be estimated from the pilot data.
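As a minimal sketch of how these quantities might be estimated from pilot data, the snippet below uses plain sample proportions from a 2 x 2 table. It assumes that sample proportions are the intended estimators. The toy vectors are invented for illustration, and the variable names `tab` and `nPilot` are not taken from the package.

```r
# Toy pilot data, invented for illustration only
X1 <- c(1, 1, 1, 0, 0, 0, 1, 0, 1, 0)
X2 <- c(1, 0, 1, 0, 1, 0, 0, 1, 1, 0)
failureFlag <- c(1, 1, 0, 1, 1, 0, 1, 1, 1, 0)

# 2 x 2 table of counts; rows index X1 (0, 1), columns index X2 (0, 1).
# Fixing the factor levels keeps all four cells even if one is empty.
tab <- table(factor(X1, levels = c(0, 1)), factor(X2, levels = c(0, 1)))
nPilot <- sum(tab)

# joint cell proportions p_{00}, p_{01}, p_{10}, p_{11}
p00 <- tab[1, 1] / nPilot
p01 <- tab[1, 2] / nPilot
p10 <- tab[2, 1] / nPilot
p11 <- tab[2, 2] / nPilot

# marginal and conditional probabilities
p  <- p10 + p11          # Pr(X1 = 1)
q  <- p01 + p11          # Pr(X2 = 1)
p0 <- p10 / (p00 + p10)  # Pr(X1 = 1 | X2 = 0)
p1 <- p11 / (p01 + p11)  # Pr(X1 = 1 | X2 = 1)

# proportion of pilot subjects who failed
psi <- mean(failureFlag)
```

An empty cell in the table makes a conditional probability equal to zero or one, or undefined, so a pilot data set with too few subjects in any \((X_1, X_2)\) combination would need separate handling.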
Schmoor C., Sauerbrei W., and Schumacher M. (2000). Sample size considerations for the evaluation of prognostic factors in survival analysis. Statistics in Medicine. 19:441-452.
# NOT RUN {
library(powerSurvEpi)  # package that provides ssizeEpiInt
# generate a toy pilot data set
X1 <- c(rep(1, 39), rep(0, 61))
set.seed(123456)
X2 <- sample(c(0, 1), 100, replace = TRUE)
failureFlag <- sample(c(0, 1), 100, prob = c(0.25, 0.75), replace = TRUE)
ssizeEpiInt(X1 = X1,
            X2 = X2,
            failureFlag = failureFlag,
            power = 0.88,
            theta = 3,
            alpha = 0.05)
# }