phmm: Proportional Hazards Model with Mixed Effects

Description

Fits a proportional hazards regression model incorporating random effects. The function implements an EM algorithm using Markov Chain Monte Carlo (MCMC) at the E-step as described in Vaida and Xu (2000).

Usage

phmm(formula, data, subset, na.action = na.fail, 
  Sigma = "identity", varcov = "diagonal", NINIT = 10, 
  VARSTART = 1, MAXSTEP = 100, CONVERG = 90, Gbs = 100, 
  Gbsvar = 1000, verbose = FALSE, maxtime = 120, random)

Arguments

formula

model formula for the fixed and random components of the model (as in lmer). An intercept is implicitly included in the model by estimation of the error distribution. As a consequence -1<

data

optional data frame in which to interpret the variables occuring in the formulas.

subset

subset of the observations to be used in the fit.

na.action

function to be used to handle any NAs in the data. The user is discouraged to change a default value na.fail.

Sigma

initial covariance matrix for the random effects. Defaults to "identity".

varcov

constraint on Sigma. Currently only "diagonal" is supported.

NINIT

number of starting values supplied to Adaptive Rejection Metropolis Sampling (ARMS) algorithm.

VARSTART

starting value of the variances of the random effects.

MAXSTEP

number of EM iterations.

CONVERG

iteration after which Gibbs sampling size changes from Gbs to Gbsvar.

Gbs

initial Gibbs sampling size (until CONVERG iterations).

Gbsvar

Gibbs sampling size after CONVERG iterations.

verbose

Set to TRUE to print EM steps.

maxtime

maximum time in seconds, before aborting EM iterations. Defaults to 120 seconds.

random

The argument random is no longer used. Random components are are expressed in formula.

Value

The function produces an object of class "phmm" consisting of:
stepsa matrix of estimates at each EM step;
bhatempirical Bayes estimates of expectation of random effects;
sdbhatempirical Bayes estimates of standard deviation of random effects;
coefthe final parameter estimates for the fixed effects;
varthe estimated variance-covariance matrix;
loglika vector of length four with the conditional log-likelihood and marginal log-likelihood estimated by Laplace approximation, reciprocal importance sampling, and bridge sampling (only implemented for nreff < 3);
lambdathe estimated baseline hazard;
Lambdathe estimated cumulative baseline hazard.

Details

The proportional hazards model with mixed effects is equipped to handle clustered survival data. The model generalizes the usual frailty model by allowing log-linearl multivariate random effects. The software can only handle random effects from a multivariate normal distribution. Maximum likelihood estimates of the regression parameters and variance components is gotten by EM algorithm, with Markov chain Monte Carlo (MCMC) used in the E-step.

Care must be taken to ensure the MCMC-EM algorithm has converged, as the algorithm stops after MAXSTEP iterations. No convergence criteria is implemented. It is advised to plot the estimates at each iteration using the plot.phmm method. For more on MCMC-EM convergence see Booth and Hobart (1999).

References

Gilks WR and Wild P. (1992) Adaptive rejection sampling for Gibbs sampling. Applied Statistics 41, pp 337-348.

Donohue, MC, Overholser, R, Xu, R, and Vaida, F (January 01, 2011). Conditional Akaike information under generalized linear and proportional hazards mixed models. Biometrika, 98, 3, 685-700.

Vaida F and Xu R. 2000. "Proportional hazards model with random effects", Statistics in Medicine, 19:3309-3324.

Gamst A, Donohue M, and Xu R (2009). Asymptotic properties and empirical evaluation of the NPMLE in the proportional hazards mixed-effects model. Statistica Sinica, 19, 997.

Xu R, Gamst A, Donohue M, Vaida F, and Harrington DP. 2006. Using Profile Likelihood for Semiparametric Model Selection with Application to Proportional Hazards Mixed Models. Harvard University Biostatistics Working Paper Series, Working Paper 43.

Booth JG and Hobert JP. Maximizing generalized linear mixed model likelihoods with an automated Monte Carlo EM algorithm. Journal of the Royal Statistical Society, Series B 1999; 61:265-285.

Examples

Run this code

n <- 50      # total sample size
nclust <- 5  # number of clusters
clusters <- rep(1:nclust,each=n/nclust)
beta0 <- c(1,2)
set.seed(13)
#generate phmm data set
Z <- cbind(Z1=sample(0:1,n,replace=TRUE),
           Z2=sample(0:1,n,replace=TRUE),
           Z3=sample(0:1,n,replace=TRUE))
b <- cbind(rep(rnorm(nclust),each=n/nclust),rep(rnorm(nclust),each=n/nclust))
Wb <- matrix(0,n,2)
for( j in 1:2) Wb[,j] <- Z[,j]*b[,j]
Wb <- apply(Wb,1,sum)
T <- -log(runif(n,0,1))*exp(-Z[,c('Z1','Z2')]%*%beta0-Wb)
C <- runif(n,0,1)
time <- ifelse(T<C,T,C)
event <- ifelse(T<=C,1,0)
mean(event)
phmmd <- data.frame(Z)
phmmd$cluster <- clusters
phmmd$time <- time
phmmd$event <- event

fit.phmm <- phmm(Surv(time, event) ~ Z1 + Z2 + (-1 + Z1 + Z2 | cluster), 
   phmmd, Gbs = 100, Gbsvar = 1000, VARSTART = 1,
   NINIT = 10, MAXSTEP = 100, CONVERG=90)
summary(fit.phmm)
plot(fit.phmm)

Run the code above in your browser using DataLab