survfit(formula, data, weights, subset, na.action,
newdata, individual=F, conf.int=.95, se.fit=T,
type=c("kaplan-meier","fleming-harrington", "fh2"),
error=c("greenwood","tsiatis"),
conf.type=c("log","log-log","plain","none"),
conf.lower=c("usual", "peto", "modified"))
basehaz(fit,centered=TRUE)
A formula object or a coxph object. If a formula object is supplied it must have a Surv object as the response on the left of the ~ operator and, if desired, terms separated by + operators on the right.
One of the [...]
[...] subset and the weights argument.
[...] subset argument.
[...] subset argument has been used. Default is options()$na.action.
[...] coxph formula. Only applicable when formula is a coxph object. The curve(s) produced will be representative of a cohort whose covariates correspond [...]
[...] TRUE.
[...] "kaplan-meier", "fleming-harrington" or "fh2" if a formula is given and "aalen" or "kaplan-meier" if the first argument is a coxph object [...]
[...] "greenwood" for the Greenwood formula or "tsiatis" for the Tsiatis formula (only the first character is necessary). The default is "tsiatis" when a coxph object is given, and it is "greenwood" otherwise.
[...] "none", "plain", "log" (the default), or "log-log". Only enough of the string to uniquely identify it is necessary. The first option causes confidence intervals not to be generated. The second [...]
[...] coxph object.
A survfit object; see the help on survfit.object for details. Methods defined for survfit objects are provided for print, plot, lines, and points.
For basehaz, a data frame with the baseline hazard, times, and strata.
When curves are computed from a coxph fit, subject weights of exp(sum(coef*(x-center))) are used, ignoring any value for weights input by the user. There is also an extra term in the variance of the curve, due to the variance of the coefficients and hence variance in the computed weights.
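The weight expression can be sketched in base R as follows. The coefficients and covariate values are hypothetical, and taking center to be the covariate means is an assumption made for the illustration; none of this is the package's internal code.

```r
# Hypothetical coefficients and covariates (illustrative values only)
coef <- c(age = 0.05, treat = -0.40)
x <- rbind(c(age = 60, treat = 1),
           c(age = 45, treat = 0),
           c(age = 72, treat = 1))
center <- colMeans(x)  # assumption: centering at the covariate means

# One weight per subject: exp(sum(coef * (x - center)))
risk_weight <- apply(x, 1, function(xi) exp(sum(coef * (xi - center))))

# Equivalent vectorised form of the same expression
risk_weight2 <- as.vector(exp(sweep(x, 2, center) %*% coef))

risk_weight
```

A subject whose covariates equal center gets weight exp(0) = 1, so the weights are relative to that reference subject.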
The Greenwood formula for the variance is a sum of terms
d/(n*(n-m)), where d is the number of deaths at a given time point, n
is the sum of weights
for all individuals still at risk at that time, and
m is the sum of weights
for the deaths at that time. The
justification is based on a binomial argument when weights are all
equal to one; extension to the weighted case is ad hoc. Tsiatis
(1981) proposes a sum of terms d/(n*n), based on a counting process
argument which includes the weighted case.
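As a small illustration of the two sums described above, the following base R sketch accumulates both sets of terms for a hypothetical risk table with unit weights (so m equals d). The counts are made up for the example.

```r
# Hypothetical risk table with unit weights:
# d deaths among n subjects at risk at each distinct death time
d <- c(1, 2, 1, 3)
n <- c(20, 18, 15, 10)
m <- d  # with unit weights, the weight sum for the deaths equals the death count

greenwood_var <- cumsum(d / (n * (n - m)))  # Greenwood terms d/(n*(n-m))
tsiatis_var   <- cumsum(d / (n * n))        # Tsiatis terms   d/(n*n)

cbind(n, d, greenwood_var, tsiatis_var)
```

Because n - m is smaller than n whenever there is a death, each Greenwood term is larger than the matching Tsiatis term, and the gap widens as the risk set shrinks.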
The two variants of the F-H estimate have to do with how ties are handled.
If there were 3 deaths out of 10 at risk, then the first would increment
the hazard by 3/10 and the second by 1/10 + 1/9 + 1/8. For curves created
after a Cox model these correspond to the Breslow and Efron estimates,
respectively, and the proper choice is made automatically.
The fh2 method will give results closer to the Kaplan-Meier estimate.
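The tie example above (3 deaths out of 10 at risk) can be checked directly in base R. The comparison with the Kaplan-Meier step is added here for illustration, using the survival after the step, exp(-increment), against 1 - d/n.

```r
d <- 3   # tied deaths at this time point
n <- 10  # number at risk

inc_fh  <- d / n                          # first variant: 3/10
inc_fh2 <- sum(1 / (n - seq_len(d) + 1))  # second variant: 1/10 + 1/9 + 1/8

# Survival after this single step under each variant, and under Kaplan-Meier
surv_fh  <- exp(-inc_fh)
surv_fh2 <- exp(-inc_fh2)
surv_km  <- 1 - d / n

c(fh = surv_fh, fh2 = surv_fh2, km = surv_km)
```

In this example the fh2 step lies between the fh step and the Kaplan-Meier step.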
Based on the work of Link (1984), the log transform is expected to produce the most accurate confidence intervals. If there is heavy censoring, then based on the work of Dorey and Korn (1987) the modified estimate will give a more reliable confidence band for the tails of the curve.
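To make the difference between the plain and log intervals concrete, here is a minimal base R sketch. It assumes the usual forms: plain is estimate plus or minus z times the standard error of the survival estimate, and log applies the same step on the log(survival) scale and transforms back. The survival value and standard error are hypothetical, and the code is not taken from the package.

```r
surv   <- 0.10  # hypothetical survival estimate in the tail of the curve
se_log <- 0.60  # hypothetical standard error of log(survival)
z <- qnorm(1 - (1 - 0.95) / 2)  # two-sided 95% normal quantile

# Plain interval: symmetric on the survival scale (delta method: se(S) = S * se_log)
plain <- c(lower = surv - z * surv * se_log,
           upper = surv + z * surv * se_log)

# Log interval: symmetric on the log(survival) scale, then transformed back
logci <- c(lower = surv * exp(-z * se_log),
           upper = surv * exp( z * se_log))

rbind(plain, logci)
```

With these values the plain lower limit falls below zero while the log lower limit stays positive, which is one practical difference between the two in the tails.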
Fleming, T. R. and Harrington, D. P. (1984). Nonparametric estimation of the survival distribution in censored data. Comm. in Statistics 13, 2469-2486.
Kalbfleisch, J. D. and Prentice, R. L. (1980). The Statistical Analysis of Failure Time Data. Wiley, New York.
Link, C. L. (1984). Confidence intervals for the survival function using Cox's proportional hazards model with covariates. Biometrics 40, 601-610.
Tsiatis, A. (1981). A large sample study of the estimate for the integrated hazard function in Cox's regression model for survival data. Annals of Statistics 9, 93-108.
print.survfit, plot.survfit, lines.survfit, summary.survfit, coxph, Surv, strata.
# fit a Kaplan-Meier curve and plot it
library(survival)
data(aml)
fit <- survfit(Surv(time, status) ~ x, data = aml)
plot(fit)
# plot only 1 of the 2 curves from above
plot(fit[2])
# fit a Cox proportional hazards model and plot the
# predicted survival curve
data(ovarian)
fit <- coxph(Surv(futime, fustat) ~ resid.ds + rx + ecog.ps, data = ovarian)
plot(survfit(fit))