km.rs: Kaplan-Meier and Reduced Sample Estimator using Histograms

Description

Compute the Kaplan-Meier and Reduced Sample estimators of a survival time distribution function, using histogram techniques.

Usage

km.rs(o, cc, d, breaks)

Value

A list with five elements:

rs: reduced-sample estimate of the survival time c.d.f. \(F(t)\)
km: Kaplan-Meier estimate of the survival time c.d.f. \(F(t)\)
hazard: corresponding Nelson-Aalen estimate of the hazard rate \(\lambda(t)\)
r: values of \(t\) for which \(F(t)\) is estimated
breaks: the breakpoints vector

Arguments

o: vector of observed survival times
cc: vector of censoring times
d: vector of non-censoring indicators
breaks: vector of breakpoints to be used to form histograms

Author

Adrian Baddeley Adrian.Baddeley@curtin.edu.au

and Rolf Turner rolfturner@posteo.net

Details

This function is intended mainly for internal use in spatstat, but may be useful in other applications where you want to form the Kaplan-Meier estimator from a very large dataset.

Suppose \(T_i\) are the survival times of individuals \(i=1,\ldots,M\) with unknown distribution function \(F(t)\) which we wish to estimate. Suppose these times are right-censored by random censoring times \(C_i\). Thus the observations consist of right-censored survival times \(\tilde T_i = \min(T_i,C_i)\) and non-censoring indicators \(D_i = 1\{T_i \le C_i\}\) for each \(i\).
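To make the data model concrete, the following base R sketch simulates right-censored observations of this form. The exponential survival times, the uniform censoring times and the variable name tt are assumptions chosen only for illustration; o, cc and d follow the argument names used on this page.

```r
set.seed(1)                       # arbitrary seed, for reproducibility only
M  <- 1000                        # number of individuals
tt <- rexp(M, rate = 1)           # true survival times T_i (assumed exponential)
cc <- runif(M, min = 0, max = 3)  # censoring times C_i (assumed uniform)
o  <- pmin(tt, cc)                # observed times min(T_i, C_i)
d  <- tt <= cc                    # TRUE where the survival time was observed
mean(d)                           # fraction of uncensored observations
```

In practice tt is not available; only o, cc and d would be passed on to the estimator.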

The arguments to this function are vectors o, cc, d of observed values of \(\tilde T_i\), \(C_i\) and \(D_i\) respectively. The function computes histograms and forms the reduced-sample and Kaplan-Meier estimates of \(F(t)\) by invoking the functions kaplan.meier and reduced.sample. This is efficient when the common length of o, cc and d (i.e. the number of observations) is large.

The returned vectors km and hazard are (histogram approximations to) the Kaplan-Meier estimator of \(F(t)\) and its hazard rate \(\lambda(t)\). Specifically, km[k] is an estimate of F(breaks[k+1]), and hazard[k] is an estimate of the average of \(\lambda(t)\) over the interval (breaks[k],breaks[k+1]). This approximation is exact only if the survival times are discrete and the histogram breaks are fine enough to ensure that each interval (breaks[k],breaks[k+1]) contains only one possible value of the survival time.
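The histogram idea can be illustrated with a minimal base R sketch. The helper km_hist and its return element at are hypothetical names introduced here; this is not the spatstat implementation, it omits the hazard and reduced-sample estimates, and it assumes that breaks span all observed times. In the toy data the survival times are integers and each interval contains exactly one possible value, so the histogram estimate coincides with the usual product-limit estimate.

```r
# Hypothetical helper (not part of spatstat): Kaplan-Meier estimate of F
# computed from histogram counts only.
km_hist <- function(o, d, breaks) {
  # Assign each observed time to an interval (breaks[k], breaks[k+1]];
  # the first interval also includes its left endpoint.
  bin <- cut(o, breaks = breaks, include.lowest = TRUE)
  obs <- as.vector(table(bin))        # histogram of all observed times
  nco <- as.vector(table(bin[d]))     # histogram of uncensored times only
  atrisk <- rev(cumsum(rev(obs)))     # number at risk entering each interval
  # Conditional survival through each interval (1 if nobody is at risk)
  surv <- ifelse(atrisk > 0, 1 - nco / atrisk, 1)
  # km[k] estimates F(breaks[k+1])
  list(km = 1 - cumprod(surv), at = breaks[-1])
}

# Toy data: discrete times, one possible value per histogram interval
o <- c(1, 2, 2, 3, 4)
d <- c(TRUE, FALSE, TRUE, TRUE, FALSE)
breaks <- seq(0.5, 4.5, by = 1)
res <- km_hist(o, d, breaks)
res$km   # approximately 0.2, 0.4, 0.7, 0.7
```

Working from counts per interval rather than from the sorted observations is what keeps the cost low for large datasets: after the two histograms are formed, the remaining work depends only on the number of breakpoints.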