kaplan.meier(obs, nco, breaks, upperobs=0)
Suppose $T[i]$ are the survival times of individuals $i=1,\ldots,M$ with unknown distribution function $F(t)$ which we wish to estimate. Suppose these times are right-censored by random censoring times $C[i]$. Thus the observations consist of right-censored survival times $T*[i] = min(T[i],C[i])$ and non-censoring indicators $D[i] = 1(T[i] <= C[i])$ for each $i$.
If the number of observations $M$ is large, it is efficient to use histograms. Form the histogram obs of all observed times $T*[i]$. That is, obs[k] counts the number of values $T*[i]$ in the interval (breaks[k],breaks[k+1]] for $k > 1$ and [breaks[1],breaks[2]] for $k = 1$. Also form the histogram nco of all uncensored times, i.e. those $T*[i]$ such that $D[i]=1$. These two histograms are the arguments passed to kaplan.meier.
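As a minimal sketch in base R, the two histograms could be formed with hist, whose default right-closed intervals with include.lowest = TRUE match the interval convention described above. The simulated data and the names surv, cens, tstar and d are illustrative assumptions only.

```r
set.seed(42)
M <- 1000
surv  <- rexp(M, rate = 1)       # hypothetical true survival times T[i]
cens  <- runif(M, 0, 3)          # hypothetical censoring times C[i]
tstar <- pmin(surv, cens)        # observed times T*[i] = min(T[i], C[i])
d     <- as.integer(surv <= cens)  # non-censoring indicators D[i]

# Breaks include 0 and span the data, because every tstar is at most 3
breaks <- seq(0, 3, by = 0.1)

# Histogram of all observed times, and of the uncensored times only
obs <- hist(tstar, breaks = breaks, plot = FALSE)$counts
nco <- hist(tstar[d == 1], breaks = breaks, plot = FALSE)$counts
```

Here obs and nco each have length(breaks) - 1 entries, and nco[k] can never exceed obs[k] because the uncensored times are a subset of the observed times.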
The vectors km and lambda returned by kaplan.meier are (histogram approximations to) the Kaplan-Meier estimator of $F(t)$ and its hazard rate $lambda(t)$. Specifically, km[k] is an estimate of F(breaks[k+1]), and lambda[k] is an estimate of the average of $lambda(t)$ over the interval (breaks[k],breaks[k+1]).
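To make the meaning of km and lambda concrete, the following is a rough sketch of a product-limit calculation on histogram counts. It is not the package's own code: km_sketch is a hypothetical helper, and two choices in it are assumptions made for illustration, namely that the number at risk is counted at the left end of each interval and that the average hazard is taken as minus the log of the interval survival probability divided by the interval width.

```r
# Hypothetical helper for illustration; not the implementation of kaplan.meier
km_sketch <- function(obs, nco, breaks, upperobs = 0) {
  # Number of individuals still at risk at the start of each interval
  atrisk <- rev(cumsum(rev(obs))) + upperobs
  # Estimated conditional probability of surviving each interval
  s <- ifelse(atrisk > 0, 1 - nco / atrisk, 1)
  # Estimate of F at the right endpoint of each interval
  km <- 1 - cumprod(s)
  # Average hazard over each interval (Inf where s is 0)
  lambda <- -log(s) / diff(breaks)
  list(km = km, lambda = lambda)
}

# Toy histograms on three unit-width intervals
breaks <- c(0, 1, 2, 3)
obs <- c(4, 3, 3)   # all observed times per interval
nco <- c(2, 1, 2)   # uncensored times per interval
est <- km_sketch(obs, nco, breaks)
```

In this toy case the numbers at risk are 10, 6 and 3, so the interval survival probabilities are 0.8, 5/6 and 1/3, and est$km is 0.2, 1/3 and 7/9.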
The histogram breaks must include $0$. If the histogram breaks do not span the range of the observations, it is important to count how many survival times $T*[i]$ exceed the rightmost breakpoint, and give this as the value upperobs.
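A small sketch of this bookkeeping in base R, using made-up data (tstar, d and inrange are hypothetical names): the times beyond the last break are counted separately, and only the in-range times are binned, since hist stops with an error when a value lies outside the breaks.

```r
tstar  <- c(0.2, 0.7, 1.1, 1.8, 2.5, 3.4)  # hypothetical observed times T*[i]
d      <- c(1,   0,   1,   1,   0,   1)    # hypothetical indicators D[i]
breaks <- c(0, 1, 2)                       # rightmost breakpoint is 2

# Count the observed times that exceed the rightmost breakpoint
inrange  <- tstar <= max(breaks)
upperobs <- sum(!inrange)

# Bin only the times that fall within the breaks
obs <- hist(tstar[inrange], breaks = breaks, plot = FALSE)$counts
nco <- hist(tstar[inrange & d == 1], breaks = breaks, plot = FALSE)$counts
```

The resulting values would then be passed as kaplan.meier(obs, nco, breaks, upperobs = upperobs), following the usage shown at the top.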
reduced.sample, km.rs