reduced.sample(nco, cen, ncc, show=FALSE, uppercen=0)
If show=FALSE, a numeric vector giving the values of
the reduced sample estimator.
If show=TRUE, a list with three components which are
vectors of equal length.

Suppose $T_i$ are the survival times of individuals $i=1,\ldots,M$ with unknown distribution function $F(t)$ which we wish to estimate. Suppose these times are right-censored by random censoring times $C_i$. Thus the observations consist of right-censored survival times $\tilde T_i = \min(T_i,C_i)$ and non-censoring indicators $D_i = 1\{T_i \le C_i\}$ for each $i$.
If the number of observations $M$ is large, it is efficient to
use histograms.
Form the histogram cen of all censoring times $C_i$.
That is, cen[k] counts the number of values $C_i$ in the interval
(breaks[k],breaks[k+1]] for $k > 1$
and [breaks[1],breaks[2]] for $k = 1$.
Also form the histogram nco of all uncensored times,
i.e. those $\tilde T_i$ such that $D_i=1$,
and the histogram ncc of all censoring times for which the survival time
is uncensored, i.e. those $C_i$ such that $D_i=1$.
These three histograms are the arguments passed to reduced.sample.
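The construction of the three histograms can be sketched in base R as follows. This is a minimal illustration, not code from the package: the simulated data, the variable names surv, cens, tobs, d and the helper count are assumptions made for the example, and all three histograms are assumed to share one vector of breaks.

```r
set.seed(1)
M    <- 1000
surv <- rexp(M, rate = 1)       # hypothetical survival times T_i
cens <- runif(M, 0, 3)          # hypothetical censoring times C_i
tobs <- pmin(surv, cens)        # right-censored times: min(T_i, C_i)
d    <- surv <= cens            # non-censoring indicators D_i

# Common breaks spanning all censoring times (here the interval [0, 3]).
breaks <- seq(0, 3, length.out = 31)

# With its defaults (right = TRUE, include.lowest = TRUE), hist() counts
# values in (breaks[k], breaks[k+1]], with the first cell closed on the left.
count <- function(x) hist(x, breaks = breaks, plot = FALSE)$counts

cen <- count(cens)              # all censoring times
nco <- count(tobs[d])           # uncensored survival times
ncc <- count(cens[d])           # censoring times of uncensored observations

# These vectors would then be supplied as
#   reduced.sample(nco, cen, ncc)
```

Each histogram has length(breaks) - 1 cells, so the three vectors have equal length.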
The return value rs is the reduced-sample estimator
of the distribution function $F(t)$. Specifically,
rs[k] is the reduced sample estimate of F(breaks[k+1]).
The value is exact, i.e. the use of histograms does not introduce any
approximation error.
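The exactness claim can be checked numerically with a small sketch. It assumes that the reduced sample estimate at time $t$ is the proportion of individuals with $C_i > t$ whose survival time satisfies $T_i \le t$; that formula, the helper rs_from_hist and the simulated data are assumptions for illustration and are not taken from the package source.

```r
set.seed(2)
M    <- 500
surv <- rexp(M)                 # hypothetical survival times T_i
cens <- runif(M, 0, 2)          # hypothetical censoring times C_i
tobs <- pmin(surv, cens)
d    <- surv <= cens

breaks <- seq(0, 2, by = 0.1)
# cut() bins into (breaks[k], breaks[k+1]] with the first cell closed,
# without the small fuzz that hist() applies to the breaks.
count <- function(x) as.vector(table(cut(x, breaks, include.lowest = TRUE)))
cen <- count(cens); nco <- count(tobs[d]); ncc <- count(cens[d])

# Hypothetical histogram-based computation (assumed formula, see above).
rs_from_hist <- function(nco, cen, ncc, uppercen = 0) {
  numer <- cumsum(nco) - cumsum(ncc)          # #{T_i <= b, C_i > b}
  denom <- sum(cen) + uppercen - cumsum(cen)  # #{C_i > b}
  numer / denom
}
rs_hist <- rs_from_hist(nco, cen, ncc)

# Direct computation from the raw data at each right-hand break.
right <- breaks[-1]
rs_direct <- sapply(right, function(b) sum(surv <= b & cens > b) / sum(cens > b))

# Compare wherever the denominator is nonzero.
ok <- is.finite(rs_direct)
isTRUE(all.equal(rs_hist[ok], rs_direct[ok]))
```

Under this assumed definition the two computations agree at every break point, because the estimate at breaks[k+1] depends on the data only through counts of values on either side of that break.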
Note that, for the results to be valid, either the histogram breaks
must span the censoring times, or the number of censoring times
that do not fall in a histogram cell must have been counted in uppercen.
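As a sketch of the second case, suppose the breaks stop short of the largest censoring time. The censoring times beyond the last break can be counted separately and supplied as uppercen. The data and the names cens and inside are hypothetical.

```r
set.seed(3)
cens   <- runif(200, 0, 5)          # hypothetical censoring times C_i
breaks <- seq(0, 4, by = 0.5)       # deliberately does not reach 5

inside   <- cens <= max(breaks)     # censoring times covered by the breaks
cen      <- hist(cens[inside], breaks = breaks, plot = FALSE)$counts
uppercen <- sum(!inside)            # censoring times beyond the last break

# Every censoring time is accounted for exactly once.
stopifnot(sum(cen) + uppercen == length(cens))

# The count would then be passed along with the histograms, e.g.
#   reduced.sample(nco, cen, ncc, uppercen = uppercen)
```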
kaplan.meier, km.rs