Each interval in the original data is cut at the given points; if
an original row were (15, 60] with a cut vector of (10, 30, 40), the
resulting data set would have intervals of (15, 30], (30, 40] and
(40, 60].
Each row in the final data set will lie completely within one of the
cut intervals. The episode variable shows which interval each row of
the output falls in, where 1 = less than the first cutpoint,
2 = between the first and the second, etc.
For the example above the values would be 2, 3, and 4.
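
The split described above could be reproduced roughly as follows. This is a minimal sketch that assumes the survival package is installed; the data frame d and its grp covariate are invented for the illustration.

```r
library(survival)

# Hypothetical one-row data set: a subject followed over (15, 60] with an
# event at 60. The grp column is an invented covariate, included only so
# that "~ ." has something to retain.
d <- data.frame(tstart = 15, tstop = 60, status = 1, grp = "a")

out <- survSplit(Surv(tstart, tstop, status) ~ ., data = d,
                 cut = c(10, 30, 40), episode = "episode")

# One output row per piece of the original interval; only the last piece
# carries the event indicator.
out[, c("tstart", "tstop", "status", "episode")]
```

The cutpoint at 10 lies before the start of the interval, so it produces no row of its own, which is why the episode numbering begins at 2.
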
The routine is called with a formula as the first argument.
The right hand side of the formula can be used to delimit variables
that should be retained; normally one will use ~ . as a
shorthand to retain them all. The routine
will try to retain variable names, e.g. Surv(adam, joe, fred) ~ .
will result in a data set with those same variable names for the
start, end, and event options rather than
the defaults. Any user specified values for these options will be
used if they are present, of course.
However, the routine is not sophisticated; it only does this
substitution for simple names. A call of Surv(time, stat==2)
for instance will not retain "stat" as the name of the event variable.
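
The naming behaviour can be checked with a small sketch such as the one below. The data frame, the coding of stat (1 = censored, 2 = event) and the column name died are all assumptions made for the illustration.

```r
library(survival)

# Invented data: stat is assumed to be coded 1 = censored, 2 = event.
d <- data.frame(time = c(8, 12), stat = c(2, 1), grp = c("a", "b"))

# The event argument is an expression rather than a simple name, so the
# event column falls back to the default name; "time" is a simple name
# and is kept for the end of each interval.
out1 <- survSplit(Surv(time, stat == 2) ~ ., data = d, cut = 5)
names(out1)

# A user specified value for the event option takes precedence.
out2 <- survSplit(Surv(time, stat == 2) ~ ., data = d, cut = 5,
                  event = "died")
names(out2)
```
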
Rows of data with a missing time or status are copied across
unchanged, unless the na.action argument is changed from its default
value of na.pass. In that case, however, any row with a missing
value in any variable will be removed, which is rarely
what is desired.