a data frame with two extra attributes, tm.retain and tcount.
The first contains the names of the key variables, and records which names
correspond to tdc or event variables.
The tcount attribute contains counts of the match types.
New time values that occur before the first interval for a subject
are "early", those after the last interval for a subject are "late",
and those that fall into a gap are of type "gap".
All of these are considered to be outside the specified time frame for the
given subject. An event of this type will be discarded.
An observation in data2 whose identifier matches no rows in data1
is of type "missid" and is also discarded.
A time-dependent covariate value will be applied to later intervals but
will not generate a new time point in the output.
The most common type will usually be "within", corresponding to
those new times that
fall inside an existing interval and cause it to be split into two.
Observations that fall exactly on the edge of an interval but within the
(min, max] time for a subject are counted
as being on a "leading" edge, "trailing" edge or "boundary".
The first corresponds, for instance, to an occurrence at 17 for someone
with intervals of (0, 15] and (17, 35].
A tdc at time 17 will affect the (17, 35] interval, but an event at 17
would be ignored.
An event occurrence at 15 would count in the (0, 15] interval.
The last case is where the main data set has touching
intervals for a subject, e.g. (17, 28] and (28, 35], and a new occurrence
lands at the join. Events will go to the earlier interval and counts
to the latter one. A last column shows the number of additions
where the id and time point were identical.
When this occurs, the tdc and event operators will use
the final value in the data (last edit wins), ignoring missing values,
while the cumtdc and cumevent operators add up the values.
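
The handling of tied additions described above can be illustrated with a
small Python sketch. It is not the package's code: the function names
combine_last and combine_sum are hypothetical, missing values are
represented by None, and skipping missing values when summing is an
assumption of this sketch rather than something the text states.

```python
def combine_last(values):
    """Mimic the tdc/event rule: the last non-missing value wins."""
    kept = [v for v in values if v is not None]
    # If every value is missing there is nothing to keep.
    return kept[-1] if kept else None


def combine_sum(values):
    """Mimic the cumtdc/cumevent rule: tied values are added up.

    Assumption of this sketch: missing values (None) are skipped.
    """
    return sum(v for v in values if v is not None)


# Three additions for the same id and time point, the last one missing.
tied = [2, 5, None]
last_value = combine_last(tied)   # the final non-missing entry
total_value = combine_sum(tied)   # the sum of the non-missing entries
```

Here last_value holds the value a tdc-style operator would keep and
total_value the value a cumulative operator would accumulate, under the
assumptions above.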
These extra attributes are ephemeral and will be discarded
if the data frame is modified. This is intentional, since they would
become invalid if, for instance, a subset were selected.
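
The match types counted in tcount can be made concrete with a short
Python sketch that classifies a single new time point against one
subject's intervals. This is only an illustration of the rules as
described above, not the package's implementation. The function name
classify_time is hypothetical, intervals are assumed to be sorted,
non-overlapping (start, stop] pairs, and the "missid" and tied cases are
left out. A time exactly equal to the subject's first start is not
spelled out in the text; the sketch labels it "leading" as an assumption.

```python
def classify_time(t, intervals):
    """Classify time t against sorted, non-overlapping (start, stop] pairs."""
    first_start = intervals[0][0]
    last_stop = intervals[-1][1]
    if t < first_start:
        return "early"      # before the subject's first interval
    if t > last_stop:
        return "late"       # after the subject's last interval

    starts = {start for start, _ in intervals}
    stops = {stop for _, stop in intervals}
    if t in starts and t in stops:
        return "boundary"   # lands on the join of two touching intervals
    if t in starts:
        # Assumption: t equal to the very first start is also "leading".
        return "leading"
    if t in stops:
        return "trailing"

    for start, stop in intervals:
        if start < t < stop:
            return "within"  # would split this interval into two
    return "gap"             # between two non-touching intervals


# The examples used in the text.
with_gap = [(0, 15), (17, 35)]
touching = [(17, 28), (28, 35)]
labels = [classify_time(t, with_gap) for t in (-1, 10, 15, 16, 17, 40)]
join_label = classify_time(28, touching)
```

With the intervals (0, 15] and (17, 35], the times -1, 10, 15, 16, 17
and 40 are labelled early, within, trailing, gap, leading and late in
turn, and a time of 28 against (17, 28] and (28, 35] is labelled
boundary.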