Description of the rate tables used by expected survival routines.
A rate table contains event rates per unit time for some particular
endpoint. Death rates are the most common use; the survexp.us
table, for instance, contains death rates for the United States by
year of age, sex, and calendar year.
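As a quick orientation, the table named above can be inspected with ordinary base R accessors. This is a minimal sketch, assuming the survival package is installed and that survexp.us becomes available once the package is attached; no output is shown because the exact extents depend on the installed version.

```r
library(survival)   # assumption: survexp.us ships with this package

# A rate table is an array with extra attributes, so the usual
# accessors apply.
class(survexp.us)             # class of the object
dim(survexp.us)               # extent of each dimension
names(dimnames(survexp.us))   # names used to match user data
names(attributes(survexp.us)) # all attributes carried by the table
```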
A rate table is structured as a multi-way array with the following attributes:
dim: the dimensions of the array
dimnames: a named list of dimnames. The names are used to
match user data to the dimensions, e.g., see the rmap
argument in the pyears example. If a dimension is
categorical, such as sex in survexp.us, then the
dimname itself is matched against the user's data values. The matching
ignores case and allows abbreviations, e.g., "M", "Male", and "m"
all successfully match the survexp.us dimname of
sex=c("male", "female").
type: a vector giving the type of each dimension, which will
be 1 = categorical, 2 = continuous, 3 = date, 4 = calendar year of a
US rate table.
If type is 3 or 4, then the corresponding cutpoints must be
one of the calendar date types: Date, POSIXt, date, or chron.
This allows the code to properly match user data to the ratetable.
(The published US decennial rate tables' definition is that a
subject does not begin to experience a new year's death rate on
Jan 1, but rather on their next birthday. The actual impact of
this delay on any given subject's calculation is negligible, but the code
has always tried to be correct.)
cutpoints: a list with one element per dimension. If
type=1 then the corresponding list element should be NULL,
otherwise it should be a vector of length dim[i] containing
the starting point of the interval to which the corresponding
row/col of the array applies. Cutpoints must be in the same units
as the underlying table, e.g., the survexp.us table
contains death rates per day, so the age cutpoint vector contains
age in days while year contains a vector of Dates.
Cutpoints do not need to be evenly
spaced: the survexp.us table, for instance,
originally had age divided up
as 0-1 days, 1-7 days, 7-28 days, 28 days - 1 year, 2, 3, …
119 years. (Changes in the source of the tables made it difficult
to continue splitting out the first year.)
summary: an optional summarization function. If present, it will be called with a numeric matrix that has one column per dimension and one row per observation. The function returns a character string giving a summary of the data. This is used by some routines to print an informative message, and provides one way to inform users of a data mistake, e.g., if the printout states that all subjects are between 0.14 and 0.23 years old it is likely that the user's age variable was in years when it should have been in days.
dimid: optional attribute containing the names of the dimnames. If the dimnames list itself has names, this attribute will be ignored.
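The attributes described above can be illustrated with a small hand-built table. What follows is a sketch in base R only, not the package's own construction or lookup code: the object toy, its made-up rates, the helper match_level, and the 365.25 days-per-year conversion are all assumptions introduced for illustration, and pmatch and findInterval are stand-ins for whatever matching the expected survival routines actually perform.

```r
# Hypothetical table: 3 age bands x 2 sexes, rates per day (invented numbers).
rates <- array(c(1e-5, 2.0e-5, 8e-5,    # male:   ages 0+, 40+, 70+
                 1e-5, 1.5e-5, 6e-5),   # female: ages 0+, 40+, 70+
               dim = c(3, 2),
               dimnames = list(age = c("0", "40", "70"),
                               sex = c("male", "female")))

toy <- structure(
  rates,
  type      = c(2, 1),                            # age continuous, sex categorical
  cutpoints = list(c(0, 40, 70) * 365.25, NULL),  # interval starts in days; NULL for sex
  summary   = function(R) {                       # one column per dimension
    yrs <- round(range(R[, 1]) / 365.25, 2)
    paste("age ranges from", yrs[1], "to", yrs[2], "years")
  },
  class = "ratetable")

# Categorical dimension: ignore case and allow abbreviations.
match_level <- function(x, levels) {
  pmatch(tolower(x), tolower(levels), duplicates.ok = TRUE)
}
sex_idx <- match_level(c("M", "female", "f"), attr(toy, "dimnames")$sex)

# Continuous dimension: find the interval whose starting point precedes each age.
age_days <- c(25, 50, 82) * 365.25
age_idx  <- findInterval(age_days, attr(toy, "cutpoints")[[1]])

# Daily rate for each subject, and the summary message for the same data.
subject_rate <- unclass(toy)[cbind(age_idx, sex_idx)]
msg <- attr(toy, "summary")(cbind(age_days, sex_idx))
```

Had the ages been supplied in years rather than days, the summary message would report subjects well under one year old, which is the kind of data mistake the summary function is meant to expose.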