The Cox model is a relative risk model; predictions
of type "linear predictor", "risk", and "terms" are all
relative to the sample from which they came. By default, the reference
value for each of these is the mean covariate within each stratum. The
primary underlying reason is statistical: a Cox model only predicts
relative risks between pairs of subjects within the same stratum, and
hence the addition
of a constant to any covariate, either overall or only within a
particular stratum, has no effect on the fitted results.
Using the reference="strata" option causes this to be true for
predictions as well.
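
The invariance described above can be checked numerically. The following is a minimal sketch, assuming the survival package and its bundled lung data set; the shift sizes (10 and -5) and the object names fit1, fit2, and lung_shifted are arbitrary choices made for illustration only.

```r
library(survival)

# Stratified fit on the original data.
fit1 <- coxph(Surv(time, status) ~ age + strata(sex), data = lung)

# Add a different constant to age within each stratum (illustrative values).
lung_shifted <- lung
lung_shifted$age <- lung$age + ifelse(lung$sex == 1, 10, -5)
fit2 <- coxph(Surv(time, status) ~ age + strata(sex), data = lung_shifted)

# The coefficient should be unchanged up to convergence tolerance.
coef_same <- all.equal(coef(fit1), coef(fit2), tolerance = 1e-6)

# With reference = "strata" the predictions are centered within each
# stratum, so they should be unaffected by the shift as well.
pred_same <- all.equal(
  predict(fit1, type = "lp", reference = "strata"),
  predict(fit2, type = "lp", reference = "strata"),
  tolerance = 1e-6
)
```

Both comparisons use a loose tolerance because the two fits are obtained by separate iterative estimation.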
(There have been occasional requests for reference="zero", i.e., a
hypothetical subject with all covariates equal to zero, in order to
match certain other packages' results.
The issue is that the results are often silly, e.g., risk relative to
a subject with height, weight, or blood pressure of zero.)
When the results of predict are used in further calculations it
may be desirable to use a fixed reference level.
Specifying reference="sample" uses the overall means, and agrees
with the linear.predictors component of the coxph object (which
uses the overall mean for backwards compatibility with older code).
Predictions of type="terms" are almost invariably passed
forward to further calculation, so for these we default to using
the sample as the reference.
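
As a rough illustration of the reference options, the sketch below compares the two centerings on a stratified fit. It assumes the survival package and its lung data set; the object names (lp_strata, lp_sample, spread_by_stratum, term_pred) are hypothetical and chosen only for this example.

```r
library(survival)

fit <- coxph(Surv(time, status) ~ age + strata(sex), data = lung)

# Linear predictors centered at the per-stratum means and at the
# overall sample means.
lp_strata <- predict(fit, type = "lp", reference = "strata")
lp_sample <- predict(fit, type = "lp", reference = "sample")

# The two versions should differ only by a constant within each stratum,
# so the spread of the differences inside a stratum should be about zero.
spread_by_stratum <- tapply(lp_strata - lp_sample, lung$sex,
                            function(d) diff(range(d)))

# According to the text, reference = "sample" should agree with the
# component stored on the fitted object.
all.equal(unname(lp_sample), unname(fit$linear.predictors))

# Term-wise predictions, one column per model term.
term_pred <- predict(fit, type = "terms")
```

Relative risks between two subjects in the same stratum are identical under either reference; only comparisons across strata, or against a fixed external value, depend on the choice.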
Predictions of type "expected" incorporate the baseline hazard and are
thus absolute instead of relative; the reference option has no effect
on these.
These values depend on the follow-up time for the future subjects as
well as on their covariates, so the newdata argument needs to include
both the right- and left-hand-side variables from the formula.
(The status variable will not be used, but is required since the
underlying code needs to reconstruct the entire formula.)
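
A minimal sketch of such a call follows, assuming the survival package and its lung data set. The data frame new_subjects and its values are invented for illustration; the status column is a placeholder included only so that the formula can be rebuilt.

```r
library(survival)

fit <- coxph(Surv(time, status) ~ age + sex, data = lung)

# Hypothetical future subjects: follow-up time and covariates are needed,
# and a status column must be present even though it is not used.
new_subjects <- data.frame(
  time   = c(180, 365),   # follow-up time, in the units of the fit (days)
  status = c(1, 1),       # placeholder values
  age    = c(60, 70),
  sex    = c(1, 2)
)

# Expected number of events for each subject over the stated follow-up.
expected_events <- predict(fit, newdata = new_subjects, type = "expected")
```

Omitting time or status from new_subjects should cause the call to fail, since the response part of the formula could not then be evaluated.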
Models that contain a frailty term are a special case: because of
technical difficulties, when a newdata argument is supplied the
predictions will always be for a random effect of zero.