flexsurvspline(formula, data, weights, bhazard, subset, k=0, knots=NULL, bknots=NULL, scale="hazard", timescale="log",...)
Surv
function, and any covariates are given on the
right-hand side. For example, Surv(time, dead) ~ age + sex
specifies a model where the log cumulative hazard (by default, see scale
) is a
linear function of the covariates age
and sex
.
If there are no covariates, specify 1
on the right hand side,
for example Surv(time, dead) ~ 1
.
Time-varying covariate effects can be specified using the method
described in flexsurvreg
for placing covariates on
ancillary parameters. The ancillary parameters here
are named gamma1
, ..., gammar
where r
is the
number of knots k
plus one (the ``degrees of freedom'' as
defined by Royston and Parmar). So for the default Weibull model,
there is just one ancillary parameter gamma1
.
Therefore a model with one internal spline knot, where the
equivalents of the Weibull shape and scale parameters, but not the
higher-order term gamma2
, vary with age and sex, can be
specified as:
Surv(time, dead) ~ age + sex + gamma1(age) + gamma1(sex)
or alternatively (and more safely, see flexsurvreg
)
Surv(time, dead) ~ age + sex, anc=list(gamma1=~age + sex)
Surv
objects of type="right"
,"counting"
,
"interval1"
or "interval2"
are supported, corresponding to
right-censored, left-truncated or interval-censored observations.
formula
.
If not given, the variables should be in the working environment.
k=0
gives
a Weibull, log-logistic or lognormal model, if "scale"
is
"hazard"
, "odds"
or "normal"
respectively. k
is equivalent
to df-1
in the notation of stpm
for Stata. The knots
are then chosen as equally-spaced quantiles of the log uncensored
survival times, for example, at the median with one knot,
or at the 33% and 67% quantiles of log time (or time, see
"timescale"
) with two knots.
To override this default knot placement, specify knots
instead.
"timescale"
). If not specified, knot
locations are chosen as described in k
above.
Either k
or knots
must be
specified. If both are specified, knots
overrides k
.
"timescale"
). If not
supplied, these are are chosen as the minimum and maximum log
death time.
"hazard"
, the log cumulative hazard is modelled as a spline
function. If "odds"
, the log cumulative odds is modelled as a spline
function.
If "normal"
, $-InvPhi(S(t))$ is modelled as a spline
function, where $InvPhi()$ is the
inverse normal distribution function qnorm
.
"log"
(the default) the log cumulative hazard (or
alternative) is modelled as a spline function of log time.
If "identity"
, it is modelled as a spline function of time.
flexsurvreg
, for example, anc
, inits
,
fixedpars
, weights
, subset
, na.action
, and
any options to control optimisation. See
flexsurvreg
. "flexsurvreg"
with the same elements as
described in flexsurvreg
, and including extra components
describing the spline model. See in particular:scale
of the model, hazard, odds or normal."gamma..."
, and covariate
effects are labelled with the names of the covariates.Coefficients gamma1,gamma2,...
here are the equivalent of
s0,s1,...
in Stata streg
, and gamma0
is the
equivalent of the xb
constant term. To reproduce results,
use the noorthog
option in Stata, since no orthogonalisation
is performed on the spline basis here.In the Weibull model, for example, gamma0,gamma1
are
-shape*log(scale), shape
respectively in
dweibull
or flexsurvreg
notation, or
(-Intercept/scale
, 1/scale
) in survreg
notation.In the log-logistic model with shape a
and scale b
(as
in dllogis
from the eha package),
1/b^a
is equivalent to exp(gamma0)
, and a
is
equivalent to gamma1
.In the log-normal model with log-scale mean mu
and standard
deviation sigma
, -mu/sigma
is equivalent to
gamma0
and 1/sigma
is equivalent to gamma1
.
flexsurvreg
by
dynamically constructing a custom distribution using
dsurvspline
, psurvspline
and unroll.function
.
In the spline-based survival model of Royston and Parmar (2002),
a transformation $g(S(t,z))$ of the survival function is modelled
as a natural cubic spline function of log time $x = log(t)$
plus linear effects of covariates $z$.$$g(S(t,z)) = s(x, \bm{\gamma}) + \bm{\beta}^T \mathbf{z}$$
The proportional hazards model (scale="hazard"
) defines
$g(S(t,z)) =
log(-log(S(t,z))) = log(H(t,z))$, the log
cumulative hazard.
The proportional odds model (scale="odds"
) defines $g(S(t,z)) = log(1/S(t,z) - 1)$, the log
cumulative odds.
The probit model (scale="normal"
) defines $g(S(t,z)) = -InvPhi(S(t,z))$,
where $InvPhi()$ is the
inverse normal distribution function qnorm
.
With no knots, the spline reduces to a linear function, and these models are equivalent to Weibull, log-logistic and lognormal models respectively.
The spline coefficients $gamma_j: j=1, 2 \ldots$, which are called the "ancillary parameters" above, may also be modelled as linear functions of covariates $\mathbf{z}$, as
$$\gamma_j(\mathbf{z}) = \gamma_{j0} + \gamma_{j1}z_1 + \gamma_{j2}z_2 + ... $$
giving a model where the effects of covariates are arbitrarily flexible functions of time: a non-proportional hazards or odds model.
Natural cubic splines are cubic splines constrained to be linear beyond boundary knots $kmin,kmax$. The spline function is defined as
$$s(x,\bm{\gamma}) = \gamma_0 + \gamma_1 x + \gamma_2 v_1(x) + \ldots + \gamma_{m+1} v_m(x)$$
where $vj(x)$ is the $j$th basis function
$$v_j(x) = (x - k_j)^3_+ - \lambda_j(x - k_{min})^3_+ - (1 - \lambda_j) (x - k_{max})^3_+$$
$$\lambda_j = \frac{k_{max} - k_j}{k_{max} - k_{min}}$$
and $(x - a)_+ = max(0, x - a)$.
Jackson, C. (2016). flexsurv: A Platform for Parametric Survival Modeling in R. Journal of Statistical Software, 70(8), 1-33. doi:10.18637/jss.v070.i08
flexsurvreg
for flexible survival modelling using
general parametric distributions. plot.flexsurvreg
and lines.flexsurvreg
to
plot fitted survival, hazards and cumulative hazards from models fitted
by flexsurvspline
and flexsurvreg
.
## Best-fitting model to breast cancer data from Royston and Parmar (2002)
## One internal knot (2 df) and cumulative odds scale
spl <- flexsurvspline(Surv(recyrs, censrec) ~ group, data=bc, k=1, scale="odds")
## Fitted survival
plot(spl, lwd=3, ci=FALSE)
## Simple Weibull model fits much less well
splw <- flexsurvspline(Surv(recyrs, censrec) ~ group, data=bc, k=0, scale="hazard")
lines(splw, col="blue", ci=FALSE)
## Alternative way of fitting the Weibull
## Not run:
# splw2 <- flexsurvreg(Surv(recyrs, censrec) ~ group, data=bc, dist="weibull")
# ## End(Not run)
Run the code above in your browser using DataLab