This function returns the data.frame for, e.g., a glm fit that is comparable to a ddhazard fit in the sense that it is a static version. For example, say that we bin time into the periods (0,1], (1,2] and (2,3]. Next, consider an individual who dies at time 2.5. He should be a control in the first two bins and a case in the last bin. Thus, the rows in the final data frame for this individual are c(Y = 1, ..., weights = 1) and c(Y = 0, ..., weights = 2), where Y is the outcome, ... is the covariates and weights is the weights for the regression. Consider another individual who does not die and whom we observe for all three periods. He yields a single row, c(Y = 0, ..., weights = 3).
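
The weighting logic in the example above can be sketched in base R. This is only an illustration of the idea, not the package's implementation: the helper weighted_rows and its arguments stop, event and breaks are hypothetical names, the covariates are left out, and the covariates are assumed to be fixed over time.

```r
# Hypothetical helper (not part of dynamichazard): collapse one individual's
# follow-up into at most two weighted rows, one for the case bin and one for
# all control bins. Assumes time-invariant covariates, which are omitted here.
weighted_rows <- function(stop, event, breaks = 0:3) {
  # Number of bins (a, b] in which the individual is at risk
  last_bin <- sum(breaks < stop)
  # The bin containing the event is a case; all other bins are controls
  n_control <- if (event) last_bin - 1 else last_bin
  out <- data.frame(Y = c(1, 0), weights = c(as.numeric(event), n_control))
  # Drop rows that represent zero bins
  out[out$weights > 0, , drop = FALSE]
}

died     <- weighted_rows(stop = 2.5, event = TRUE)   # dies at time 2.5
survived <- weighted_rows(stop = 3,   event = FALSE)  # observed in all bins
print(died)
print(survived)
```

The resulting rows could then be passed to glm with family = binomial and the weights column supplied through the weights argument.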
This function uses similar logic to ddhazard for individuals with time-varying covariates (see vignette("ddhazard", "dynamichazard") for details).
If use_weights = FALSE, then the two previously mentioned individuals will yield three rows each. The first individual will have c(Y = 0, t = 1, ..., weights = 1), c(Y = 0, t = 2, ..., weights = 1) and c(Y = 1, t = 3, ..., weights = 1), while the latter will have the three rows c(Y = 0, t = 1, ..., weights = 1), c(Y = 0, t = 2, ..., weights = 1) and c(Y = 0, t = 3, ..., weights = 1). This kind of data frame is useful if you want to fit a model with, e.g., the gam function in the mgcv package as described in Tutz et al. (2016).
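
The one-row-per-bin layout can likewise be sketched in base R. Again, this is an illustration under stated assumptions and not the package's code: expanded_rows, stop, event and breaks are hypothetical names, covariates are omitted, and t is taken to be the right end point of each bin as in the rows listed above.

```r
# Hypothetical helper (not part of dynamichazard): expand one individual's
# follow-up into one row per bin (a, b], each with unit weight.
expanded_rows <- function(stop, event, breaks = 0:3) {
  # Number of bins in which the individual is at risk
  last_bin <- sum(breaks < stop)
  bins <- seq_len(last_bin)
  data.frame(
    # Only the last bin at risk can be a case, and only if the event occurred
    Y = as.numeric(event & bins == last_bin),
    # Right end point of each bin
    t = breaks[bins + 1],
    weights = 1
  )
}

died     <- expanded_rows(stop = 2.5, event = TRUE)
survived <- expanded_rows(stop = 3,   event = FALSE)
print(died)
print(survived)
```

Keeping t as a column is what makes this layout convenient for smooth time effects, for example a term such as s(t) in a gam call, since every bin contributes its own row.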