Tune mtry to the optimal value with respect to out-of-bag error for an LTRCCIF model

Starting with the default value of mtry, search for the optimal value (with respect to the out-of-bag error estimate) of mtry for ltrccif.

tune.ltrccif(
formula,
data,
id,
mtryStart = NULL,
stepFactor = 2,
time.eval = NULL,
time.tau = NULL,
ntreeTry = 100L,
bootstrap = c("by.sub", "by.root", "none", "by.user"),
samptype = c("swor", "swr"),
sampfrac = 0.632,
samp = NULL,
na.action = "na.omit",
trace = TRUE,
doBest = FALSE,
plot = FALSE,
applyfun = NULL,
cores = NULL,
control = partykit::ctree_control(teststat = "quad", testtype = "Univ", mincriterion =
0, saveinfo = FALSE, minsplit = max(ceiling(sqrt(nrow(data))), 20), minbucket =
max(ceiling(sqrt(nrow(data))), 7), minprob = 0.01)
)
If doBest = FALSE (default), this returns the optimal mtry value of those searched.
If doBest = TRUE, this returns the ltrccif object produced with the optimal mtry.
formula
a formula object, with the response being a Surv object of the form Surv(tleft, tright, event).
data
a data frame containing n rows of left-truncated right-censored observations.
id
variable name of subject identifiers. If this is present, it will be searched for in the data data frame. Each group of rows in data with the same subject id represents the covariate path through time of a single subject. If not specified, the algorithm assumes data contains left-truncated and right-censored survival data with time-invariant covariates.
mtryStart
starting value of mtry; default is sqrt(nvar).
stepFactor
at each iteration, mtry is inflated (or deflated) by this value. The default value is 2.
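The inflate/deflate search can be pictured with a small base-R sketch. This is not the package's implementation: the helper `search_mtry`, the toy `oob_error` curve, the rounding (ceiling when deflating, floor when inflating) and the stop-when-no-improvement rule are all assumptions, loosely modeled on the search used by `randomForest::tuneRF`.

```r
# Hypothetical OOB error as a function of mtry. In a real run this value
# would come from fitting a forest with ntreeTry trees and scoring it
# out of bag; here it is a toy curve with its minimum at mtry = 4.
oob_error <- function(mtry) (mtry - 4)^2 / 100 + 0.2

# Assumed search rule: start at mtry_start, then walk down (deflate) and
# up (inflate) by step_factor while the error keeps improving.
search_mtry <- function(mtry_start, step_factor, nvar, err_fun) {
  errors <- c()
  errors[as.character(mtry_start)] <- err_fun(mtry_start)
  for (direction in c("deflate", "inflate")) {
    mtry_cur <- mtry_start
    repeat {
      mtry_new <- if (direction == "deflate") {
        max(1, ceiling(mtry_cur / step_factor))
      } else {
        min(nvar, floor(mtry_cur * step_factor))
      }
      if (mtry_new == mtry_cur) break            # reached a boundary
      err_new <- err_fun(mtry_new)
      errors[as.character(mtry_new)] <- err_new
      if (err_new >= errors[as.character(mtry_cur)]) break  # no improvement
      mtry_cur <- mtry_new
    }
  }
  errors
}

errors <- search_mtry(mtry_start = 2, step_factor = 2, nvar = 10,
                      err_fun = oob_error)
best_mtry <- as.integer(names(errors)[which.min(errors)])
```

With these toy settings the candidates visited are 2, 1, 4 and 8. A larger `stepFactor` visits fewer candidates but can step over a good intermediate value.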
time.eval
a vector of time points at which the estimated survival probabilities are evaluated.
time.tau
an optional vector, with the i-th entry giving the upper time limit for the computed survival probabilities for the i-th observation (i.e., survival probabilities are only computed at time.eval[time.eval <= time.tau[i]] for the i-th observation of interest).
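As a minimal illustration of the subsetting rule above, the following base-R snippet applies `time.eval[time.eval <= time.tau[i]]` to made-up values. The numbers and the name `eval_by_obs` are hypothetical and only show which time points would remain for each observation.

```r
# Made-up evaluation grid and per-observation upper limits.
time.eval <- c(1, 2, 3, 4, 5)
time.tau  <- c(2.5, 5, 0.5)

# For the i-th observation, keep only the time points up to time.tau[i].
eval_by_obs <- lapply(time.tau, function(tau) time.eval[time.eval <= tau])
```

Here the first observation keeps the time points 1 and 2, the second keeps all five, and the third keeps none because its limit lies below the smallest evaluation time.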
ntreeTry
number of trees used at the tuning step.
bootstrap
bootstrap protocol.
(1) If id is present, the choices are "by.sub" (the default), which bootstraps subjects, and "by.root", which bootstraps pseudo-subjects. Both can be with or without replacement (by default sampling is without replacement; see the option samptype below).
(2) If id is not specified, the data are bootstrapped by sampling with or without replacement.
Regardless of the presence of id, if "none" is chosen, data is not bootstrapped at all and is used in every individual tree. If "by.user" is chosen, the bootstrap specified by samp is used.
samptype
choices are swor (sampling without replacement) and swr (sampling with replacement). The default is sampling without replacement.
sampfrac
a fraction determining the proportion of subjects to draw without replacement when samptype = "swor". The default value is 0.632. More specifically, if id is present, 0.632 * N subjects, together with their pseudo-subject observations, are drawn without replacement (N denotes the number of subjects); otherwise, 0.632 * n is the requested size of the sample.
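The subject-level draw can be sketched in base R as follows. The toy data frame `dat`, the column name `ID` and the use of `round()` to turn 0.632 * N into a whole number are assumptions made for illustration; the source does not state how the package rounds.

```r
set.seed(1)

# Hypothetical long-format data: 5 subjects with 2 to 3 pseudo-subject rows each.
dat <- data.frame(ID = c(1, 1, 2, 2, 2, 3, 3, 4, 4, 5))

sampfrac <- 0.632
subjects <- unique(dat$ID)
N <- length(subjects)

# Number of subjects to draw; the rounding rule is an assumption.
n_draw <- round(sampfrac * N)

# Draw subjects without replacement, then keep every row of each drawn subject.
drawn  <- sample(subjects, size = n_draw, replace = FALSE)
in_bag <- dat[dat$ID %in% drawn, , drop = FALSE]
```

Sampling whole subjects keeps each covariate path intact inside a tree's sample, which is the point of drawing at the subject level.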
samp
bootstrap specification when bootstrap = "by.user". An array of dimension n x ntree specifying how many times each record appears in each bootstrap sample.
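One way to build a count array of this shape in base R is sketched below. The sizes `n` and `ntree`, the subsample size and the record-level draw without replacement are illustrative assumptions, not requirements stated by the package.

```r
set.seed(42)

n     <- 6   # number of records (rows of data), hypothetical
ntree <- 4   # number of trees, hypothetical
size  <- round(0.632 * n)  # records drawn per tree (assumed rounding)

# Column b holds, for tree b, how many times each record was drawn.
samp <- sapply(seq_len(ntree), function(b) {
  idx <- sample.int(n, size = size, replace = FALSE)
  tabulate(idx, nbins = n)
})
```

Each column sums to the subsample size. Drawing with `replace = TRUE` instead would give counts larger than 1 for records drawn more than once.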
na.action
action taken if the data contains NA's. The default "na.omit" removes the entire record if any of its entries is NA (for x-variables this applies only to those specifically listed in formula). See function cforest for other available options.
trace
whether to print the progress of the search. trace = TRUE is set by default.
doBest
whether to fit an ltrccif object using the optimal mtry found. doBest = FALSE is set by default.
plot
whether to plot the out-of-bag error as a function of mtry. plot = FALSE is set by default.
applyfun
an optional lapply-style function with arguments function(X, FUN, ...). It is used for computing the variable selection criterion. The default is to use the basic lapply function unless the cores argument is specified (see below). See ctree_control.
cores
numeric. See ctree_control.
control
a list with control parameters, see cforest. The default values correspond to those used by ltrccif.
sbrier_ltrc for evaluation of model fit when searching for the optimal value of mtry.
### Example with data pbcsample
library(survival)
library(LTRCforests)  # provides tune.ltrccif and the pbcsample data
Formula = Surv(Start, Stop, Event) ~ age + alk.phos + ast + chol + edema
## mtry tuned by the OOB procedure with stepFactor 3, using 10 trees per candidate.
mtryT = tune.ltrccif(formula = Formula, data = pbcsample, id = ID, stepFactor = 3,
                     ntreeTry = 10L)