Fast OpenMP computing of Breiman's random forest for a variety of data settings including right-censored survival, regression, and classification.
RFSRCModel(
ntree = 1000,
mtry = integer(),
nodesize = integer(),
nodedepth = integer(),
splitrule = character(),
nsplit = 10,
block.size = integer(),
samptype = c("swor", "swr"),
membership = FALSE,
sampsize = if (samptype == "swor") function(x) 0.632 * x else function(x) x,
nimpute = 1,
ntime = integer(),
proximity = c(FALSE, TRUE, "inbag", "oob", "all"),
distance = c(FALSE, TRUE, "inbag", "oob", "all"),
forest.wt = c(FALSE, TRUE, "inbag", "oob", "all"),
xvar.wt = numeric(),
split.wt = numeric(),
var.used = c(FALSE, "all.trees", "by.tree"),
split.depth = c(FALSE, "all.trees", "by.tree"),
do.trace = FALSE,
statistics = FALSE
)

RFSRCFastModel(
ntree = 500,
sampsize = function(x) min(0.632 * x, max(x^0.75, 150)),
ntime = 50,
terminal.qualts = FALSE,
...
)
MLModel class object.
ntree: number of trees.
mtry: number of variables randomly selected as candidates for splitting a node.
nodesize: minimum size of terminal nodes.
nodedepth: maximum depth to which a tree should be grown.
splitrule: splitting rule (see rfsrc).
nsplit: non-negative integer value for number of random splits to consider for each candidate splitting variable.
block.size: interval number of trees at which to compute the cumulative error rate.
samptype: whether bootstrap sampling is with or without replacement.
membership: logical indicating whether to return terminal node membership.
sampsize: function specifying the bootstrap size.
nimpute: number of iterations of the missing data imputation algorithm.
ntime: integer number of time points to constrain ensemble calculations for survival outcomes.
proximity: whether and how to return proximity of cases as measured by the frequency of sharing the same terminal nodes.
distance: whether and how to return distance between cases as measured by the ratio of the sum of edges from each case to the root node.
forest.wt: whether and how to return the forest weight matrix.
xvar.wt: vector of non-negative weights representing the probability of selecting a variable for splitting.
split.wt: vector of non-negative weights used for multiplying the split statistic for a variable.
var.used: whether and how to return variables used for splitting.
split.depth: whether and how to return minimal depth for each variable.
do.trace: number of seconds between updates to the user on approximate time to completion.
statistics: logical indicating whether to return split statistics.
terminal.qualts: logical indicating whether to return terminal node membership information.
...: arguments passed to RFSRCModel.
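To make the sampsize defaults concrete, the following base R sketch evaluates the three default bootstrap-size functions shown in the usage above. The function names (swor_size, swr_size, fast_size) and the sample sizes in n are hypothetical and chosen only for illustration; how the underlying fitting routine rounds these values is not covered here.

```r
# Default bootstrap-size functions, copied from the usage above
swor_size <- function(x) 0.632 * x   # RFSRCModel with samptype = "swor"
swr_size  <- function(x) x           # RFSRCModel with samptype = "swr"
fast_size <- function(x) min(0.632 * x, max(x^0.75, 150))  # RFSRCFastModel

# Hypothetical numbers of cases in a training set
n <- c(100, 500, 1000, 10000)

sizes <- data.frame(
  n = n,
  swor = swor_size(n),
  swr = swr_size(n),
  # min() and max() collapse vectors, so apply fast_size one value at a time
  fast = vapply(n, fast_size, numeric(1))
)
print(sizes)
```

The RFSRCFastModel default never exceeds 0.632 times the number of cases and grows much more slowly for large data sets: it evaluates to 150 for 500 cases and to 1000 for 10000 cases.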
Response types: factor, matrix, numeric, Surv
Automatic tuning of grid parameters: mtry, nodesize
Default argument values and further model details can be found in the source See Also links below.
In calls to varimp for RFSRCModel, argument type may be specified as "anti" (default) for cases assigned to the split opposite of the random assignments, as "permute" for permutation of OOB cases, or as "random" for permutation replaced with random assignment. Variable importance is automatically scaled to range from 0 to 100. To obtain unscaled importance values, set scale = FALSE. See example below.
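As a further sketch of the type and scale arguments described above, the code below requests the default "anti" importance on the 0 to 100 scale and the "permute" importance unscaled. It assumes the MachineShop and randomForestSRC packages are installed; the value ntree = 100 and the object names vi_anti and vi_permute are arbitrary choices for illustration.

```r
## Requires the MachineShop and randomForestSRC packages
library(MachineShop)

# Smaller forest than the default to keep the sketch quick (ntree is arbitrary)
model_fit <- fit(sale_amount ~ ., data = ICHomes, model = RFSRCModel(ntree = 100))

# Default type ("anti"), scaled to range from 0 to 100
vi_anti <- varimp(model_fit, method = "model", type = "anti")

# Permutation of OOB cases, with unscaled importance values
vi_permute <- varimp(model_fit, method = "model", type = "permute", scale = FALSE)

print(vi_anti)
print(vi_permute)
```

Because the two calls differ in both type and scaling, the printed values are not directly comparable; changing one argument at a time makes it easier to see its effect.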
rfsrc, rfsrc.fast, fit, resample
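## Requires prior installation of suggested package randomForestSRC to run
library(MachineShop)

# Fit a random forest to the ICHomes data
model_fit <- fit(sale_amount ~ ., data = ICHomes, model = RFSRCModel)

# Model-specific variable importance, scaled to range from 0 to 100
varimp(model_fit, method = "model", type = "random", scale = TRUE)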