rtemis (version 0.79)

s.IRF: Iterative Random Forest [C, R]

Description

Train iterative Random Forests for regression or classification using iRF

Usage

s.IRF(x, y = NULL, x.test = NULL, y.test = NULL, x.name = NULL,
  y.name = NULL, n.trees = 1000, n.iter = 5, n.bootstrap = 30,
  interactions.return = NULL, classwt = NULL, ipw = TRUE,
  upsample = FALSE, upsample.seed = NULL, autotune = FALSE,
  n.trees.try = 500, stepFactor = 2, mtry = NULL, mtryStart = NULL,
  mtry.select.prob = NULL, proximity = FALSE, importance = TRUE,
  replace = TRUE, min.node.size = 1, strata = NULL,
  sampsize = NULL, tune.do.trace = FALSE, print.tune.plot = FALSE,
  print.plot = TRUE, plot.fitted = NULL, plot.predicted = NULL,
  plot.theme = getOption("rt.fit.theme", "lightgrid"),
  n.cores = rtCores, question = NULL, verbose = TRUE, trace = 0,
  outdir = NULL, save.mod = ifelse(!is.null(outdir), TRUE, FALSE), ...)

Arguments

x

Numeric vector or matrix / data frame of features, i.e. independent variables

y

Numeric vector of outcome, i.e. dependent variable

x.test

Numeric vector or matrix / data frame of testing set features. Columns must correspond to columns in x

y.test

Numeric vector of testing set outcome

x.name

Character: Name for feature set

y.name

Character: Name for outcome

n.trees

Integer: Number of trees to grow. Default = 1000

classwt

Vector, Float: Priors of the classes for classification only. Need not add up to 1

ipw

Logical: If TRUE, apply inverse probability weighting (for Classification only). Note: If weights are provided, ipw is not used. Default = TRUE

upsample

Logical: If TRUE, upsample training set cases not belonging in majority outcome group

upsample.seed

Integer: If provided, will be used to set the seed during upsampling. Default = NULL (random seed)

autotune

Logical: If TRUE, use randomForest::tuneRF to determine mtry

n.trees.try

Integer: Number of trees to train for tuning, if autotune = TRUE

stepFactor

Float: If autotune = TRUE, at each tuning iteration, mtry is multiplied or divided by this value. Default = 2

mtry

[gS] Integer: Number of features sampled randomly at each split

mtryStart

Integer: If autotune = TRUE, start at this value for mtry

proximity

Logical: If TRUE, calculate proximity measure among cases. Default = FALSE

importance

Logical: If TRUE, estimate variable relative importance. Default = TRUE

replace

Logical: If TRUE, sample cases with replacement during training. Default = TRUE

strata

Vector, Factor: Will be used for stratified sampling

sampsize

Integer: Size of sample to draw. In Classification, if strata is defined, this can be a vector of the same length, in which case, corresponding values determine how many cases are drawn from the strata.

tune.do.trace

Same as do.trace but for tuning, if autotune = TRUE

print.tune.plot

Logical: passed to randomForest::tuneRF. Default = FALSE

print.plot

Logical: if TRUE, produce plot using mplot3. Takes precedence over plot.fitted and plot.predicted

plot.fitted

Logical: if TRUE, plot True (y) vs Fitted

plot.predicted

Logical: if TRUE, plot True (y.test) vs Predicted. Requires x.test and y.test

plot.theme

String: "zero", "dark", "box", "darkbox"

n.cores

Integer: Number of cores to use. Defaults to available cores reported by future::availableCores(), unless option rt.cores is set at the time the library is loaded

question

String: the question you are attempting to answer with this model, in plain language.

verbose

Logical: If TRUE, print summary to screen.

outdir

String, Optional: Path to directory to save output

save.mod

Logical: If TRUE, save all output as an RDS file in outdir. save.mod is TRUE by default if an outdir is defined. If set to TRUE and no outdir is defined, outdir defaults to paste0("./s.", mod.name)

...

Additional arguments to be passed to iRF::iRF

Value

rtMod object

Details

If autotune = TRUE, randomForest::tuneRF will be run to determine the best mtry value.
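
Examples

A minimal sketch of calling s.IRF on synthetic regression data, assuming the rtemis and iRF packages are installed. The dataset, the train/test split, and the inspected fields are illustrative, not taken from the package documentation.

```r
## Not run:
library(rtemis)

# Synthetic data: 10 Gaussian features, outcome depends on the first two
set.seed(2019)
x <- as.data.frame(matrix(rnorm(200 * 10), 200, 10))
y <- x[[1]] + x[[2]]^2 + rnorm(200)

# Hold out the last 50 cases as a test set
mod <- s.IRF(x[1:150, ], y[1:150],
             x.test = x[151:200, ], y.test = y[151:200],
             n.trees = 500)

# The returned rtMod object carries fitted values, predictions, and errors
mod
## End(Not run)
```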

See Also

elevate for external cross-validation

Other Supervised Learning: s.ADABOOST, s.ADDTREE, s.BART, s.BAYESGLM, s.BRUTO, s.C50, s.CART, s.CTREE, s.DA, s.ET, s.EVTREE, s.GAM.default, s.GAM.formula, s.GAMSEL, s.GAM, s.GBM3, s.GBM, s.GLMNET, s.GLM, s.GLS, s.H2ODL, s.H2OGBM, s.H2ORF, s.KNN, s.LDA, s.LM, s.MARS, s.MLRF, s.MXN, s.NBAYES, s.NLA, s.NLS, s.NW, s.POLYMARS, s.PPR, s.PPTREE, s.QDA, s.QRNN, s.RANGER, s.RFSRC, s.RF, s.SGD, s.SPLS, s.SVM, s.TFN, s.XGBLIN, s.XGB

Other Tree-based methods: s.ADABOOST, s.ADDTREE, s.BART, s.C50, s.CART, s.CTREE, s.ET, s.EVTREE, s.GBM3, s.GBM, s.H2OGBM, s.H2ORF, s.MLRF, s.PPTREE, s.RANGER, s.RFSRC, s.RF, s.XGB