These functions output lists of default settings for other rtemis functions. This removes the need to pass long named lists of arguments, and provides autocompletion, making it easier to set up function calls without referring to the manual.
rtset.resample(resampler = "kfold", n.resamples = 10,
stratify.var = NULL, train.p = 0.75, strat.n.bins = 4,
target.length = NULL, seed = NULL, verbose = TRUE)
rtset.grid.resample(resampler = "strat.boot", n.resamples = 10,
stratify.var = NULL, train.p = 0.75, strat.n.bins = 4,
target.length = NULL, verbose = TRUE)
rtset.bag.resample(resampler = "strat.sub", n.resamples = 10,
stratify.var = NULL, train.p = 0.75, strat.n.bins = 4,
target.length = NULL, verbose = TRUE)
rtset.meta.resample(resampler = "strat.sub", n.resamples = 4,
stratify.var = NULL, train.p = 0.75, strat.n.bins = 4,
target.length = NULL, verbose = TRUE)
rtset.cv.resample(resampler = "kfold", n.resamples = 10,
stratify.var = NULL, train.p = 0.75, strat.n.bins = 4,
target.length = NULL, verbose = TRUE)
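For instance, resampler settings can be created once and reused. This sketch assumes the rtemis package is installed and loaded, and uses only arguments shown in the signatures above:

```r
library(rtemis)

# 10-fold cross-validation settings with a fixed seed for reproducibility
res.set <- rtset.resample(resampler = "kfold", n.resamples = 10, seed = 2019)

# Stratified bootstrap settings with a fixed training set size
boot.set <- rtset.resample(resampler = "strat.boot", n.resamples = 25,
                           target.length = 500)
```

The returned list can be passed wherever a resample definition is expected, e.g. the grid.resample.rtset argument of rtset.GBM.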
rtset.cluster(type = "fork", hosts = NULL, n.cores = rtCores, ...)
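For example, on a Unix-like system where fork clusters are available (a sketch assuming rtemis is loaded):

```r
# Settings for a fork cluster using 4 local cores (macOS/Linux only)
cl.set <- rtset.cluster(type = "fork", n.cores = 4)
```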
rtset.color(n = 101, colors = NULL, space = "rgb", lo = "#01256E",
lomid = NULL, mid = "white", midhi = NULL, hi = "#95001A",
colorbar = FALSE, cb.mar = c(1, 1, 1, 1), ...)
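A typical call builds a diverging palette around white, matching the defaults above (sketch; assumes rtemis is loaded):

```r
# 101-color diverging gradient from dark blue through white to dark red
grad.set <- rtset.color(n = 101, lo = "#01256E", mid = "white", hi = "#95001A",
                        space = "rgb")
```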
rtset.preprocess(completeCases = FALSE, impute = FALSE,
impute.type = "missForest", impute.niter = 10, impute.ntree = 500,
impute.discrete = getMode, impute.numeric = mean,
removeCases.thres = NULL, removeFeatures.thres = NULL,
integer2factor = FALSE, nonzeroFactors = FALSE, scale = FALSE,
center = FALSE, removeConstant = TRUE, oneHot = FALSE)
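For example, to request missForest imputation together with centering and scaling (a sketch using only arguments from the signature above):

```r
# Preprocessing settings: impute missing values, then center and scale
prep.set <- rtset.preprocess(impute = TRUE, impute.type = "missForest",
                             scale = TRUE, center = TRUE,
                             removeConstant = TRUE)
```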
rtset.decompose(decom = "ICA", k = 2, ...)
rtset.ADDT(max.depth = 2, learning.rate = 1, lin.type = "glmnet",
alpha = 0, lambda = 0.1, minobsinnode = 2, minobsinnode.lin = 20,
...)
rtset.GBM(interaction.depth = 2, shrinkage = 0.001, max.trees = 5000,
min.trees = 100, bag.fraction = 0.9, n.minobsinnode = 5,
grid.resample.rtset = rtset.resample("kfold", 5), ipw = TRUE,
upsample = FALSE, upsample.seed = NULL, ...)
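The grid.resample.rtset argument shows how these settings compose: the output of one rtset call feeds another. A sketch, assuming rtemis is loaded:

```r
# GBM tuning settings with 5-fold resampling for the grid search
gbm.set <- rtset.GBM(interaction.depth = 3, shrinkage = 0.01,
                     grid.resample.rtset = rtset.resample("kfold", 5))
```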
rtset.RANGER(n.trees = 1000, min.node.size = 1, mtry = NULL,
grid.resample.rtset = rtset.resample("kfold", 5), ipw = TRUE,
upsample = FALSE, upsample.seed = NULL, ...)
rtset.DN(hidden = 1, activation = NULL, learning.rate = 0.8,
momentum = 0.5, learningrate_scale = 1, output = NULL,
numepochs = 100, batchsize = NULL, hidden_dropout = 0,
visible_dropout = 0, ...)
rtset.MXN(n.hidden.nodes = NULL, output = NULL, activation = "relu",
ctx = mxnet::mx.cpu(), optimizer = "sgd",
initializer = mxnet::mx.init.Xavier(), batch.size = NULL,
momentum = 0.9, max.epochs = 2000, min.epochs = 25,
early.stop = "train", early.stop.n.steps = NULL,
early.stop.relativeVariance.threshold = NULL, learning.rate = NULL,
dropout = 0, dropout.before = 1, dropout.after = 0,
eval.metric = NULL, arg.params = NULL, mx.seed = NULL)
rtset.lincoef(method = c("glmnet", "cv.glmnet", "lm.ridge", "allSubsets",
"forwardStepwise", "backwardStepwise", "glm", "sgd", "solve"),
alpha = 0, lambda = 0.01, lambda.seq = NULL,
cv.glmnet.nfolds = 5, which.cv.glmnet.lambda = c("lambda.min",
"lambda.1se"), nbest = 1, nvmax = 8, sgd.model = "glm",
sgd.model.control = list(lambda1 = 0, lambda2 = 0),
sgd.control = list(method = "ai-sgd"))
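For instance, to estimate coefficients with cross-validated glmnet (a sketch using only the arguments listed in the signature above):

```r
# Linear coefficient settings: ridge via cv.glmnet, picking the 1-SE lambda
lin.set <- rtset.lincoef(method = "cv.glmnet", alpha = 0,
                         cv.glmnet.nfolds = 10,
                         which.cv.glmnet.lambda = "lambda.1se")
```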
rtset.MARS(hidden = 1, activation = NULL, learning.rate = 0.8,
momentum = 0.5, learningrate_scale = 1, output = NULL,
numepochs = 100, batchsize = NULL, hidden_dropout = 0,
visible_dropout = 0, ...)
resampler: String: Type of resampling to perform: "bootstrap", "kfold", "strat.boot", "strat.sub". Default = "strat.boot" if length(y) < 200, otherwise "strat.sub"
n.resamples: Integer: Number of training/testing sets required
stratify.var: Numeric vector (optional): Variable used for stratification. Defaults to y
train.p: Float (0, 1): Fraction of cases to assign to the training set for resampler = "strat.sub"
strat.n.bins: Integer: Number of groups to use for stratification for resampler = "strat.sub" or "strat.boot"
target.length: Integer: Number of cases for the training set for resampler = "strat.boot". Default = length(y)
seed: Integer (optional): Seed for the random number generator, to make output reproducible. See ?set.seed
verbose: Logical: If TRUE, print messages to screen
type: String: "fork" or "psock"
hosts: Character vector: For type = "psock": host names on which to run (macOS, Linux, Windows)
n.cores: Integer: Number of cores to use on localhost for type = "fork" (macOS, Linux only)
...: Additional arguments to be passed to parallel::makePSOCKcluster
n: Integer: Number of distinct colors to generate. If even, converted to n + 1. Default = 101
colors: String: Acts as a shortcut to defining lo, mid, etc. via a number of presets, e.g. "french", "penn", "grnblkred"
space: String: Which colorspace to use: "rgb" or "Lab". Default = "rgb". Recommendation: if mid is "white" or "black" (the default), use "rgb"; otherwise use "Lab"
lo: Color for the low end
lomid: Color for low-mid
mid: Color for the middle of the range, or "mean", which results in colorOp(c(lo, hi), "mean"). If mid = NA, only lo and hi are used to create the color gradient.
midhi: Color for mid-high
hi: Color for the high end
colorbar: Logical: If TRUE, create a vertical colorbar
cb.mar: Vector, length 4: Colorbar margins. Default = c(1, 1, 1, 1)
decom: String: Name of decomposer to use. Default = "ICA"
k: Integer: Number of dimensions to project to. Default = 2
max.depth: Integer: Maximum depth of the additive tree
learning.rate: Float: Learning rate
alpha: Float: alpha for lin.type = "glmnet" or "cv.glmnet". Default = 0
lambda: Float: lambda parameter for MASS::lm.ridge. Default = 0.1
minobsinnode: Integer: Minimum number of observations needed in a node before considering splitting
interaction.depth: [gS] Integer: Interaction depth
shrinkage: [gS] Float: Shrinkage (learning rate)
bag.fraction: [gS] Float (0, 1): Fraction of cases used to train each tree. Helps avoid overfitting. Default = 0.9
n.minobsinnode: [gS] Integer: Minimum number of observations allowed in a node
grid.resample.rtset: List: Output of rtset.resample defining gridSearchLearn parameters. Default = rtset.resample("kfold", 5)
ipw: Logical: If TRUE, apply inverse probability weighting (Classification only). Note: if weights are provided, ipw is not used. Default = TRUE
upsample: Logical: If TRUE, upsample cases to balance outcome classes (Classification only). Caution: upsample will randomly sample with replacement if the majority class is more than double the length of the class being upsampled, thereby introducing randomness
upsample.seed: Integer: If provided, used to set the seed during upsampling. Default = NULL (random seed)
n.trees: Integer: Initial number of trees to fit
min.node.size: [gS] Integer: Minimum node size
mtry: [gS] Integer: Number of features sampled randomly at each split. Defaults to the square root of the number of features for classification, and a third of the number of features for regression
activation: Character vector: Activation types to use: "relu", "sigmoid", "softrelu", "tanh". If shorter than the number of hidden layers, elements are recycled. See mxnet::mx.symbol.Activation
output: String: "Logistic" for binary classification, "Softmax" for classification of 2 or more classes, "Linear" for regression. Defaults to "Logistic" for a binary outcome, "Softmax" for 3+ classes, "LinearReg" for regression
n.hidden.nodes: Integer vector: Length must equal the number of hidden layers you wish to create
ctx: MXNET context: mxnet::mx.cpu() to use CPU(s) (define the number of cores using the n.cores argument), or mxnet::mx.gpu() to use a GPU. For multiple GPUs, provide a list, e.g. ctx = list(mxnet::mx.gpu(0), mxnet::mx.gpu(1)) to use two GPUs
max.epochs: Integer: Number of iterations for training
dropout: Float (0, 1): Probability of dropping nodes
dropout.before: Integer: Index of the hidden layer before which dropout should be applied
dropout.after: Integer: Index of the hidden layer after which dropout should be applied
eval.metric: String: Metric used for evaluation during training. Default = "rmse"
method: String: Method to use:
"glm": uses stats::lm.wfit;
"glmnet": uses glmnet::glmnet;
"cv.glmnet": uses glmnet::cv.glmnet;
"lm.ridge": uses MASS::lm.ridge;
"allSubsets": uses leaps::regsubsets with method = "exhaustive";
"forwardStepwise": uses leaps::regsubsets with method = "forward";
"backwardStepwise": uses leaps::regsubsets with method = "backward";
"sgd": uses sgd::sgd;
"solve": uses base::solve
lambda.seq: Float, vector: lambda sequence for glmnet and cv.glmnet. Default = NULL
cv.glmnet.nfolds: Integer: Number of folds for cv.glmnet
which.cv.glmnet.lambda: String: Which lambda to pick from cv.glmnet: "lambda.min", the lambda that gives the minimum cross-validated error, or "lambda.1se", the largest lambda for which the error is within 1 standard error of the minimum
nbest: Integer: For method = "allSubsets", number of subsets of each size to record. Default = 1
nvmax: Integer: For method = "allSubsets", maximum number of subsets to examine. Default = 8
sgd.model: String: Model to use for method = "sgd". Default = "glm"
sgd.model.control: List: model.control list to pass to sgd::sgd
sgd.control: List: sgd.control list to pass to sgd::sgd
Value: A list of parameters to pass to the corresponding rtemis function.