mxnet
[C, R] Train a Neural Network using mxnet with optional early stopping
s.MXN(x, y = NULL, x.test = NULL, y.test = NULL, x.valid = NULL,
y.valid = NULL, upsample = FALSE, upsample.seed = NULL,
net = NULL, n.hidden.nodes = NULL, output = NULL,
ctx = mxnet::mx.cpu(), initializer = mxnet::mx.init.Xavier(),
batch.normalization = TRUE, l2.normalization = FALSE,
activation = "relu", optimizer = "adadelta", batch.size = NULL,
momentum = 0.9, max.epochs = 1000, min.epochs = 25,
early.stop = c("train", "valid"), early.stop.absolute.threshold = NA,
early.stop.relative.threshold = NA,
early.stop.relativeVariance.threshold = NULL,
early.stop.n.steps = NULL, learning.rate = NULL, dropout = 0,
dropout.before = 1, dropout.after = 0, eval.metric = NULL,
minimize = NULL, arg.params = NULL, mx.seed = NULL,
x.name = NULL, y.name = NULL, plot.graphviz = FALSE,
print.plot = TRUE, print.error.plot = NULL, rtlayout.mat = c(2, 1),
plot.fitted = NULL, plot.predicted = NULL,
plot.theme = getOption("rt.fit.theme", "lightgrid"), question = NULL,
verbose = TRUE, verbose.mxnet = TRUE, verbose.checkpoint = FALSE,
outdir = NULL, n.cores = rtCores,
save.mod = ifelse(!is.null(outdir), TRUE, FALSE), ...)
Numeric vector or matrix / data frame of features, i.e. independent variables
Numeric vector of outcome, i.e. dependent variable
Numeric vector or matrix / data frame of testing set features. Columns must correspond to columns in x
Numeric vector of testing set outcome
Logical: If TRUE, upsample cases to balance outcome classes (for Classification only). Caution: upsampling will randomly sample with replacement if the majority class is more than double the size of the class being upsampled, thereby introducing randomness
Integer: If provided, will be used to set the seed during upsampling. Default = NULL (random seed)
MXNET Symbol: Provide a previously defined network. The logger does not currently work in this case, so early stopping cannot be applied
Integer vector: Length must be equal to the number of hidden layers you wish to create
String: "Logistic" for binary classification, "Softmax" for classification of 2 or more classes, "Linear" for Regression. Defaults to "Logistic" for binary outcome, "Softmax" for 3+ classes, "LinearReg" for regression.
MXNET context: mxnet::mx.cpu() to use CPU(s). Define the number of cores using the n.cores argument.
mxnet::mx.gpu() to use GPU. For multiple GPUs, provide a list, e.g.
ctx = list(mxnet::mx.gpu(0), mxnet::mx.gpu(1)) to use two GPUs.
Logical: If TRUE, batch normalize before activation. Default = TRUE
Logical: If TRUE, apply L2 normalization after fully connected step. Default = FALSE
String vector: Activation types to use: 'relu', 'sigmoid', 'softrelu', 'tanh'.
If length < n of hidden layers, elements are recycled. See mxnet::mx.symbol.Activation
Integer: Number of iterations for training.
Float: learning rate
Float (0, 1): Probability of dropping nodes
Integer: Index of hidden layer before which dropout should be applied
Integer: Index of hidden layer after which dropout should be applied
String: Metric used for evaluation during training. Default: "rmse"
Character: Name for feature set
Character: Name for outcome
Logical: if TRUE, plot the network structure using graphviz
Logical: if TRUE, produce plot using mplot3. Takes precedence over plot.fitted and plot.predicted
Logical: if TRUE, plot True (y) vs Fitted
Logical: if TRUE, plot True (y.test) vs Predicted. Requires x.test and y.test
String: "zero", "dark", "box", "darkbox"
String: the question you are attempting to answer with this model, in plain language.
Logical: If TRUE, print summary to screen.
Path to output directory. If defined, will save Predicted vs. True plot, if available, as well as full model output, if save.mod is TRUE
Integer: Number of cores to use. Caution: Only set to >1 if you're sure MXNET is not already using multiple cores
Logical: If TRUE, save all output as RDS file in outdir. save.mod is TRUE by default if an outdir is defined. If set to TRUE, and no outdir is defined, outdir defaults to paste0("./s.", mod.name)
Additional parameters to be passed to mxnet::mx.model.FeedForward.create
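Two of the argument behaviors described above, class upsampling and recycling of activation, can be sketched in base R. This is only an illustration under stated assumptions: upsample_indices is a hypothetical helper that is not part of rtemis, and the actual internals of s.MXN may differ. The sketch shows why sampling with replacement becomes necessary once the majority class is more than double the size of the class being upsampled.

```r
# Hypothetical helper, not part of rtemis: return case indices that bring
# every class up to the size of the largest class.
upsample_indices <- function(y, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  y <- droplevels(as.factor(y))
  idx.by.class <- split(seq_along(y), y)
  n.max <- max(lengths(idx.by.class))
  unlist(lapply(idx.by.class, function(idx) {
    n.extra <- n.max - length(idx)
    # Replacement is only needed when the shortfall exceeds the class size,
    # i.e. when the majority class is more than double this class
    extra <- idx[sample.int(length(idx), n.extra,
                            replace = n.extra > length(idx))]
    c(idx, extra)
  }), use.names = FALSE)
}

y <- factor(c(rep("a", 8), rep("b", 3)))
idx <- upsample_indices(y, seed = 2024)
table(y[idx])  # both classes now have 8 cases

# Recycling of activation across hidden layers (assumed to behave like rep_len)
n.hidden.nodes <- c(64, 32, 16)
activation <- c("relu", "tanh")
activation.per.layer <- rep_len(activation, length(n.hidden.nodes))
activation.per.layer  # "relu" "tanh" "relu"
```

Setting the seed argument of the helper mirrors the role of upsample.seed: the same seed yields the same upsampled indices.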
Early stopping is considered after training has taken place for min.epochs epochs.
After that point, early stopping is controlled by three criteria:
an absolute threshold (early.stop.absolute.threshold),
a relative threshold (early.stop.relative.threshold),
or a relative variance across a set number of steps (early.stop.relativeVariance.threshold along with
early.stop.n.steps).
By default (if you change none of the early.stop arguments), early stopping looks at training error
and stops when the relative variance of the loss over the last 24 steps (classification) or 12 steps (regression)
is lower than 5e-06 (classification) or lower than 5e-03 (regression). To turn early stopping off, set all
early stopping criteria to NA.
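The relative variance criterion can be sketched as follows. This is a minimal illustration, not the rtemis implementation: should_stop is a hypothetical helper, and defining relative variance as the variance of the last n.steps losses divided by their mean is an assumption. The defaults are the regression values quoted above.

```r
# Hypothetical helper, not part of rtemis.
# Assumption: relative variance = var(last losses) / |mean(last losses)|
should_stop <- function(loss, min.epochs = 25, n.steps = 12,
                        threshold = 5e-03) {
  # No early stopping before min.epochs, or before n.steps losses exist
  if (length(loss) < max(min.epochs, n.steps)) return(FALSE)
  last <- tail(loss, n.steps)
  rel.var <- var(last) / abs(mean(last))
  is.finite(rel.var) && rel.var < threshold
}

decreasing <- seq(10, 1, length.out = 40)  # loss still falling steadily
plateau <- c(decreasing, rep(1, 12))       # loss flat over the last 12 steps

should_stop(decreasing[1:10])  # FALSE: fewer than min.epochs epochs
should_stop(decreasing)        # FALSE: loss is still changing
should_stop(plateau)           # TRUE: no variation over the last 12 steps
```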
It is important to tune the learning rate and adjust max.epochs accordingly, depending on the learning type
(Classification vs. Regression) and the specific dataset. Defaults cannot be expected to work on all problems.
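A possible call for a small regression problem is sketched below. The argument names come from the usage above; the synthetic data and all argument values are illustrative assumptions, not recommended settings. The call is guarded so that nothing runs unless rtemis (with s.MXN exported) and mxnet are both installed.

```r
# Synthetic regression data (illustrative only)
set.seed(2024)
x <- matrix(rnorm(500 * 5), nrow = 500, ncol = 5)
y <- x[, 1] + 0.5 * x[, 2]^2 + rnorm(500, sd = 0.3)
train <- 1:400

# Run only if both packages are available and s.MXN is exported
if (requireNamespace("rtemis", quietly = TRUE) &&
    requireNamespace("mxnet", quietly = TRUE) &&
    "s.MXN" %in% getNamespaceExports("rtemis")) {
  mod <- rtemis::s.MXN(x[train, ], y[train],
                       x.test = x[-train, ], y.test = y[-train],
                       n.hidden.nodes = c(16, 8),  # two hidden layers
                       max.epochs = 200,
                       min.epochs = 25,
                       early.stop = "train",
                       mx.seed = 2024,
                       print.plot = FALSE)
}
```

If training stops at max.epochs without triggering early stopping, or stops right at min.epochs, that is a hint to revisit learning.rate and max.epochs for the dataset at hand.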
elevate for external cross-validation
Other Supervised Learning: s.ADABOOST, s.ADDTREE, s.BART, s.BAYESGLM, s.BRUTO, s.C50, s.CART, s.CTREE,
s.DA, s.ET, s.EVTREE, s.GAM.default, s.GAM.formula, s.GAMSEL, s.GAM, s.GBM3, s.GBM, s.GLMNET, s.GLM,
s.GLS, s.H2ODL, s.H2OGBM, s.H2ORF, s.IRF, s.KNN, s.LDA, s.LM, s.MARS, s.MLRF, s.NBAYES, s.NLA, s.NLS,
s.NW, s.POLYMARS, s.PPR, s.PPTREE, s.QDA, s.QRNN, s.RANGER, s.RFSRC, s.RF, s.SGD, s.SPLS, s.SVM, s.TFN,
s.XGBLIN, s.XGB
Other Deep Learning: d.H2OAE, p.MXINCEPTION, s.H2ODL, s.TFN