s.MXN: Neural Network with `mxnet` [C, R]

Description

Train a Neural Network using mxnet with optional early stopping

Usage

s.MXN(x, y = NULL, x.test = NULL, y.test = NULL, x.valid = NULL,
  y.valid = NULL, upsample = FALSE, upsample.seed = NULL,
  net = NULL, n.hidden.nodes = NULL, output = NULL,
  ctx = mxnet::mx.cpu(), initializer = mxnet::mx.init.Xavier(),
  batch.normalization = TRUE, l2.normalization = FALSE,
  activation = "relu", optimizer = "adadelta", batch.size = NULL,
  momentum = 0.9, max.epochs = 1000, min.epochs = 25,
  early.stop = c("train", "valid"), early.stop.absolute.threshold = NA,
  early.stop.relative.threshold = NA,
  early.stop.relativeVariance.threshold = NULL,
  early.stop.n.steps = NULL, learning.rate = NULL, dropout = 0,
  dropout.before = 1, dropout.after = 0, eval.metric = NULL,
  minimize = NULL, arg.params = NULL, mx.seed = NULL,
  x.name = NULL, y.name = NULL, plot.graphviz = FALSE,
  print.plot = TRUE, print.error.plot = NULL, rtlayout.mat = c(2, 1),
  plot.fitted = NULL, plot.predicted = NULL,
  plot.theme = getOption("rt.fit.theme", "lightgrid"), question = NULL,
  verbose = TRUE, verbose.mxnet = TRUE, verbose.checkpoint = FALSE,
  outdir = NULL, n.cores = rtCores,
  save.mod = ifelse(!is.null(outdir), TRUE, FALSE), ...)

Arguments

x

Numeric vector or matrix / data frame of features, i.e. independent variables

y

Numeric vector of outcome, i.e. dependent variable

x.test

Numeric vector or matrix / data frame of testing set features. Columns must correspond to columns in x

y.test

Numeric vector of testing set outcome

upsample

Logical: If TRUE, upsample cases to balance outcome classes (for Classification only). Caution: upsample will randomly sample with replacement if the length of the majority class is more than double the length of the class you are upsampling, thereby introducing randomness

upsample.seed

Integer: If provided, will be used to set the seed during upsampling. Default = NULL (random seed)

net

MXNET Symbol: Provide a previously defined network. The logger does not currently work in this case, so early stopping cannot be applied

n.hidden.nodes

Integer vector: Length must be equal to the number of hidden layers you wish to create

output

String: "Logistic" for binary classification, "Softmax" for classification of 2 or more classes, "Linear" for Regression. Defaults to "Logistic" for binary outcome, "Softmax" for 3+ classes, "LinearReg" for regression.

ctx

MXNET context: mxnet::mx.cpu() to use CPU(s). Define the number of cores using the n.cores argument. mxnet::mx.gpu() to use GPU. For multiple GPUs, provide a list, e.g. ctx = list(mxnet::mx.gpu(0), mxnet::mx.gpu(1)) to use two GPUs.

batch.normalization

Logical: If TRUE, batch normalize before activation. Default = TRUE

l2.normalization

Logical: If TRUE, apply L2 normalization after fully connected step. Default = FALSE

activation

String vector: Activation types to use: 'relu', 'sigmoid', 'softrelu', 'tanh'. If length < n of hidden layers, elements are recycled. See mxnet::mx.symbol.Activation

max.epochs

Integer: Number of iterations for training.

learning.rate

Float: learning rate

dropout

Float (0, 1): Probability of dropping nodes

dropout.before

Integer: Index of hidden layer before which dropout should be applied

dropout.after

Integer: Index of hidden layer after which dropout should be applied

eval.metric

String: Metric used for evaluation during training. Default: "rmse"

x.name

Character: Name for feature set

y.name

Character: Name for outcome

plot.graphviz

Logical: if TRUE, plot the network structure using graphviz

print.plot

Logical: if TRUE, produce plot using mplot3. Takes precedence over plot.fitted and plot.predicted

plot.fitted

Logical: if TRUE, plot True (y) vs Fitted

plot.predicted

Logical: if TRUE, plot True (y.test) vs Predicted. Requires x.test and y.test

plot.theme

String: "zero", "dark", "box", "darkbox"

question

String: the question you are attempting to answer with this model, in plain language.

verbose

Logical: If TRUE, print summary to screen.

outdir

Path to output directory. If defined, will save Predicted vs. True plot, if available, as well as full model output, if save.mod is TRUE

n.cores

Integer: Number of cores to use. Caution: Only set to >1 if you're sure MXNET is not already using multiple cores

save.mod

Logical: If TRUE, save all output as RDS file in outdir. save.mod is TRUE by default if an outdir is defined. If set to TRUE, and no outdir is defined, outdir defaults to paste0("./s.", mod.name)

...

Additional parameters to be passed to mxnet::mx.model.FeedForward.create
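
As a rough illustration of how the layer-wise arguments line up, the base R sketch below pairs each hidden layer defined by n.hidden.nodes with an activation, recycling a shorter activation vector. It does not call s.MXN or mxnet. The helper name layer_spec is hypothetical, and the use of rep_len() is an assumption that "recycled" follows R's usual recycling rule.

```r
# Hypothetical helper (not part of rtemis or mxnet): pair each hidden layer
# with an activation, recycling the activation vector if it is too short.
layer_spec <- function(n.hidden.nodes, activation = "relu") {
  n.layers <- length(n.hidden.nodes)
  data.frame(
    layer = seq_len(n.layers),
    nodes = n.hidden.nodes,
    # rep_len() repeats the vector up to the requested length
    activation = rep_len(activation, n.layers),
    stringsAsFactors = FALSE
  )
}

# Three hidden layers, two activations: the third layer reuses "relu"
spec <- layer_spec(n.hidden.nodes = c(64, 32, 16),
                   activation = c("relu", "tanh"))
print(spec)
```

Under this assumption, passing a single activation such as "relu" applies it to every hidden layer.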

Details

Early stopping is considered only after training has run for min.epochs epochs. After that point, it is controlled by three criteria: an absolute threshold (early.stop.absolute.threshold), a relative threshold (early.stop.relative.threshold), or a relative variance across a set number of steps (early.stop.relativeVariance.threshold along with early.stop.n.steps). By default (if you change none of the early.stop arguments), early stopping looks at training error and stops when the relative variance of the loss over the last 24 steps (classification) or 12 steps (regression) is lower than 5e-06 (classification) or 5e-03 (regression). To turn early stopping off, set all early stopping criteria to NA.

It is important to tune the learning rate and adjust max.epochs accordingly, depending on the learning type (Classification vs. Regression) and the specific dataset. Defaults cannot be expected to work on all problems.
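
The relative variance criterion described above can be sketched in base R as follows. This is not the rtemis implementation: the helper name should_stop is hypothetical, and since the formula is not given here, the sketch assumes that "relative variance" means the variance of the last n.steps loss values divided by their mean. The defaults mirror the regression values stated above (12 steps, threshold 5e-03, min.epochs = 25).

```r
# Hypothetical helper (not the rtemis implementation): decide whether to stop
# based on the relative variance of the most recent loss values.
# ASSUMPTION: relative variance is computed as var(x) / mean(x).
should_stop <- function(loss, min.epochs = 25, n.steps = 12,
                        threshold = 5e-03) {
  epoch <- length(loss)
  # Never stop before min.epochs, or before n.steps values are available
  if (epoch < min.epochs || epoch < n.steps) return(FALSE)
  # An NA threshold switches this criterion off
  if (is.na(threshold)) return(FALSE)
  last <- tail(loss, n.steps)
  rel.var <- var(last) / mean(last)
  rel.var < threshold
}

# Simulated regression loss that decays and then flattens out
loss <- 1 + exp(-seq(0, 10, length.out = 60))

stop_at_20 <- should_stop(loss[1:20])           # fewer than min.epochs
stop_at_60 <- should_stop(loss)                 # loss has flattened
stop_off   <- should_stop(loss, threshold = NA) # criterion disabled
```

In this sketch, an NA threshold disables the check, which parallels setting the early stopping criteria of s.MXN to NA.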
