Train a Neural Network using keras and tensorflow
s.TFN(x, y = NULL, x.test = NULL, y.test = NULL, x.valid = NULL,
y.valid = NULL, upsample = FALSE, upsample.seed = NULL,
net = NULL, n.hidden.nodes = NULL,
initializer = c("glorot_uniform", "glorot_normal", "he_uniform",
"he_normal", "lecun_uniform", "lecun_normal", "random_uniform",
"random_normal", "variance_scaling", "truncated_normal", "orthogonal",
"zeros", "ones", "constant"), initializer.seed = NULL, dropout = 0,
activation = c("relu", "selu", "elu", "sigmoid", "hard_sigmoid",
"tanh", "exponential", "linear", "softmax", "softplus", "softsign"),
l1 = 0, l2 = 0, batch.normalization = TRUE, output = NULL,
loss = NULL, optimizer = c("rmsprop", "adadelta", "adagrad", "adam",
"adamax", "nadam", "sgd"), learning.rate = NULL, metric = NULL,
epochs = 50, batch.size = NULL, validation.split = 0.2,
callback = keras::callback_early_stopping(patience = 150),
scale = TRUE, x.name = NULL, y.name = NULL, print.plot = TRUE,
print.error.plot = NULL, rtlayout.mat = c(2, 1),
plot.fitted = NULL, plot.predicted = NULL,
plot.theme = getOption("rt.fit.theme", "lightgrid"), question = NULL,
verbose = TRUE, verbose.checkpoint = FALSE, outdir = NULL,
save.mod = ifelse(!is.null(outdir), TRUE, FALSE), ...)
x: Numeric vector or matrix / data frame of features, i.e. independent variables
y: Numeric vector of outcome, i.e. dependent variable
x.test: Numeric vector or matrix / data frame of testing set features. Columns must correspond to columns in x
y.test: Numeric vector of testing set outcome
upsample: Logical: If TRUE, upsample cases to balance outcome classes (Classification only). Caution: upsampling will randomly sample with replacement if the majority class is more than double the length of the class being upsampled, thereby introducing randomness
upsample.seed: Integer: If provided, will be used to set the seed during upsampling. Default = NULL (random seed)
n.hidden.nodes: Integer vector: Number of nodes per hidden layer. Length must equal the number of hidden layers you wish to create; can be zero (~GLM)
initializer: String: Initializer to use for each layer: "glorot_uniform", "glorot_normal", "he_uniform", "he_normal", "lecun_uniform", "lecun_normal", "random_uniform", "random_normal", "variance_scaling", "truncated_normal", "orthogonal", "zeros", "ones", "constant". Glorot is also known as Xavier initialization. Default = "glorot_uniform"
initializer.seed: Integer: Seed to use for each initializer for reproducibility. Default = NULL
dropout: Float, vector, (0, 1): Probability of dropping nodes. Can be a vector of length equal to the number of layers, otherwise will be recycled. Default = 0
activation: String vector: Activation type to use: "relu", "selu", "elu", "sigmoid", "hard_sigmoid", "tanh", "exponential", "linear", "softmax", "softplus", "softsign". Default = "relu"
batch.normalization: Logical: If TRUE, batch normalize after each hidden layer. Default = TRUE
output: String: Activation to use for output layer. Can be any as in activation. Default = "linear" for Regression, "sigmoid" for binary Classification, "softmax" for multiclass
loss: String: Loss to use. Default = "mean_squared_error" for Regression, "binary_crossentropy" for binary Classification, "sparse_categorical_crossentropy" for multiclass
optimizer: String: Optimization to use: "rmsprop", "adadelta", "adagrad", "adam", "adamax", "nadam", "sgd". Default = "rmsprop"
learning.rate: Float: Learning rate. Defaults depend on the optimizer used: rmsprop = .001, adadelta = 1, adagrad = .01, adam = .001, adamax = .002, nadam = .002, sgd = .1
metric: String: Metric used for evaluation during training. Default = "mse" for Regression, "accuracy" for Classification
epochs: Integer: Number of epochs. Default = 50
batch.size: Integer: Batch size. Default = N of cases
validation.split: Float (0, 1): Proportion of training data to use for validation. Default = .2
callback: Function to be called by keras during fitting. Default = keras::callback_early_stopping(patience = 150) for early stopping
scale: Logical: If TRUE, scale features before training. Column means and standard deviations will be saved in the rtMod$extra field to allow scaling ahead of prediction on new data. Default = TRUE
x.name: Character: Name for feature set
y.name: Character: Name for outcome
print.plot: Logical: If TRUE, produce plot using mplot3. Takes precedence over plot.fitted and plot.predicted
plot.fitted: Logical: If TRUE, plot True (y) vs Fitted
plot.predicted: Logical: If TRUE, plot True (y.test) vs Predicted. Requires x.test and y.test
plot.theme: String: "zero", "dark", "box", "darkbox"
question: String: The question you are attempting to answer with this model, in plain language
verbose: Logical: If TRUE, print summary to screen
outdir: Path to output directory. If defined, will save the Predicted vs. True plot, if available, as well as full model output, if save.mod is TRUE
save.mod: Logical: If TRUE, save all output as RDS file in outdir. save.mod is TRUE by default if an outdir is defined. If set to TRUE and no outdir is defined, outdir defaults to paste0("./s.", mod.name)
...: Additional parameters
For more information on arguments and hyperparameters, see https://keras.rstudio.com/ and https://keras.io/. It is important to define the network structure and adjust hyperparameters; you cannot expect the defaults to work on every dataset.
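As a minimal sketch of defining such a structure (illustrative only: the hidden-layer size, optimizer settings, and epochs below are arbitrary choices for a toy problem, not recommendations):

```r
# Toy regression example; assumes the rtemis and keras packages are
# installed and a working TensorFlow backend is configured.
library(rtemis)

set.seed(2020)
x <- rnorm(200)
y <- x^3 + rnorm(200)

# One hidden layer with 10 nodes (n.hidden.nodes of length 1)
mod <- s.TFN(x, y,
             n.hidden.nodes = 10,
             activation = "relu",
             optimizer = "adam",
             learning.rate = .001,
             epochs = 50,
             print.plot = FALSE)
```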
elevate for external cross-validation
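External cross-validation with elevate might look like the following (a sketch; the mod identifier "TFN" and the pass-through of s.TFN arguments are assumptions to verify against elevate's documentation for your rtemis version):

```r
# Hypothetical sketch: externally cross-validate a TFN model via
# rtemis::elevate. Argument names assumed; check ?elevate.
library(rtemis)

x <- rnorm(200)
y <- x^3 + rnorm(200)

cv.mod <- elevate(x, y,
                  mod = "TFN",          # model identifier (assumed)
                  n.hidden.nodes = 10,  # passed through to s.TFN
                  print.plot = FALSE)
```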
Other Supervised Learning: s.ADABOOST, s.ADDTREE, s.BART, s.BAYESGLM, s.BRUTO, s.C50, s.CART, s.CTREE, s.DA, s.ET, s.EVTREE, s.GAM.default, s.GAM.formula, s.GAMSEL, s.GAM, s.GBM3, s.GBM, s.GLMNET, s.GLM, s.GLS, s.H2ODL, s.H2OGBM, s.H2ORF, s.IRF, s.KNN, s.LDA, s.LM, s.MARS, s.MLRF, s.MXN, s.NBAYES, s.NLA, s.NLS, s.NW, s.POLYMARS, s.PPR, s.PPTREE, s.QDA, s.QRNN, s.RANGER, s.RFSRC, s.RF, s.SGD, s.SPLS, s.SVM, s.XGBLIN, s.XGB
Other Deep Learning: d.H2OAE, p.MXINCEPTION, s.H2ODL, s.MXN