Fit a linear model and validate it. Options include base lm(), a robust linear model using
MASS::rlm(), generalized least squares using nlme::gls(), or polynomial regression
using stats::poly() to transform features.
s.LM(x, y = NULL, x.test = NULL, y.test = NULL, x.name = NULL,
y.name = NULL, weights = NULL, intercept = TRUE, robust = FALSE,
gls = FALSE, polynomial = FALSE, poly.d = 3, poly.raw = FALSE,
print.plot = TRUE, plot.fitted = NULL, plot.predicted = NULL,
plot.theme = getOption("rt.fit.theme", "lightgrid"),
na.action = na.exclude, question = NULL, rtclass = NULL,
verbose = TRUE, trace = 0, outdir = NULL,
save.mod = ifelse(!is.null(outdir), TRUE, FALSE), ...)

x: Numeric vector or matrix / data frame of features, i.e. independent variables
y: Numeric vector of outcome, i.e. dependent variable
x.test: Numeric vector or matrix / data frame of testing set features.
Columns must correspond to columns in x
y.test: Numeric vector of testing set outcome
x.name: Character: Name for feature set
y.name: Character: Name for outcome
weights: Numeric vector: Weights for cases. For classification, weights takes precedence
over ipw: if weights are provided, ipw is not used, so leave weights = NULL
when setting ipw = TRUE. Default = NULL
intercept: Logical: If TRUE, fit an intercept term. Default = TRUE
robust: Logical: If TRUE, use MASS::rlm() instead of base lm(). Default = FALSE
gls: Logical: If TRUE, use nlme::gls(). Default = FALSE
polynomial: Logical: If TRUE, run lm on poly(x, poly.d) (creates orthogonal polynomials). Default = FALSE
poly.d: Integer: Degree of polynomial. Default = 3
poly.raw: Logical: If TRUE, use raw polynomials. Default = FALSE, which should rarely need to be changed
print.plot: Logical: If TRUE, produce plot using mplot3.
Takes precedence over plot.fitted and plot.predicted
plot.fitted: Logical: If TRUE, plot True (y) vs Fitted
plot.predicted: Logical: If TRUE, plot True (y.test) vs Predicted.
Requires x.test and y.test
plot.theme: String: "zero", "dark", "box", "darkbox"
na.action: How to handle missing values. See ?na.fail. Default = na.exclude
question: String: The question you are attempting to answer with this model, in plain language
rtclass: String: Class type to use. "S3", "S4", "RC", "R6"
verbose: Logical: If TRUE, print summary to screen. Default = TRUE
trace: Integer: If higher than 0, will print more information to the console. Default = 0
outdir: Path to output directory.
If defined, will save Predicted vs. True plot, if available,
as well as full model output, if save.mod is TRUE
save.mod: Logical: If TRUE, save all output as RDS file in outdir.
save.mod is TRUE by default if an outdir is defined. If set to TRUE and no outdir
is defined, outdir defaults to paste0("./s.", mod.name)
...: Additional arguments to be passed to MASS::rlm if robust = TRUE
or nlme::gls if gls = TRUE
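The difference between orthogonal and raw polynomial features (the polynomial, poly.d, and poly.raw arguments) can be seen with stats::poly directly. The following is a minimal base R sketch, not the s.LM implementation; the simulated data and object names are assumptions made for illustration.

```r
# Sketch only: calls stats::poly and lm directly rather than s.LM
set.seed(1)
x <- rnorm(100)
y <- 0.6 * x - 0.3 * x^2 + 12 + rnorm(100) / 2  # hypothetical data

# Orthogonal polynomial basis of degree 3 (analogous to poly.raw = FALSE)
fit_orth <- lm(y ~ poly(x, degree = 3))

# Raw powers x, x^2, x^3 (analogous to poly.raw = TRUE)
fit_raw <- lm(y ~ poly(x, degree = 3, raw = TRUE))

# Both bases span the same column space, so the fitted values agree
max_diff <- max(abs(fitted(fit_orth) - fitted(fit_raw)))

# Orthogonal columns are uncorrelated; raw powers typically are not
cor_orth <- cor(poly(x, 3))[1, 3]
cor_raw <- cor(poly(x, 3, raw = TRUE))[1, 3]
```

Because the fitted values are the same either way, the choice mainly affects the numerical conditioning and the interpretation of individual coefficients, which is consistent with the advice to leave poly.raw at its default.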
GLS can be useful in place of a standard linear model when the residuals are correlated
and/or have unequal variances.
Warning: nlme's implementation is buggy: predict() fails because of environment problems,
so predicted values cannot be obtained if x.test is provided.
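As an illustration of the correlated-residuals case, here is a minimal sketch that calls nlme::gls directly (not through s.LM) on simulated data with AR(1) errors. The data, the column names, and the choice of corAR1 as correlation structure are assumptions made for this example only.

```r
library(nlme)

set.seed(2)
n <- 200
x <- rnorm(n)
# Hypothetical serially correlated errors: AR(1) with coefficient 0.7
e <- as.numeric(arima.sim(model = list(ar = 0.7), n = n))
dat <- data.frame(x = x, y = 0.6 * x + 12 + e, time_idx = seq_len(n))

# Ordinary least squares ignores the correlation among residuals
fit_ols <- lm(y ~ x, data = dat)

# GLS models the residual correlation explicitly with an AR(1) structure
fit_gls <- gls(y ~ x, data = dat, correlation = corAR1(form = ~ time_idx))

coef(fit_ols)
coef(fit_gls)

# Estimated AR(1) parameter on its natural (constrained) scale
phi_hat <- coef(fit_gls$modelStruct$corStruct, unconstrained = FALSE)
```

The sketch stops at coefficient extraction and does not call predict(), in line with the warning above.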
robust = TRUE trains a robust linear model using MASS::rlm.
gls = TRUE trains a generalized least squares model using nlme::gls.
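To show why a robust fit can be preferable, the sketch below compares base lm() with MASS::rlm() called directly on data containing a few gross outliers. It does not go through s.LM; the contamination scheme and object names are assumptions made for illustration.

```r
library(MASS)

set.seed(3)
x <- rnorm(100)
y <- 0.6 * x + 12 + rnorm(100) / 2  # true slope is 0.6

# Hypothetical contamination: shift the outcome of the 5 largest-x cases
idx <- order(x, decreasing = TRUE)[1:5]
y[idx] <- y[idx] - 20

fit_lm <- lm(y ~ x)    # ordinary least squares
fit_rlm <- rlm(y ~ x)  # M-estimation with Huber psi (rlm's default)

slope_lm <- coef(fit_lm)[["x"]]
slope_rlm <- coef(fit_rlm)[["x"]]
```

The robust fit downweights cases with large residuals, so its slope should stay closer to the value used to simulate the data than the least-squares slope does.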
elevate for external cross-validation
Other Supervised Learning: s.ADABOOST,
s.ADDTREE, s.BART,
s.BAYESGLM, s.BRUTO,
s.C50, s.CART,
s.CTREE, s.DA,
s.ET, s.EVTREE,
s.GAM.default, s.GAM.formula,
s.GAMSEL, s.GAM,
s.GBM3, s.GBM,
s.GLMNET, s.GLM,
s.GLS, s.H2ODL,
s.H2OGBM, s.H2ORF,
s.IRF, s.KNN,
s.LDA, s.MARS,
s.MLRF, s.MXN,
s.NBAYES, s.NLA,
s.NLS, s.NW,
s.POLYMARS, s.PPR,
s.PPTREE, s.QDA,
s.QRNN, s.RANGER,
s.RFSRC, s.RF,
s.SGD, s.SPLS,
s.SVM, s.TFN,
s.XGBLIN, s.XGB
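# NOT RUN {
library(rtemis)
# Simulate a linear relationship with Gaussian noise
x <- rnorm(100)
y <- .6 * x + 12 + rnorm(100)/2
# Fit a linear model of y on x
mod <- s.LM(x, y)
# }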
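The validation step that x.test and y.test make possible can be sketched in base R as follows. This is an assumption about what validating on a testing set amounts to (fit on the training cases, predict on the held-out cases, compare to the true outcome); it is not taken from the s.LM source, and the split and error metrics are chosen only for illustration.

```r
# Sketch only: base lm() and predict(), not s.LM
set.seed(4)
x <- rnorm(150)
y <- 0.6 * x + 12 + rnorm(150) / 2

# Hypothetical split: first 100 cases for training, last 50 for testing
dat_train <- data.frame(x = x[1:100], y = y[1:100])
dat_test <- data.frame(x = x[101:150], y = y[101:150])

# Fit on the training set only
fit <- lm(y ~ x, data = dat_train)

# Predict on the testing set and compare to the true outcome
predicted <- predict(fit, newdata = dat_test)
mse_test <- mean((dat_test$y - predicted)^2)
rsq_test <- 1 - sum((dat_test$y - predicted)^2) /
  sum((dat_test$y - mean(dat_test$y))^2)
```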