Fit a linear model and validate it. Options include base lm(), a robust linear model using MASS::rlm(), generalized least squares using nlme::gls(), or polynomial regression using stats::poly() to transform features.
s.LM(x, y = NULL, x.test = NULL, y.test = NULL, x.name = NULL,
y.name = NULL, weights = NULL, intercept = TRUE, robust = FALSE,
gls = FALSE, polynomial = FALSE, poly.d = 3, poly.raw = FALSE,
print.plot = TRUE, plot.fitted = NULL, plot.predicted = NULL,
plot.theme = getOption("rt.fit.theme", "lightgrid"),
na.action = na.exclude, question = NULL, rtclass = NULL,
verbose = TRUE, trace = 0, outdir = NULL,
save.mod = ifelse(!is.null(outdir), TRUE, FALSE), ...)
x
  Numeric vector or matrix / data frame of features, i.e. independent variables
y
  Numeric vector of outcome, i.e. dependent variable
x.test
  Numeric vector or matrix / data frame of testing set features. Columns must correspond to columns in x
y.test
  Numeric vector of testing set outcome
x.name
  Character: Name for feature set
y.name
  Character: Name for outcome
weights
  Numeric vector: Weights for cases. For classification, weights takes precedence over ipw, therefore set weights = NULL if using ipw. Note: If weights are provided, ipw is not used. Leave NULL if setting ipw = TRUE. Default = NULL
intercept
  Logical: If TRUE, fit an intercept term. Default = TRUE
robust
  Logical: If TRUE, use MASS::rlm() instead of base lm()
gls
  Logical: If TRUE, use nlme::gls
polynomial
  Logical: If TRUE, run lm on poly(x, poly.d) (creates orthogonal polynomials)
poly.d
  Integer: Degree of polynomial
poly.raw
  Logical: If TRUE, use raw polynomials. The default, FALSE, should not normally be changed
print.plot
  Logical: If TRUE, produce plot using mplot3. Takes precedence over plot.fitted and plot.predicted
plot.fitted
  Logical: If TRUE, plot True (y) vs Fitted
plot.predicted
  Logical: If TRUE, plot True (y.test) vs Predicted. Requires x.test and y.test
plot.theme
  String: "zero", "dark", "box", "darkbox"
na.action
  How to handle missing values. See ?na.fail
question
  String: The question you are attempting to answer with this model, in plain language
rtclass
  String: Class type to use. "S3", "S4", "RC", "R6"
verbose
  Logical: If TRUE, print summary to screen
trace
  Integer: If higher than 0, will print more information to the console. Default = 0
outdir
  Path to output directory. If defined, will save Predicted vs. True plot, if available, as well as full model output, if save.mod is TRUE
save.mod
  Logical: If TRUE, save all output as RDS file in outdir. save.mod is TRUE by default if an outdir is defined. If set to TRUE, and no outdir is defined, outdir defaults to paste0("./s.", mod.name)
...
  Additional arguments to be passed to MASS::rlm if robust = TRUE or nlme::gls if gls = TRUE
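The polynomial, poly.d and poly.raw arguments describe running lm on poly(x, poly.d). The following is a minimal base R sketch of that idea, not the function's internal code: it assumes simulated data and compares orthogonal and raw polynomial bases directly with lm().

```r
# Sketch only: base R illustration of orthogonal vs raw polynomial features.
# The data and object names here are made up for the example.
set.seed(2020)
x <- rnorm(200)
y <- 1 + 2 * x - 0.5 * x^2 + 0.3 * x^3 + rnorm(200) / 2

# Orthogonal polynomials (analogous to poly.raw = FALSE)
fit_orth <- lm(y ~ poly(x, degree = 3))
# Raw polynomials x, x^2, x^3 (analogous to poly.raw = TRUE)
fit_raw <- lm(y ~ poly(x, degree = 3, raw = TRUE))

# Both bases span the same column space, so the fitted values agree
max_diff <- max(abs(fitted(fit_orth) - fitted(fit_raw)))

# Orthogonal columns are mutually uncorrelated; raw powers usually are not
cor_orth <- cor(poly(x, 3))
cor_raw <- cor(poly(x, 3, raw = TRUE))
```

The fitted values are the same either way. What changes is the conditioning of the design matrix and the interpretation of individual coefficients, which is one reason the orthogonal default is usually left alone.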
GLS can be useful in place of a standard linear model when there is correlation among the residuals and/or they have unequal variances.
Warning: nlme's implementation is buggy, and predict will not work because of environment problems, which means it fails to get predicted values if x.test is provided.
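As a rough illustration of the unequal-variance case, the sketch below calls nlme::gls directly on simulated data whose residual spread grows with x. The data, the choice of varPower() as variance function, and the hand-computed predictions are all assumptions made for this example; they are not taken from s.LM itself.

```r
# Sketch only: GLS with a variance function on simulated heteroscedastic data.
library(nlme)

set.seed(2020)
n <- 200
x <- runif(n, 1, 10)
# Residual standard deviation is proportional to x
y <- 12 + 0.6 * x + rnorm(n, sd = 0.3 * x)
dat <- data.frame(x = x, y = y)

# Ordinary least squares for comparison
fit_ols <- lm(y ~ x, data = dat)
# varPower(form = ~ x) models the residual variance as a power of |x|
fit_gls <- gls(y ~ x, data = dat, weights = varPower(form = ~ x))

# Predictions computed by hand from the estimated coefficients,
# so this example does not rely on predict() for gls objects
x_new <- c(2, 5, 8)
pred_gls <- as.numeric(cbind(1, x_new) %*% coef(fit_gls))
```

Computing predictions from coef() is one possible way to obtain test-set estimates when predict cannot be used, at the cost of rebuilding the design matrix yourself.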
robust = TRUE trains a robust linear model using MASS::rlm.
gls = TRUE trains a generalized least squares model using nlme::gls.
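To show why a robust fit may be preferable, here is a small sketch that calls lm() and MASS::rlm() directly on data with a few gross outliers. The contamination scheme and object names are invented for the example, and it uses rlm's default Huber M-estimator rather than any particular configuration of s.LM.

```r
# Sketch only: ordinary vs robust linear fit on contaminated data.
library(MASS)

set.seed(2020)
x <- rnorm(100)
y <- 0.6 * x + 12 + rnorm(100) / 2

# Shift the outcome of the five cases with the largest x far downwards
idx <- order(x, decreasing = TRUE)[1:5]
y[idx] <- y[idx] - 15

fit_lm <- lm(y ~ x)
# Huber M-estimation is the default; maxit raised to allow convergence
fit_rlm <- rlm(y ~ x, maxit = 100)

# Distance of each estimated slope from the true value of 0.6
err_lm <- abs(unname(coef(fit_lm)["x"]) - 0.6)
err_rlm <- abs(unname(coef(fit_rlm)["x"]) - 0.6)
```

In this setup the least-squares slope is pulled strongly toward the outliers, while the robust estimate stays much closer to the slope used to generate the data.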
elevate for external cross-validation
Other Supervised Learning: s.ADABOOST, s.ADDTREE, s.BART, s.BAYESGLM, s.BRUTO, s.C50, s.CART, s.CTREE, s.DA, s.ET, s.EVTREE, s.GAM.default, s.GAM.formula, s.GAMSEL, s.GAM, s.GBM3, s.GBM, s.GLMNET, s.GLM, s.GLS, s.H2ODL, s.H2OGBM, s.H2ORF, s.IRF, s.KNN, s.LDA, s.MARS, s.MLRF, s.MXN, s.NBAYES, s.NLA, s.NLS, s.NW, s.POLYMARS, s.PPR, s.PPTREE, s.QDA, s.QRNN, s.RANGER, s.RFSRC, s.RF, s.SGD, s.SPLS, s.SVM, s.TFN, s.XGBLIN, s.XGB
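# NOT RUN {
library(rtemis)
# Simulate a noisy linear relationship: slope 0.6, intercept 12
x <- rnorm(100)
y <- .6 * x + 12 + rnorm(100) / 2
# Fit a linear model with the default settings (base lm())
mod <- s.LM(x, y)
# }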
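The x.test and y.test arguments supply a held-out set for validation. The base R sketch below shows the general idea of fitting on a training set and scoring predictions on a testing set. It is an assumption-laden illustration (simulated data, a hypothetical mse() helper, mean squared error as the metric) and is not meant to reproduce what s.LM reports.

```r
# Sketch only: train/test validation of a linear model in base R.
set.seed(2020)
x <- rnorm(150)
y <- 0.6 * x + 12 + rnorm(150) / 2

# First 100 cases for training, remaining 50 for testing
train <- 1:100
test <- 101:150
dat_train <- data.frame(x = x[train], y = y[train])
dat_test <- data.frame(x = x[test], y = y[test])

fit <- lm(y ~ x, data = dat_train)
fitted_train <- fitted(fit)
predicted_test <- predict(fit, newdata = dat_test)

# Hypothetical helper: mean squared error between true and estimated values
mse <- function(true, estimated) mean((true - estimated)^2)
mse_train <- mse(dat_train$y, fitted_train)
mse_test <- mse(dat_test$y, predicted_test)
```

With s.LM, the analogous call would pass the held-out features and outcome as x.test and y.test, which is also what plot.predicted requires.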