DAAG (version 1.25.6)

CVlm: Cross-Validation for Linear Regression

Description

This function gives internal and cross-validation measures of predictive accuracy for multiple linear regression. (For binary logistic regression, use the CVbinary function.) The data are randomly assigned to a number of "folds". Each fold is removed in turn, while the remaining data are used to re-fit the regression model and to predict at the deleted observations.
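The fold procedure described above can be sketched in base R alone. This is an illustrative sketch, not the DAAG implementation: cv_lm_sketch is a hypothetical helper, the built-in cars data stands in for a real data set, and the code assumes the data contain no missing values.

```r
# Minimal m-fold cross-validation for a linear model, base R only.
# cv_lm_sketch is a hypothetical helper, not part of DAAG.
cv_lm_sketch <- function(data, form, m = 3, seed = 29) {
  set.seed(seed)
  n <- nrow(data)
  # Randomly assign each row to one of m folds of near-equal size.
  fold <- sample(rep(seq_len(m), length.out = n))
  cvpred <- numeric(n)
  for (k in seq_len(m)) {
    held_out <- fold == k
    # Refit on the remaining folds, then predict the held-out rows.
    fit_k <- lm(form, data = data[!held_out, , drop = FALSE])
    cvpred[held_out] <- predict(fit_k, newdata = data[held_out, , drop = FALSE])
  }
  # Observed response (assumes no rows are dropped for missing values).
  observed <- model.response(model.frame(form, data))
  list(cvpred = cvpred,
       fold = fold,
       ss = sum((observed - cvpred)^2),  # cross-validation residual sum of squares
       n = n)                            # number of held-out predictions
}

res <- cv_lm_sketch(cars, dist ~ speed, m = 3)
res$ss / res$n   # mean squared cross-validation prediction error
```

Every observation is predicted exactly once, by a model that was fitted without it, so res$ss measures out-of-sample error rather than the fit to the training data.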

Usage

CVlm(data = DAAG::houseprices, form.lm = formula(sale.price ~ area),
     m = 3, dots = FALSE, seed = 29, plotit = c("Observed", "Residual"),
     col.folds = NULL,
     main = "Small symbols show cross-validation predicted values",
     legend.pos = "topleft", printit = TRUE, ...)

cv.lm(data = DAAG::houseprices, form.lm = formula(sale.price ~ area),
      m = 3, dots = FALSE, seed = 29, plotit = c("Observed", "Residual"),
      col.folds = NULL,
      main = "Small symbols show cross-validation predicted values",
      legend.pos = "topleft", printit = TRUE, ...)

Value

The input data frame is returned, with additional columns Predicted (predicted values using all observations) and cvpred (cross-validation predictions). The cross-validation residual sum of squares (ss) and degrees of freedom (df) are returned as attributes of the data frame.
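As a hedged illustration of inspecting the returned object, the sketch below calls CVlm with its documented defaults but with plotting and printing switched off. It assumes the DAAG package is installed (the guard skips the call otherwise), and it lists the attribute names instead of assuming them.

```r
# Assumes the DAAG package is installed; the guard skips the call otherwise.
if (requireNamespace("DAAG", quietly = TRUE)) {
  out <- DAAG::CVlm(data = DAAG::houseprices,
                    form.lm = formula(sale.price ~ area),
                    m = 3, plotit = FALSE, printit = FALSE)
  # The added columns sit alongside the original variables.
  print(head(out[, c("sale.price", "Predicted", "cvpred")]))
  # Inspect which attributes are attached to the returned data frame.
  print(names(attributes(out)))
}
```

Comparing Predicted with cvpred row by row shows how much each fitted value shifts when the observation is left out of the fit.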

Arguments

data

a data frame

form.lm

a formula, an lm call, or an lm object

m

the number of folds

dots

logical; if TRUE, pch=16 is used as the plotting character

seed

random number generator seed

plotit

One of the text strings "Observed" or "Residual", or a logical value. The logical TRUE is equivalent to "Observed", while FALSE is equivalent to "" (no plot).

col.folds

Per-fold color settings

main

main title for graph

legend.pos

position of legend: one of "bottomright", "bottom", "bottomleft", "left", "topleft", "top", "topright", "right", "center".

printit

if TRUE, output is printed to the screen

...

Other arguments, to be passed through to the function legend()

Author

J.H. Maindonald

Details

When plotit="Residual" and there is more than one explanatory variable, the fitted lines that are shown for the individual folds are approximations.

See Also

lm, CVbinary

Examples

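library(DAAG)  # provides CVlm() and the nihills data
CVlm()         # default call: houseprices data, sale.price ~ area, 3 folds
if (FALSE) {   # not run
CVlm(data=nihills, form.lm=formula(log(time)~log(climb)+log(dist)),
     plotit="Observed")
CVlm(data=nihills, form.lm=formula(log(time)~log(climb)+log(dist)),
     plotit="Residual")
out <- CVlm(data=nihills, form.lm=formula(log(time)~log(climb)+log(dist)),
            plotit="Observed")
## summary values are stored as attributes of the returned data frame, not as columns
attributes(out)[c("ms","df")]
}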