fwbw.lm: Forward / backward-step model selection for an object of class 'lm'

Description

Model selection by a forward / backward-stepping algorithm. The algorithm reduces the degrees of freedom of an existing 'lm' object. It searches for the subset of degrees of freedom that results in an optimal goodness-of-fit. The optimal subset is the subset for which a user-specified function reaches its minimum.

Usage

# S3 method for lm
fwbw(object, fun, fw = FALSE, counter = TRUE,
  df_percentage = 0.05, control = list(), ...)

Arguments

object

Object of class 'lm'

fun

User-specified function which measures the goodness-of-fit. See 'Details'.

fw

Boolean, if TRUE the search will start with the minimum number of degrees of freedom ('forward search'). If FALSE the search will start with the full model ('backward search').

counter

Boolean, if TRUE and fw = TRUE, the algorithm will also carry out backward steps (attempts to remove degrees of freedom) while searching for the optimal subset. If FALSE and fw = TRUE, the algorithm will only carry out forward steps (attempts to insert degrees of freedom). If fw = FALSE, the roles are reversed: counter = TRUE also allows forward steps, while counter = FALSE restricts the search to backward steps.

df_percentage

Fraction of the degrees of freedom that the algorithm attempts to remove at a backward step, or insert at a forward step. Must be a number between 0 and 1 (the default 0.05 corresponds to 5%).

control

List of control options. The following options can be set:

monitor Boolean, if TRUE information about the attempted removals and insertions will be printed during the run. Default is FALSE.
plot Boolean, if TRUE a plot will be shown at the end of the run. It shows how the value of fun decreases during the run. Default is FALSE.

...

For compatibility with the fwbw generic.

Value

A list with the following members.

object An object of class 'lm' which contains the model for which fun is minimized.
fun The minimum value of the user-specified function fun.

Details

Description of the algorithm

The function fwbw.lm selects the subset of all the degrees of freedom present in object for which the user-specified function fun is minimized. This function is supposed to be a measure of the goodness-of-fit. Typical examples would be fun=AIC or fun=BIC. The function fun can also be a measure of the prediction error, determined by cross-validation.

This function is intended for situations in which the number of degrees of freedom in object is so large that it is not feasible to go through all possible subsets systematically to find the smallest value of fun. Instead, the algorithm generates subsets by removing degrees of freedom from the current-best subset (a 'backward' step) and reinserting degrees of freedom that were previously removed (a 'forward' step). Whenever a backward or forward step results in a subset for which fun is smaller than for the current-best subset, the new subset becomes current-best.

The start set depends on the argument fw. If fw = TRUE, the algorithm starts with only one degree of freedom for the expected values \(\mu\). This degree is the intercept term, if the model in object contains an intercept term. If fw = FALSE (the default), the algorithm starts with all degrees of freedom present in object.

At a backward step, the algorithm removes df_percentage of the degrees of freedom of the current-best subset (with a minimum of 1 degree of freedom). The degrees that are removed are the ones with the largest p-values (p-values can be seen with the function summary.lm). If the removal results in a larger value of fun, the algorithm will try again, halving the number of degrees of freedom it removes.
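The backward step described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the code used by fwbw.lm: the helper name backward_step_sketch, rounding with floor(), integer halving of the step size, and fitting on named columns of a plain matrix X are all choices made for this sketch.

```r
# Illustrative sketch only; NOT the implementation used by fwbw.lm.
# X    : numeric matrix with column names (no intercept column)
# y    : response vector
# keep : names of the columns in the current-best subset
backward_step_sketch <- function(X, y, keep, fun = BIC, df_percentage = 0.05) {
  # Fit a linear model on the chosen columns (intercept always included).
  fit_cols <- function(cols) {
    if (length(cols) == 0) lm(y ~ 1) else lm(y ~ X[, cols, drop = FALSE])
  }
  current <- fit_cols(keep)
  best <- fun(current)
  # p-values of the non-intercept coefficients, in the order of 'keep'
  # (assumes X[, keep] has full column rank, so no coefficient is dropped).
  p <- summary(current)$coefficients[-1, "Pr(>|t|)"]
  ranked <- keep[order(p, decreasing = TRUE)]   # least significant first
  # Assumed rounding rule: floor, with a minimum of 1.
  n_remove <- max(1, floor(df_percentage * length(keep)))
  repeat {
    candidate <- setdiff(keep, ranked[seq_len(n_remove)])
    value <- fun(fit_cols(candidate))
    if (value < best) {
      return(list(keep = candidate, fun = value, improved = TRUE))
    }
    if (n_remove == 1) {
      # Even removing a single degree of freedom does not help.
      return(list(keep = keep, fun = best, improved = FALSE))
    }
    n_remove <- n_remove %/% 2   # try again with half as many removals
  }
}
```

The returned list holds the (possibly unchanged) subset, the corresponding value of fun, and a flag that tells whether the step lowered fun.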

At a forward step, the algorithm inserts df_percentage of the degrees of freedom that are present in object but left out of the current-best subset (with a minimum of 1 degree of freedom). It inserts those degrees of freedom which are estimated to increase the likelihood most. If the insertion results in a larger value of fun, the algorithm will try again, halving the number of degrees of freedom it inserts.
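A forward step can be sketched in the same spirit. How fwbw.lm estimates the gain in likelihood is not described here, so the sketch below substitutes a simple stand-in: left-out columns are ranked by the absolute correlation with the residuals of the current fit. That ranking rule, the helper name forward_step_sketch, rounding with floor() and integer halving are assumptions of this illustration.

```r
# Illustrative sketch only; NOT the implementation used by fwbw.lm.
# X    : numeric matrix with column names (no intercept column)
# y    : response vector
# keep : names of the columns in the current-best subset
forward_step_sketch <- function(X, y, keep, fun = BIC, df_percentage = 0.05) {
  # Fit a linear model on the chosen columns (intercept always included).
  fit_cols <- function(cols) {
    if (length(cols) == 0) lm(y ~ 1) else lm(y ~ X[, cols, drop = FALSE])
  }
  current <- fit_cols(keep)
  best <- fun(current)
  left_out <- setdiff(colnames(X), keep)
  if (length(left_out) == 0) {
    return(list(keep = keep, fun = best, improved = FALSE))
  }
  # Stand-in ranking (assumption): absolute correlation between each
  # left-out column and the residuals of the current fit.
  score <- abs(cor(X[, left_out, drop = FALSE], residuals(current)))[, 1]
  ranked <- left_out[order(score, decreasing = TRUE)]
  # Assumed rounding rule: floor, with a minimum of 1.
  n_insert <- max(1, floor(df_percentage * length(left_out)))
  repeat {
    candidate <- c(keep, ranked[seq_len(n_insert)])
    value <- fun(fit_cols(candidate))
    if (value < best) {
      return(list(keep = candidate, fun = value, improved = TRUE))
    }
    if (n_insert == 1) {
      # Even inserting a single degree of freedom does not help.
      return(list(keep = keep, fun = best, improved = FALSE))
    }
    n_insert <- n_insert %/% 2   # try again with half as many insertions
  }
}
```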

If counter = FALSE, the algorithm is 'greedy': it will only carry out forward-steps in case fw = TRUE or backward-steps in case fw = FALSE.

The algorithm stops if neither the backward nor the forward step resulted in a lower value of fun. It returns the current-best model and the minimum value of fun.

The user-defined function

The function fun must measure the goodness-of-fit. It must take one argument: an object of class 'lm'. Its return value must be a single number. A smaller (more negative) number must represent a better fit. During the run, a fit to the data is carried out for each new subset of degrees of freedom. The result of the fit is an object of class 'lm'. This object is passed on to fun to evaluate the goodness-of-fit. Typical examples for fun are AIC and BIC.
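As an example of a fun other than AIC or BIC, the sketch below computes a k-fold cross-validated mean squared prediction error. The name cv_mse, the number of folds and the fixed seed are assumptions made for this illustration; it relies only on base R functions (model.matrix, model.frame, lm.fit).

```r
# Hypothetical helper: k-fold cross-validated mean squared prediction error.
# Takes an 'lm' object and returns a single number; smaller is better.
cv_mse <- function(object, k = 5, seed = 1) {
  X <- model.matrix(object)                   # design matrix of the fitted model
  y <- model.response(model.frame(object))    # response vector
  set.seed(seed)                              # same folds on every call
  fold <- sample(rep_len(seq_len(k), nrow(X)))
  sq_err <- numeric(nrow(X))
  for (i in seq_len(k)) {
    test <- fold == i
    # Least-squares fit on the training rows only.
    b <- lm.fit(X[!test, , drop = FALSE], y[!test])$coefficients
    b[is.na(b)] <- 0                          # aliased columns contribute nothing
    pred <- drop(X[test, , drop = FALSE] %*% b)
    sq_err[test] <- (y[test] - pred)^2
  }
  mean(sq_err)
}
```

Because k and seed have default values, cv_mse can be called with a single 'lm' argument, which is the form required for fun, for example fwbw(fit, cv_mse). Fixing the seed keeps the folds identical across calls, so that the values for different subsets are comparable. Note that set.seed also resets the global random number stream.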

Monitor information

When the control-option monitor is equal to TRUE, information is displayed about the progress of the run. The following information is displayed:

Iteration A counter whose first value is always 0, followed by 1. From then on, the counter is increased whenever the addition or removal of degrees of freedom results in a smaller function value than the smallest so far.
attempted removals/insertions The number of degrees of freedom that the algorithm attempts to remove or insert
function value The value of the user-specified function fun after the removal or insertion of the degrees of freedom
The last column shows the word insert when the attempt concerns the insertion of degrees of freedom. When nothing is shown, the algorithm attempted to remove degrees of freedom.

Other

If the model matrix present in object contains a column with the name "(Intercept)", the intercept term for the expected values \(\mu\) will not be removed by fwbw.lm.
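Whether this protection applies to a given model can be checked beforehand. The snippet below is a small illustration using base R only; has_intercept is a hypothetical helper name, and it assumes that model.matrix(object) returns the same matrix that fwbw.lm inspects.

```r
# Hypothetical helper: does the model matrix have an "(Intercept)" column?
has_intercept <- function(object) {
  "(Intercept)" %in% colnames(model.matrix(object))
}

fit_with    <- lm(dist ~ speed, data = cars)       # intercept included
fit_without <- lm(dist ~ 0 + speed, data = cars)   # intercept suppressed

has_intercept(fit_with)     # TRUE
has_intercept(fit_without)  # FALSE
```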

When a new subset of degrees of freedom is generated by either a backward or a forward step, the response vector in object is fitted to the new model. The fit is carried out by lm.

Examples

# NOT RUN {
# Generate model matrix
set.seed(1820)

n_rows = 1000
n_cols = 4

X = matrix(sample(-9:9, n_rows * n_cols, replace = TRUE), nrow = n_rows, ncol = n_cols)

column_names = paste("column", 1:n_cols, sep = "_")
colnames(X) = column_names

# Generate betas
beta = sample(c(-1,-0.5, 0.5, 1), n_cols + 1, replace = TRUE)

# Generate response vector
mu = X %*% beta[-1] + beta[1]
y = rnorm( n_rows, mean = mu, sd = 2.5)

# Add columns for cross-terms to model matrix. They have no predictive power for the response y.
X = model.matrix(~ . + 0 + column_1 * ., data = as.data.frame(X))
colnames(X)

# Create model in which cross-terms in X are unrelated to response vector y.
fit = lm(y ~ ., as.data.frame(X), x = TRUE, y = TRUE)

# Check whether model selection with BIC as criterion manages
# to remove cross-terms. Start with the full model. Monitor the iterations.
# The result is stored under a name that does not shadow the function fwbw.
result = fwbw(fit, BIC, control = list(monitor = TRUE))
names(coef(result$object))

# The same with AIC as criterion. Plot how the AIC develops.
result = fwbw(fit, AIC, control = list(plot = TRUE))
names(coef(result$object))

# Model selection starting with an intercept term only.
result = fwbw(fit, BIC, fw = TRUE)
names(coef(result$object))
# }
