stackingWeights: Stacking model weights

Description

Computes model weights based on a cross-validation-like procedure.

Usage

stackingWeights(object, ..., data, R, p = 0.5)

Arguments

object, …

two or more fitted glm objects, or a list of such, or an "averaging" object.

data

a data frame containing the variables in the model, used for fitting and prediction.

R

the number of replicates.

p

the proportion of the data to be used as the training set. Defaults to 0.5.

Value

stackingWeights returns a matrix with two rows, holding the model weights calculated using the mean and the median of the replicate weights, with one column per model.

Details

Each model in a set is fitted to the training data: a random subset of p * N observations in data, where N is the number of observations. From these models, predictions are produced on the remaining part of data (the test or hold-out data). These hold-out predictions are then fitted to the hold-out observations by optimising the weights with which the models are combined. This process is repeated R times, yielding a distribution of weights for each model (which Smyth & Wolpert (1998) referred to as an ‘empirical Bayesian estimate of posterior model probability’). The mean or median of the weights for each model is taken and re-scaled so that the weights sum to one.
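The procedure described above can be sketched in base R. This is an illustration only, not the package's internal code: the simulated data, the helper stack_once, the squared-error loss, and the softmax parameterisation that keeps weights non-negative and summing to one are all assumptions made for the example.

```r
# Illustrative sketch of repeated-split stacking; NOT the MuMIn implementation.
set.seed(1)
n <- 100
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- 1 + 2 * dat$x1 + 0.5 * dat$x2 + rnorm(n)

# Candidate models (hypothetical set chosen for the example)
formulas <- list(y ~ x1, y ~ x2, y ~ x1 + x2)

# One replicate: fit on a training subset, optimise weights on the hold-out part
stack_once <- function(formulas, data, p = 0.5) {
  n_obs <- nrow(data)
  train <- sample.int(n_obs, size = round(p * n_obs))
  fits <- lapply(formulas, function(f) glm(f, data = data[train, ]))
  test <- data[-train, ]
  # Matrix of hold-out predictions: one row per observation, one column per model
  pred <- sapply(fits, predict, newdata = test)
  # Softmax maps unconstrained parameters to weights in (0, 1) that sum to one
  softmax <- function(theta) {
    e <- exp(theta - max(theta))
    e / sum(e)
  }
  # Assumed loss: mean squared error of the weighted prediction
  loss <- function(theta) mean((test$y - drop(pred %*% softmax(theta)))^2)
  opt <- optim(rep(0, length(formulas)), loss, method = "BFGS")
  softmax(opt$par)
}

R <- 20
w <- t(replicate(R, stack_once(formulas, dat)))  # R rows, one column per model

# Summarise the replicate weights and re-scale each summary to sum to one
rescale <- function(v) v / sum(v)
stack_wts <- rbind(mean = rescale(colMeans(w)),
                   median = rescale(apply(w, 2, median)))
stack_wts
```

Each row of w holds the weights from one random split, so the spread of a column indicates how sensitive that model's weight is to the choice of training subset.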

References

Wolpert, D. H. (1992) Stacked generalization. Neural Networks, 5: 241-259.

Smyth, P. & Wolpert, D. (1998) An Evaluation of Linearly Combining Density Estimators via Stacking. Technical Report No. 98-25. Information and Computer Science Department, University of California, Irvine, CA.

Examples


# NOT RUN {
library(MuMIn)  # provides dredge(), model.avg(), stackingWeights() and the Cement data
# global model fitted to training data:
fm <- glm(y ~ X1 + X2 + X3 + X4, data = Cement, na.action = na.fail)
# generate a list of *some* subsets of the global model
models <- lapply(dredge(fm, evaluate = FALSE, fixed = "X1", m.lim = c(1, 3)), eval)

wts <- stackingWeights(models, data = Cement, R = 10)

ma <- model.avg(models)
Weights(ma) <- wts["mean", ]

predict(ma)

# }
