modavgpred: Compute Model-averaged Predictions

Description

This function computes the model-averaged predictions and unconditional standard errors based on the entire candidate model set. The function is currently implemented for 'lm', 'glm', 'lme', and 'mer' object classes that are stored in a list.

Usage

modavgpred(cand.set, modnames, newdata, type = "response", c.hat = 1,
gamdisp = NULL, second.ord = TRUE, nobs = NULL, uncond.se = "revised") 
modavgpred.glm(cand.set, modnames, newdata, type = "response",
c.hat = 1, gamdisp = NULL, second.ord = TRUE, nobs = NULL,
uncond.se = "revised") 
modavgpred.lme(cand.set, modnames, newdata, second.ord = TRUE,
         nobs = NULL, uncond.se = "revised")
modavgpred.mer(cand.set, modnames, newdata, type = "response",
  c.hat = 1, second.ord = TRUE, nobs = NULL, uncond.se = "revised")

Arguments

cand.set

a list storing each of the models in the candidate model set.

modnames

a character vector of model names to facilitate the identification of each model in the model selection table.

newdata

a data frame with the same structure as that of the original data frame for which we want to make predictions.

type

the scale of prediction requested, one of "response" or "link" (only relevant to 'glm' and 'mer' classes). Note that the value "terms" is not defined for 'modavgpred').

c.hat

value of overdispersion parameter (i.e., variance inflation factor) such as that obtained from 'c_hat'. Note that values of c.hat different from 1 are only appropriate for binomial GLM's with trials > 1 (i.e., success/trial or cbind(success, failure) syn

gamdisp

the value of the gamma dispersion parameter.

second.ord

logical. If TRUE, the function returns the second-order Akaike information criterion (i.e., AICc).

nobs

this argument allows to specify a numeric value other than total sample size to compute the AICc (defaults to sample size). This is relevant only for mixed models where sample size is not straightforward. In such cases, one might use total number of obse

uncond.se

either, "old", or "revised", specifying the equation used to compute the unconditional standard error of a model-averaged estimate. With uncond.se = "old", computations are based on equation 4.9 of Burnham and Anderson (2002), which was the former way to

Value

'modavgpred' returns an object of class 'modavgpred' with the following components:
typethe scale of predicted values (response or link) for 'glm' or 'mer' classes.
mod.avg.predthe model-averaged prediction over the entire candidate model set.
uncond.sethe unconditional standard error of each model-averaged prediction.

Details

'modavgpred' is a function that calls 'modavgpred.glm', 'modavgpred.lme', or 'modavgpred.mer', depending on the class of the object. The candidate models must be stored in a list. Note that a data frame from which to make predictions must be supplied with the 'newdata' argument and that all variables appearing in the model set must appear in this data frame. One can compute unconditional confidence intervals around the predictions from the elements returned by 'modavgpred'. The classic computation based on asymptotic normality of the estimator is appropriate to estimate confidence intervals of beta estimates (i.e., estimates on the linear predictor scale). For predictions of some types of response variables (e.g., discrete values such as counts, or binary variables), the normal approximation may be inappropriate. In such cases, it is often better to compute the confidence intervals on the linear predictor scale and then back-transform the limits to the scale of the response variable. Burnham and Anderson (2002, p. 164) suggest alternative methods of computing unconditional confidence intervals for small degrees of freedom with profile likelihood intervals or bootstrapping.

References

Anderson, D. R. (2008) Model-based Inference in the Life Sciences: a primer on evidence. Springer: New York.

Burnham, K. P., Anderson, D. R. (2002) Model Selection and Multimodel Inference: a practical information-theoretic approach. Second edition. Springer: New York.

Examples

Run this code

##example from subset of models in Table 1 in Mazerolle (2006)
data(dry.frog)

Cand.models <- list( )
Cand.models[[1]] <- lm(log_Mass_lost ~ Shade + Substrate +
cent_Initial_mass + Initial_mass2, data = dry.frog)
Cand.models[[2]] <- lm(log_Mass_lost ~ Shade + Substrate +
cent_Initial_mass + Initial_mass2 + Shade:Substrate, data = dry.frog)
Cand.models[[3]] <- lm(log_Mass_lost ~ cent_Initial_mass +
Initial_mass2, data = dry.frog)
Cand.models[[4]] <- lm(log_Mass_lost ~ Shade + cent_Initial_mass +
Initial_mass2, data = dry.frog)
Cand.models[[4]] <- lm(log_Mass_lost ~ Shade + cent_Initial_mass +
Initial_mass2, data = dry.frog)
Cand.models[[5]] <- lm(log_Mass_lost ~ Substrate + cent_Initial_mass +
Initial_mass2, data = dry.frog)

##setup model names
Modnames <- NULL
for (i in 1:length(Cand.models)) {
Modnames[i] <- paste("mod", i, sep = "")
}

##compute model-averaged value and unconditional SE of predicted log of
##mass lost for frogs of average mass in shade for each substrate type

##first create data set to use for predictions
new.dat <- data.frame(Shade = c(1, 1, 1), cent_Initial_mass = c(0, 0,
0), Initial_mass2 = c(0, 0, 0), Substrate = c("SOIL", "SPHAGNUM",
"PEAT")) 

##compare unconditional SE's using both methods
modavgpred(cand.set = Cand.models, modnames = Modnames, newdata =
new.dat, type = "response", uncond.se = "old")
modavgpred(cand.set = Cand.models, modnames = Modnames, newdata =
new.dat, type = "response", uncond.se = "revised")
##round to 4 digits after decimal point
print(modavgpred(cand.set = Cand.models, modnames = Modnames, newdata =
new.dat, type = "response", uncond.se = "revised"), digits = 4)

##Gamma glm
##clotting data example from 'gamma.shape' in MASS package of
##Venables and Ripley (2002, Modern applied statistics with
##S. Springer-Verlag: New York.)
clotting <- data.frame(u = c(5, 10, 15, 20, 30, 40, 60, 80, 100),
         lot1 = c(118, 58, 42, 35, 27, 25, 21, 19, 18),
         lot2 = c(69, 35, 26, 21, 18, 16, 13, 12, 12))
clot1 <- glm(lot1 ~ log(u), data = clotting, family = Gamma)

library(MASS)
gamma.dispersion(clot1) #dispersion parameter
gamma.shape(clot1) #reciprocal of dispersion parameter ==
##shape parameter 
summary(clot1, dispersion = gamma.dispersion(clot1))  #better

##create list with models
Cand <- list()
Cand[[1]] <- glm(lot1 ~ log(u), data = clotting, family = Gamma)
Cand[[2]] <- glm(lot1 ~ 1, data = clotting, family = Gamma)

##create vector of model names
Modnames <- NULL
for (i in 1:length(Cand)) {
Modnames[i] <- paste("mod", i, sep = "")
}

##compute model-averaged predictions on scale of response variable for
##all observations
modavgpred(cand.set = Cand, modnames = Modnames, newdata = clotting,
gamdisp = gamma.dispersion(clot1), type = "response") 

##compute model-averaged predictions on scale of linear predictor
modavgpred(cand.set = Cand, modnames = Modnames, newdata = clotting,
gamdisp = gamma.dispersion(clot1), type = "link")

##compute model-averaged predictions on scale of linear predictor
modavgpred(cand.set = Cand, modnames = Modnames, newdata = clotting,
gamdisp = gamma.dispersion(clot1), type = "terms") #returns an error
##because type = "terms" is not defined for 'modavgpred'
modavgpred(cand.set = Cand, modnames = Modnames, newdata = clotting,
type = "terms") #returns an error because no gamma dispersion parameter
##(i.e., 'gamdisp') was specified

Run the code above in your browser using DataLab