Learn R Programming

Ecfun (version 0.3-2)

simulate.glm: A "simulate" method for a glm object

Description

Simulate predictions for newdata for a model of class glm with mean coef(object) and variance vcov(object).

NOTES: The stats package has a simulate method for "lm" objects which is used for lm and glm objects. It differs from the current simulate.glm function in two fundamental and important ways:

  1. stats::simulate returns simulated data consistent with the model fit assuming the estimated model parameters are true and exact, i.e., ignoring the uncertainty in parameter estimation. Thus, if family = poisson, stats::simulate returns nonnegative integers. By contrast the simulate.glm function documented here returns optionally simulated coef (coefficients) plus simulated values for the link and / or response but currently NOT pseudo-random numbers on the scale of the response.

  2. The simulate.glm function documented here also accepts an optional newdata argument, not accepted by stats::simulate. The stats::simulate function only returns simulated values for the cases in the training set with no possibilities for use for different sets of conditions.

Usage

# S3 method for glm
simulate(object, nsim = 1, 
    seed = NULL, newdata=NULL, 
    type = c("coef", "link", "response"), ...)

Value

Returns either a data.frame or a list of data.frames depending on 'type':

coef

a data.frame with nsim columns giving simulated parameters generated using mvtnorm::rmvnorm(nsim, coef(object), vcov(object)).

link

a data.frame with nsim columns of nobs values each giving the simulations on the link scale by applying each set of simulated coefficients to newdata (or to the training set of newdata is not supplied).

response

a data.frame with nsim columns of nobs values each giving the simulations on the response scale, being linkinv of the simulations on the link scale.

if length(type)>1

a list with simulations on the desired scales.

The value also has an attribute "seed". If argument seed is NULL, the attribute is the value of

.Random.seed before the simulation started. Otherwise it is the value of the argument with a kind attribute with value as.list(RNGkind()).

NOTE: This function currently may not work with a model fit that involves a multivariate

link or response.

Arguments

object

an object representing a fitted model of class glm.

nsim

number of response vectors to simulate. Defaults to 1.

seed

Argument passed as the first argument to set.seed if not NULL.

newdata

optionally, a data.frame in which to look for variables with which to predict. If omitted, predictors used in fitting are used.

type

the type of simulations required.

  • coef Simulated coefficents using mvtnorm::rmvnorm(nsim, coef(object), vcov(object)).

  • link The default type='link' is on the scale of the linear predictors using rmvnorm applied to randomly selected components of the mixture with mean = coef and sigma = vcov for that component. For a default binomial model, these are of log-odds (probabilities on logit scale).

  • response object[['linkinv']] of type = 'link'. For a binomial model, these are predicted probabilities.

...

further arguments passed to or from other methods.

Author

Spencer Graves

Details

1. Save current seed and optionally set it using code copied from stats:::simulate.lm.

2. if(is.null(newdata))newdata gets the data used in the call to glm.

3. newMat <- model.matrix(~., newdata)

4. simCoef <- mvtnorm::rmvnorm(nsim, coef(object), vcov(object))

5. sims <- tcrossprod(newMat, simCoef)

6. If length(type) == 1: return a data.frame with one column for each desired simulation, consistent with the behavior of the generic simulate applied to objects of class lm or glm. Otherwise, return a list of data.frames of the desired types.

See Also

Examples

Run this code
library(mvtnorm)
##
## 1.  a factor and a numeric 
##
PoisReg2 <- data.frame(y=1:6, 
    x=factor(rep(0:2, 2)), x1=rep(1:2, e=3))
GLMpoisR2 <- glm(y~x+x1, poisson, PoisReg2)

newDat. <- data.frame(
  x=factor(rep(c(0, 2), 2), levels=0:2), 
  x1=3:6)
# NOTE:  Force newDat2['x'] to have the same levels
# as PoisReg2['x']

GLMsim2n <- simulate(GLMpoisR2, nsim=3, seed=2,
  newdata=newDat.)

##
## 2.  One variable:  BMA returns
##     a mixture of constant & linear models
##
PoisRegDat <- data.frame(x=1:2, y=c(5, 10))
GLMex <- glm(y~x, poisson, PoisRegDat)

# Simulate for the model data 
GLMsig <- simulate(GLMex, nsim=2, seed=1)  

# Simulate for new data
newDat <- data.frame(x=3:4, 
      row.names=paste0('f', 3:4))
GLMsio <- simulate(GLMex, nsim=3, seed=2, 
                      newdata=newDat)

##
## 2a.  Compute the correct answers manually 
##
newMat <- model.matrix(~., newDat)
RNGstate <- structure(2, kind = as.list(RNGkind()))
set.seed(2)

sim <- mvtnorm::rmvnorm(3, coef(GLMex), 
                         vcov(GLMex))
rownames(sim) <- paste0('sim_', 1:3)
simDF <- data.frame(t(sim))

GLMsim.l <- tcrossprod(newMat, sim)
colnames(GLMsim.l) <- paste0('sim_', 1:3)
GLMsim.r <- exp(GLMsim.l) 
GLMsim2 <- list(coef=simDF, 
  link=data.frame(GLMsim.l), 
  response=data.frame(GLMsim.r) )
attr(GLMsim2, 'seed') <- RNGstate  

stopifnot(
all.equal(GLMsio, GLMsim2)
)

Run the code above in your browser using DataLab