plot_effects: Partial-dependence effects and plots

Description

The following functions evaluate or plot partial-dependence effects. The is a dedicated package for such plots, pdp (https://cran.r-project.org/package=pdp), so if you are not happy with these functions (which, for instance, do not handle pairs of variables and their interactions), try that package (which seems to handle fit object produced by spaMM).

pdep_effects evaluates the effect of a given fixed-effect variable, as (by default, the average of) predicted values on the response scale, over the empirical distribution of all other fixed-effect variables in the data, and of inferred random effects. This can be seen as the result of an experiment where specific treatments (given values of the focal variable) are applied over all conditions defined by the other fixed effects and by the inferred random effects. Thus, apparent dependencies induced by associations between predictor variables are avoided (see Friedman, 2001, from which the name “partial dependence plot” is taken; or Hastie et al., 2009, Section 10.13.2). This also avoids biases of possible alternative ways of plotting effects. In particular, such biases occur if the response link is not identity, and if averaging is performed on the linear-predictor scale or when other variables are set to some conventional value other than its average.

pdep_effects also compute intervals of the type defined by its intervals argument (by default, prediction intervals). By default, it returns a data frame of average values of point predictions and interval bounds for each value of the focal variable, but it can also return lists of all predictions.

plot_effects calls pdep_effects and produces a simple plot (using only base graphic functions) of its results, including prediction bands representing the two average one-sided widths of intervals. If added to the plot, the raw data may appear to depart from the partial-dependence predictions, since the data are a priori affected by the associations between variables which the predictions free themselves from.

Usage

pdep_effects(object, focal_var, newdata = object$data, length.out = 20, 
             levels = NULL, intervals = "predVar", indiv = FALSE, ...)
plot_effects(object, focal_var, newdata = object$data, effects = NULL,
             xlab = focal_var, ylab = NULL, rgb.args = col2rgb("blue"), 
             add = FALSE,  ylim=NULL, ...)

Arguments

object

An object of class HLfit, as returned by the fitting functions in spaMM.

focal_var

Character string: the name of the predictor variable whose effect is to be represented

newdata

If non-NULL, a data frame passed to predict.HLfit, whose documentation should be consulted for further details.

effects

If non-NULL, a data frame to substitute to the one produced by default by pdep_effects.

xlab

If non-NULL, a character string: X-axis label for the plot.

ylab

If non-NULL, a character string: Y-axis label for the plot.

ylim

The plot's ylim argument. Default is based on the (0.025,0.975) quantiles of the response.

rgb.args

Color control arguments, in the format produced by col2rgb.

add

Boolean: whether to add graphic elements of a previous plot produced by plot_effects

length.out

Numeric: for a numeric predictor variable, the number of values at which predictions are evaluated.

levels

If non-NULL, a character vector: for a factor predictor variable, the levels for which which predictions are evaluated.

intervals

Passed to predict.HLfit, whose documentation should be consulted for further details.

indiv

Boolean: whether to return all predictions given the values of other predictors in the newdata, or only their means.

…

Further arguments passed by plot_effects to pdep_effects, or by pdep_effects to predict.HLfit.

Value

For pdep_effects, a nested list, or a data frame storing values of the focal_var, average point predictions pointp and bounds low and up of intervals, depending on the indiv argument. When indiv is TRUE, each sublist contains vectors for pointp, low and up.

For plot_effects, the same value, returned invisibly.

References

J.H. Friedman (2001). Greedy Function Approximation: A Gradient Boosting Machine. Annals of Statistics 29(5):1189-1232.

J. Friedman, T. Hastie and R. Tibshirani (2009) The Elements of Statistical Learning, 2nd ed. Springer.

Examples

Run this code

# NOT RUN {
data("scotlip")
hlcor <- HLCor(cases~I(prop.ag/10) +adjacency(1|gridcode)+offset(log(expec)),
           adjMatrix=Nmatrix,family=poisson(),data=scotlip) 
plot_effects(hlcor,focal_var="prop.ag",ylim=c(0,max(scotlip$cases)))  
points(cases~prop.ag, data=scotlip, col="blue",pch=20)
# }

Run the code above in your browser using DataLab