summary.TDboost: Summary of a TDboost object

Description

Computes the relative influence of each variable in the TDboost object.

Usage

# S3 method for TDboost
summary(object,
        cBars=length(object$var.names),
        n.trees=object$n.trees,
        plotit=TRUE,
        order=TRUE,
        method=relative.influence,
        normalize=TRUE,
        ...)

Value

Returns a data frame whose first column is the variable name and whose second column is the computed relative influence. With normalize=TRUE (the default), the relative influences are normalized to sum to 100.
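As a minimal sketch of the shape of this return value, the base R snippet below rescales a made-up vector of raw influences to sum to 100 and sorts it. The variable names, the raw values, and the column names var and rel.inf are assumptions chosen for illustration; none of them is output from TDboost.

```r
# Hypothetical raw (unnormalized) influences; not produced by TDboost.
raw <- c(age = 2.5, veh_value = 7.5, area = 0, gender = 10)

# Rescale so the influences sum to 100 (what normalize=TRUE describes).
rel_inf <- 100 * raw / sum(raw)

# Sort in decreasing order of influence (what order=TRUE describes).
ord <- order(rel_inf, decreasing = TRUE)

# Two-column data frame: variable name, then relative influence.
# The column names here are assumed, not taken from the package.
result <- data.frame(var = names(rel_inf)[ord],
                     rel.inf = unname(rel_inf[ord]),
                     stringsAsFactors = FALSE)
print(result)
```

A variable that is never used in a split keeps an influence of zero and still appears in the result, consistent with the statement that all variables are returned.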

Arguments

object: a TDboost object created from an initial call to TDboost.
cBars: the number of bars to plot. If order=TRUE, only the variables with the cBars largest relative influences will appear in the barplot. If order=FALSE, the first cBars variables will appear in the plot. In either case, the function returns the relative influence of all of the variables.
n.trees: the number of trees used to generate the plot. Only the first n.trees trees will be used.
plotit: an indicator as to whether the plot is generated.
order: an indicator as to whether the plotted and/or returned relative influences are sorted.
method: the function used to compute the relative influence. relative.influence is the default and is the same as that described in Friedman (2001). The other current (and experimental) choice is permutation.test.TDboost. This method randomly permutes the predictor variables one at a time and computes the associated reduction in predictive performance. This is similar to the variable importance measures Breiman uses for random forests, but TDboost currently computes it using the entire training dataset (not the out-of-bag observations).
normalize: if FALSE then summary.TDboost returns the unnormalized influence.
...: other arguments passed to the plot function.
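
The permutation idea described under method can be sketched in base R as follows. This is an illustration of the general technique only, not the code of permutation.test.TDboost: the simulated data, the helper mse, and the use of lm as a stand-in model are all assumptions made to keep the example free of package dependencies.

```r
set.seed(42)

# Simulated training data: y depends strongly on x1, weakly on x2, not on x3.
n <- 500
train <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
train$y <- 3 * train$x1 + 0.5 * train$x2 + rnorm(n)

# Stand-in model (ordinary least squares); not a boosted Tweedie model.
fit <- lm(y ~ x1 + x2 + x3, data = train)

# Hypothetical helper: mean squared error of a model on a data set.
mse <- function(model, data) {
  mean((data$y - predict(model, newdata = data))^2)
}

baseline <- mse(fit, train)
predictors <- c("x1", "x2", "x3")

# Permute one predictor at a time and record the increase in error,
# evaluated on the entire training set rather than out-of-bag rows.
perm_influence <- sapply(predictors, function(v) {
  shuffled <- train
  shuffled[[v]] <- sample(shuffled[[v]])
  mse(fit, shuffled) - baseline
})

# Rescale to sum to 100, mirroring what normalize=TRUE describes.
perm_rel <- 100 * perm_influence / sum(perm_influence)
print(round(perm_rel, 1))
```

A predictor unrelated to the response receives a value near zero, which can be slightly negative by chance because the permutation is random.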

Author

Yi Yang <yi.yang6@mcgill.ca>, Wei Qian <wxqsma@rit.edu> and Hui Zou <hzou@stat.umn.edu>

Details

For each variable, this returns the reduction in the sum of squared error in predicting the gradient that is attributable to that variable on each iteration. It describes the relative influence of each variable in reducing the loss function. See the references below for exact details on the computation.
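
One way to picture this bookkeeping is to sum per-split squared-error reductions by splitting variable, using only the first n.trees trees. The sketch below does that in base R on a made-up split log. The splits data frame, its column names, and the helper influence_by_var are hypothetical; how TDboost stores its trees internally is not shown here.

```r
# Hypothetical split log: one row per tree split, giving the tree index,
# the splitting variable, and the squared-error reduction of that split.
splits <- data.frame(
  tree        = c(1, 1, 2, 2, 3, 3),
  var         = c("x1", "x2", "x1", "x1", "x3", "x2"),
  improvement = c(4.0, 1.0, 3.0, 0.5, 0.25, 1.25),
  stringsAsFactors = FALSE
)

# Hypothetical helper: total improvement per variable over the
# first n_trees trees only.
influence_by_var <- function(splits, n_trees) {
  used <- splits[splits$tree <= n_trees, ]
  tapply(used$improvement, used$var, sum)
}

inf2 <- influence_by_var(splits, n_trees = 2)  # uses trees 1 and 2
inf3 <- influence_by_var(splits, n_trees = 3)  # uses all three trees
print(inf2)
print(inf3)
```

In this toy log, x3 only splits in the third tree, so it contributes nothing when n_trees is 2. This is why the choice of n.trees can change the ranking of variables.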

References

Y. Yang, W. Qian and H. Zou (2013). “A Boosted Tweedie Compound Poisson Model for Insurance Premium,” Preprint.

G. Ridgeway (1999). “The state of boosting,” Computing Science and Statistics 31:172-181.

J.H. Friedman (2001). “Greedy Function Approximation: A Gradient Boosting Machine,” Annals of Statistics 29(5):1189-1232.