Computes relevance statistics for each input coordinate by calculating the particle-averaged mean reduction in variance each time that coordinate is used as a splitting variable in (an internal node of) the tree(s).
relevance.dynaTree(object, rect = NULL, categ = NULL,
approx = FALSE, verb = 0)
The entire object is returned with a new entry called relevance containing a matrix with ncol(X) columns. Each row contains the sample from the relevance of each input, and there is a row for each particle.
object: a "dynaTree"-class object built by dynaTree
rect: an optional matrix with two columns and ncol(object$X) rows describing the bounding rectangle for the ALC integration; the default that is used when rect = NULL is the bounding rectangle obtained by applying range to each column of object$X (taking care to remove the first/intercept column of object$X if icept = "augmented")
categ: a vector of logicals of length ncol(object$X) indicating which, if any, dimensions of the input space should be treated as categorical; the default categ argument is NULL, meaning that the categorical inputs are derived from object$X in a sensible way
approx: a scalar logical indicating if the count of the number of data points in the leaf should be used in place of its area; this can help with numerical accuracy in high dimensional input spaces
verb: a positive scalar integer indicating how many particles should be processed (iterations) before a progress statement should be printed to the console; a (default) value of verb = 0 is quiet
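The default bounding rectangle described for rect can be reproduced by hand, which may be a convenient starting point when a different region is wanted. The following is a minimal sketch in base R on a hypothetical design matrix X; the toy data, and the assumption that the two columns hold the lower and then the upper bound, are illustrative and not taken from the package documentation.

```r
# Hypothetical design matrix with three inputs (illustration only)
set.seed(1)
X <- cbind(x1 = runif(20, 0, 1),
           x2 = runif(20, -2, 2),
           x3 = rnorm(20))

# range() on each column gives a 2 x ncol(X) matrix; transpose it so that
# there is one row per input and two columns (assumed order: lower, upper)
rect <- t(apply(X, 2, range))

print(dim(rect))  # ncol(X) rows, 2 columns
```

A matrix of this shape could then be edited row by row (for example, widening the bounds of one input) before being supplied as the rect argument.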
Robert B. Gramacy rbg@vt.edu, Matt Taddy and Christoforos Anagnostopoulos
Each binary split in the tree (in each particle) emits a reduction in variance (for regression models) or a reduction in entropy (for classification). This function calculates these reductions and attributes them to the variable(s) involved in the split(s). Those with the largest relevances are the most useful for prediction. A sensible variable selection rule based on these relevances is to discard those variables whose median relevance is not positive. See the Gramacy, Taddy, & Wild (2011) reference below for more details.
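The median-based selection rule can be sketched in base R. The matrix rel below is a small hand-made stand-in for the relevance entry of a fitted object (one row per particle, one column per input); its values and the column names x1 to x4 are hypothetical and chosen only to make the rule easy to follow.

```r
# Stand-in for object$relevance: 5 particles (rows) by 4 inputs (columns).
# These numbers are invented for illustration, not output from dynaTree.
rel <- cbind(
  x1 = seq(0.4, 0.6, length.out = 5),    # clearly positive
  x2 = seq(-0.1, 0.3, length.out = 5),   # mostly positive
  x3 = rep(0, 5),                        # never reduces variance
  x4 = seq(-0.2, 0.05, length.out = 5)   # mostly negative
)

# Median relevance of each input across particles
med <- apply(rel, 2, median)

# Keep an input only if its median relevance is strictly positive
keep <- med > 0
selected <- names(keep)[keep]
print(selected)
```

With a real fit, the same two lines (the apply call and the comparison with zero) would be applied to the relevance matrix stored in the returned object.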
The new set of particles is appended to the old set. However, after a subsequent update.dynaTree call the total number of particles reverts to the original amount.
Note that this does not work well with dynaTree objects which were built with model="linear". Rather, a full sensitivity analysis (sens.dynaTree) is needed. Usually it is best to first do model="constant" and then use relevance.dynaTree. Bayes factors (getBF) can be used to back up any variable selections implied by the relevance. Then, if desired, one can re-fit on the new (possibly reduced) set of predictors with model="linear".
There are no caveats with model="class".
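The recommended workflow (constant fit, relevance, then an optional linear re-fit) might look roughly like the sketch below. It assumes the dynaTree package is installed, that relevance() is the generic dispatching to relevance.dynaTree, and that deleteclouds() frees the particle clouds; the simulated data, the particle count N = 100, and the names fit, keep and refit are hypothetical choices made for illustration. The Bayes factor check with getBF is left out here.

```r
if (requireNamespace("dynaTree", quietly = TRUE)) {
  library(dynaTree)

  # Simulated data (illustration only): x3 has no effect on the response
  set.seed(1)
  n <- 200
  X <- matrix(runif(n * 3), ncol = 3)
  y <- sin(2 * pi * X[, 1]) + 2 * X[, 2] + rnorm(n, sd = 0.1)

  # Step 1: fit with constant leaves, then gather relevance statistics
  fit <- dynaTree(X, y, N = 100, model = "constant", verb = 0)
  fit <- relevance(fit)

  # Step 2: keep inputs whose median relevance is positive
  keep <- apply(fit$relevance, 2, median) > 0

  # Step 3 (optional): re-fit with linear leaves on the retained inputs
  if (any(keep)) {
    refit <- dynaTree(X[, keep, drop = FALSE], y, N = 100,
                      model = "linear", verb = 0)
  }

  # Free the memory held by the particle clouds
  deleteclouds()
}
```

A small N keeps the sketch quick to run; a larger number of particles would normally be used for a real analysis.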
Gramacy, R.B., Taddy, M.A., and S. Wild (2011). “Variable Selection and Sensitivity Analysis via Dynamic Trees with an Application to Computer Code Performance Tuning.” arXiv:1108.4739
dynaTree, sens.dynaTree, predict.dynaTree, varpropuse, varproptotal
## see the examples in sens.dynaTree for the relevances;
## Also see varpropuse and the class2d demo via
## demo("class2d")