relaimpo calculates several relative importance metrics for the linear model.
The recommended metrics are lmg
(\(R^2\) partitioned by averaging over orders, like in Lindemann, Merenda and Gold (1980, p.119ff))
and pmvd
(a newly proposed metric by Feldman (2005), non-US version only).
For completeness, several other metrics are also on offer. Other packages with related topics: hier.part
, relimp
.
This package uses as an internal function the function nchoosek
from vsn, authored by Wolfgang Huber, available under LGPL.
Furthermore, it uses a modified version of the function carscore from care by Verena Zuber and Korbinian Strimmer.
Ulrike Groemping, BHT Berlin
lmg
and pmvd
are computer-intensive. Although they are calculated based on the
covariance matrix, which saves substantial computing time in comparison to carrying out actual regressions,
these methods still take quite long for problems with many regressors. Obviously,
this is particularly relevant in combination with bootstrapping.
relaimpo calculates the metrics and also offers the possibility of bootstrapping them and of displaying results in print and graphically.
It is possible to designate a subset of variables as adjustment variables that always stay in the model so that relative importance is only assessed among the remaining variables.
Models can have up to 2-way interactions that are treated hierarchically - i.e. an interaction is only allowed in a model that also contains all its main effects.
In case of interactions, only metric lmg
can be used.
Observations with missing values are by default excluded from the analysis for most functions.
The function mianalyze.relimp
allows to draw conclusions from a set of multiply imputed data sets.
This function is currently more restrictive than the rest of the package in terms of data types and models
that can be used (when summarizing the multiply imputed samples without calculating confidence intervals,
all possibilities available elsewhere are also applicable in mianalyze.relimp
).
relaimpo does accomodate complex survey designs by making use of the facilities in package survey. Currently, interactions and calculated variables cannot be combined with using a complex survey design in bootstrapping functions.
Chevan, A. and Sutherland, M. (1991) Hierarchical Partitioning. The American Statistician 45, 90--96.
Darlington, R.B. (1968) Multiple regression in psychological research and practice. Psychological Bulletin 69, 161--182.
Feldman, B. (2005) Relative Importance and Value. Manuscript (Version 1.1, March 19 2005), downloadable at https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2255827
Genizi, A. (1993) Decomposition of R2 in multiple regression with correlated regressors. Statistica Sinica 3, 407--420. Downloadable at https://www3.stat.sinica.edu.tw/statistica/password.asp?vol=3&num=2&art=10
Groemping, U. (2006) Relative Importance for Linear Regression in R: The Package relaimpo Journal of Statistical Software 17, Issue 1. Downloadable at https://www.jstatsoft.org/v17/i01
Lindeman, R.H., Merenda, P.F. and Gold, R.Z. (1980) Introduction to Bivariate and Multivariate Analysis, Glenview IL: Scott, Foresman.
Zuber, V. and Strimmer, K. (2010) Variable importance and model selection by decorrelation. Preprint, downloadable at https://arxiv.org/abs/1007.5516
Go to https://prof.bht-berlin.de/groemping/relaimpo/ for further information and references.
calc.relimp
, booteval.relimp
, mianalyze.relimp
,
classesmethods.relaimpo
, package hier.part, package survey