ReBaggRegress: RE(sampled) BAG(ging), an ensemble method for dealing with imbalanced regression problems.

Description

This function handles imbalanced regression problems by learning a special-purpose bagging ensemble. A user-specified number of weak learners is trained on resamples of the training data provided. The resamples are built taking the imbalance of the problem into account. Currently, four methods are available for building the resamples.
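The idea described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the toy data, the quantile rule standing in for the relevance threshold, and the use of lm as weak learner are all hypothetical choices made for this example.

```r
set.seed(1)

# Toy imbalanced regression data: most targets are small, a few are large.
n <- 200
x <- runif(n)
y <- 2 * x + rexp(n, rate = 2)
train <- data.frame(x = x, y = y)

# Hypothetical stand-in for a relevance threshold:
# the top 10% of the target values are treated as rare.
is_rare <- train$y > quantile(train$y, 0.9)
rare_idx <- which(is_rare)
normal_idx <- which(!is_rare)

# Train nmodels weak learners, each on all rare cases plus an
# equally sized random draw of normal cases.
nmodels <- 10
models <- lapply(seq_len(nmodels), function(i) {
  idx <- c(rare_idx, sample(normal_idx, length(rare_idx)))
  lm(y ~ x, data = train[idx, ])
})

# Aggregate by averaging the predictions of the individual models.
newdata <- data.frame(x = c(0.1, 0.5, 0.9))
pred_matrix <- sapply(models, predict, newdata = newdata)
ensemble_pred <- rowMeans(pred_matrix)
```

Each column of pred_matrix holds the predictions of one model, so rowMeans gives one averaged prediction per new case.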

Usage

ReBaggRegress(form, train, rel="auto", thr.rel, learner, learner.pars,
       nmodels, samp.method = "variationSMT", aggregation="Average", quiet=TRUE)

Value

The function returns an object of class BagModel.

Arguments

form: A formula describing the prediction problem.
train: A data frame containing the training (imbalanced) data set.
rel: The relevance function, which is either determined automatically ("auto", the default) or provided by the user as a matrix.
thr.rel: A number indicating the relevance threshold above which a case is considered as belonging to the rare "class".
learner: The learning algorithm to be used as weak learner.
learner.pars: A named list with the learner parameters.
nmodels: A number indicating how many models to train.
samp.method: A character string specifying the method used for building the resamples of the training set provided. Possible values are: "balance", "variation", "balanceSMT", "variationSMT". The "balance" method builds a number (nmodels) of samples that use all the rare cases and the same number of normal cases. The "variation" method builds a number of bags with all the rare cases and varying percentages of normal cases. The SMT suffix indicates that the SmoteR strategy is used to generate the new examples. Defaults to "variationSMT".
aggregation: A character string specifying the method used for aggregating the results obtained by the individual learners. For now, the only method available is averaging the models' predictions.
quiet: A logical specifying whether progress should be shown or not. Defaults to TRUE.
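The difference between the "balance" and "variation" resampling schemes described for samp.method can be illustrated with a minimal base R sketch. The helper names balance_resample and variation_resample, the index sets, and the fractions in fracs are hypothetical and chosen only for this example; the percentages used by the package are not stated here, and the SMT variants (which generate synthetic cases with SmoteR) are not covered.

```r
set.seed(42)

# Hypothetical index sets: 10 rare cases and 100 normal cases.
rare_idx <- 1:10
normal_idx <- 11:110

# "balance"-style resample: all rare cases plus the same number of normal cases.
balance_resample <- function(rare_idx, normal_idx) {
  c(rare_idx, sample(normal_idx, length(rare_idx)))
}

# "variation"-style resample: all rare cases plus a given fraction
# of the normal cases (the fraction is an assumed input).
variation_resample <- function(rare_idx, normal_idx, frac) {
  n_normal <- max(1, round(frac * length(normal_idx)))
  c(rare_idx, sample(normal_idx, n_normal))
}

nmodels <- 4
fracs <- c(0.1, 0.25, 0.5, 0.75)  # assumed values, one per bag

balance_bags <- lapply(seq_len(nmodels), function(i) {
  balance_resample(rare_idx, normal_idx)
})
variation_bags <- lapply(fracs, function(f) {
  variation_resample(rare_idx, normal_idx, f)
})

# Bag sizes: constant for "balance", growing with the fraction for "variation".
balance_sizes <- lengths(balance_bags)
variation_sizes <- lengths(variation_bags)
```

In both schemes every bag contains all the rare cases; only the number of normal cases drawn differs from bag to bag.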

Author

Paula Branco paobranco@gmail.com, Rita Ribeiro rpribeiro@dcc.fc.up.pt and Luis Torgo ltorgo@dcc.fc.up.pt

References

Branco, P., Torgo, L. and Ribeiro, R.P. (2018). REBAGG: REsampled BAGGing for Imbalanced Regression. LIDTA2018: 2nd International Workshop on Learning with Imbalanced Domains: Theory and Applications (co-located with ECML/PKDD 2018), Dublin, Ireland.