
MLPUGS (version 0.2.0)

predict.ECC: Classify new samples using an Ensemble of Classifier Chains

Description

Uses a trained ECC and Gibbs sampling to predict labels for new samples. The prediction function .f must return a matrix of probabilities, one row for each observation in newdata.

Usage

## S3 method for class 'ECC'
predict(object, newdata, n.iters = 300, burn.in = 100, thin = 2, run_parallel = FALSE, silent = TRUE, .f = NULL, ...)

Arguments

object
An object of type ECC returned by ecc().
newdata
A data frame or matrix of features. Must have the same form as the one used with ecc().
n.iters
Number of iterations of the Gibbs sampler.
burn.in
Number of iterations for adaptation (burn-in).
thin
Thinning interval.
run_parallel
Logical flag for utilizing multicore capabilities of the system.
silent
Logical flag controlling whether progress is reported, either with a progress bar (if the 'progress' package is installed) or with progress messages printed to the console.
.f
User-supplied prediction function that corresponds to the type of classifier that was trained in the ecc() step. See Details.
...
Additional arguments to pass to .f.
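
As a rough illustration of how n.iters, burn.in, and thin interact, the sketch below counts the sampler iterations that would be retained. The helper kept_iterations() is hypothetical and not part of MLPUGS, and the indexing it uses (discard the first burn.in iterations, then keep every thin-th one) is an assumption; the package's internal bookkeeping may differ.

```r
# Hypothetical helper, not part of MLPUGS: which Gibbs sampler iterations
# survive burn-in and thinning under the indexing assumed here.
kept_iterations <- function(n.iters = 300, burn.in = 100, thin = 2) {
  stopifnot(burn.in < n.iters, thin >= 1)
  # Assumption: drop iterations 1..burn.in, then keep every thin-th one.
  seq(from = burn.in + 1, to = n.iters, by = thin)
}

kept <- kept_iterations()        # uses the defaults shown in Usage
length(kept)                     # number of retained draws per model
head(kept)                       # first few retained iteration indices
```

Under this assumption, larger values of thin or burn.in reduce the number of retained draws for a fixed n.iters.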

Value

An object of class PUGS containing:
  • y_labels : inherited from object
  • preds : A burnt-in, thinned multi-dimensional array of predictions.
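
One way to work with an array of sampled predictions is to average over the retained draws. The sketch below uses a toy array rather than real output; the layout (observations x labels x draws) and the label names are assumptions made for illustration, so check dim() on the actual preds before applying the same idea.

```r
# Toy stand-in for `preds`. The (observations x labels x draws) layout
# and the label names are assumptions, not taken from the package.
set.seed(42)
toy_preds <- array(
  rbinom(4 * 3 * 20, size = 1, prob = 0.3),
  dim = c(4, 3, 20),
  dimnames = list(NULL, c("label_a", "label_b", "label_c"), NULL)
)

# For each observation and label, the share of draws predicted as 1.
label_freq <- apply(toy_preds, c(1, 2), mean)
dim(label_freq)
```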

Details

Getting the prediction function right is essential. Because this package is a wrapper that can use any classification algorithm as its base classifier, it makes certain assumptions about that function's output. Specifically, the prediction function is assumed to return a data.frame or matrix of probabilities with two columns, "0" and "1", because ecc() trains on a factor of "0"s and "1"s for consistency across classifiers.
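
The two-column shape described above can be sketched with base R alone. The wrapper predict_two_col() and the toy data are hypothetical and not part of MLPUGS; the example fits an ordinary glm() on a factor of "0"s and "1"s and only illustrates the output format, not how ecc() calls the function internally.

```r
# Toy data: one binary label stored as a factor of "0"s and "1"s.
set.seed(1)
toy <- data.frame(x1 = rnorm(50), x2 = rnorm(50))
toy$y <- factor(rbinom(50, size = 1, prob = plogis(toy$x1)),
                levels = c("0", "1"))

# With a factor response, glm() models the probability of the second level ("1").
fit <- glm(y ~ x1 + x2, data = toy, family = binomial(link = "logit"))

# Hypothetical wrapper (not part of MLPUGS): returns a probability matrix
# with one row per observation and columns named "0" and "1".
predict_two_col <- function(model, newdata, ...) {
  p1 <- predict(model, newdata = as.data.frame(newdata),
                type = "response", ...)
  cbind("0" = 1 - p1, "1" = p1)
}

probs <- predict_two_col(fit, toy[1:5, c("x1", "x2")])
colnames(probs)
```

A function of this shape takes the fitted model first and the new data second, which matches the argument order used by the prediction functions in the Examples.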

Examples

x <- movies_train[, -(1:3)]
y <- movies_train[, 1:3]

model_glm <- ecc(x, y, m = 1, .f = glm.fit, family = binomial(link = "logit"))

predictions_glm <- predict(model_glm, movies_test[, -(1:3)],
  .f = function(glm_fit, newdata) {

    # Credit for writing the prediction function that works
    # with objects created through glm.fit goes to Thomas Lumley

    # Linear predictor, then inverse link to get P(label = "1")
    eta <- as.matrix(newdata) %*% glm_fit$coefficients
    output <- glm_fit$family$linkinv(eta)
    colnames(output) <- "1"
    return(output)

  }, n.iters = 10, burn.in = 0, thin = 1)

## Not run: 
# 
# model_c50 <- ecc(x, y, .f = C50::C5.0)
# predictions_c50 <- predict(model_c50, movies_test[, -(1:3)],
#                            n.iters = 10, burn.in = 0, thin = 1,
#                            .f = C50::predict.C5.0, type = "prob")
#   
# model_rf <- ecc(x, y, .f = randomForest::randomForest)
# predictions_rf <- predict(model_rf, movies_test[, -(1:3)],
#                           n.iters = 1000, burn.in = 100, thin = 10,
#                           .f = function(rF, newdata) {
#                             randomForest:::predict.randomForest(rF, newdata, type = "prob")
#                           })
# ## End(Not run)
