predict.fastNaiveBayes.multinomial: Predict Method for fastNaiveBayes.multinomial fits

Description

Uses a fitted fastNaiveBayes.multinomial model and a new data set to produce classifications. The result is either the raw probabilities generated by the model or the predicted classes themselves.

Usage

# S3 method for fastNaiveBayes.multinomial
predict(object, newdata,
  type = c("class", "raw", "rawprob"), sparse = FALSE,
  threshold = .Machine$double.eps, ...)

Arguments

object

A fitted object of class "fastNaiveBayes.multinomial".

newdata

A numeric matrix with 1's and 0's to indicate the presence or absence of features. A sparse dgCMatrix is also accepted. Features in newdata that were not encountered in the training data are omitted from the prediction. newdata may also contain fewer features than the training data; in that case, it is padded with extra columns filled with 0's.
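
The column handling described above can be pictured with a short base R sketch. Note that align_features is a hypothetical helper written only for illustration; it is not part of fastNaiveBayes and does not claim to mirror the package's internal code. It assumes both the training features and newdata are identified by column names.

```r
# Hypothetical helper, not part of fastNaiveBayes: align the columns of
# newdata with the feature names seen during training.
align_features <- function(newdata, train_features) {
  # Keep only the features that were seen during training.
  known <- intersect(train_features, colnames(newdata))
  aligned <- newdata[, known, drop = FALSE]

  # Pad features that are absent from newdata with columns of zeros.
  absent <- setdiff(train_features, colnames(newdata))
  if (length(absent) > 0) {
    pad <- matrix(0, nrow = nrow(newdata), ncol = length(absent),
                  dimnames = list(rownames(newdata), absent))
    aligned <- cbind(aligned, pad)
  }

  # Return the columns in the training order.
  aligned[, train_features, drop = FALSE]
}

train_features <- c("a", "b", "c")
newdata <- matrix(c(1, 0, 0, 1, 1, 1), nrow = 2,
                  dimnames = list(NULL, c("a", "c", "d")))
aligned <- align_features(newdata, train_features)
```

Here the unseen column "d" is dropped and the missing column "b" is added as zeros, so aligned has exactly the columns "a", "b" and "c".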

type

If "raw", the conditional a posteriori probabilities for each class are returned; otherwise, the class with the maximal probability is returned.

sparse

Use a sparse matrix? If TRUE, a sparse matrix is constructed from newdata, which can give up to a 40% speed-up. A sparse dgCMatrix can also be passed directly as newdata, in which case this parameter is set to TRUE.

threshold

A threshold for the minimum probability. For Bernoulli and multinomial event models, Laplace smoothing already prevents zero probabilities; for Gaussian event models, the threshold keeps the probabilities numerically valid.

...

Not used.

Value

If type = 'class', a factor with classified class levels. If type = 'raw', a matrix with the predicted probabilities of each class, where each column in the matrix corresponds to a class level.

Details

In the extremely unlikely case that two classes have the exact same estimated probability, the first encountered class is used as the classification and a warning is issued.
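
As a rough illustration of the tie rule, the base R sketch below uses which.max(), which returns the index of the first maximum. The probability matrix is made up, and the assumption that "first encountered" corresponds to the first column is ours; this is not the package's own code.

```r
# Hypothetical posterior matrix: one row per observation, one column per class.
probs <- matrix(c(0.5, 0.5,
                  0.2, 0.8),
                nrow = 2, byrow = TRUE,
                dimnames = list(NULL, c("High", "Low")))

# which.max() returns the index of the first maximum, so a tie goes to the
# first column.
winner <- apply(probs, 1, which.max)
pred <- factor(colnames(probs)[winner], levels = colnames(probs))

# Flag rows in which more than one class attains the maximum.
tied <- apply(probs, 1, function(p) sum(p == max(p)) > 1)
if (any(tied)) message("Tied rows: ", paste(which(tied), collapse = ", "))
```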

Passing a sparse matrix directly is especially useful when predict must be called multiple times on the same matrix or on different subselections of the same initial matrix. See the examples for further details.
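
A minimal sketch of this workflow, assuming the Matrix package is installed: the matrix is converted once and the subselections are taken from the sparse object. The data are made up, and the predict calls are left as comments because no model is fitted in this snippet.

```r
library(Matrix)

# Hypothetical feature matrix standing in for newdata.
dense <- matrix(c(0, 2, 0, 0,
                  1, 0, 0, 3,
                  0, 0, 0, 1),
                nrow = 3, byrow = TRUE,
                dimnames = list(NULL, c("f1", "f2", "f3", "f4")))

# Convert once; for this non-square numeric matrix the result is a dgCMatrix.
sp <- Matrix(dense, sparse = TRUE)

# Subselections stay sparse, so no further conversion is needed.
first_two <- sp[1:2, , drop = FALSE]
last_row  <- sp[3, , drop = FALSE]

# With a fitted model `mod` (not created here), each piece could be passed on:
# predict(mod, newdata = first_two)
# predict(mod, newdata = last_row)
```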

Examples

# NOT RUN {
rm(list = ls())
library(fastNaiveBayes)

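cars <- mtcars
# Binary target: "High" if mpg > 25, otherwise "Low"
y <- as.factor(ifelse(cars$mpg > 25, "High", "Low"))
# Use every column except mpg (column 1) as a feature
x <- cars[, 2:ncol(cars)]

# Detect which event model suits each column
dist <- fastNaiveBayes::fastNaiveBayes.detect_distribution(x, nrows = nrow(x))

# Multinomial only
vars <- c(dist$bernoulli, dist$multinomial)
newx <- x[, vars]
mod <- fastNaiveBayes.multinomial(newx, y, laplace = 1)
pred <- predict(mod, newdata = newx)
# Misclassification rate on the training data
mean(pred != y)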
# }