
fastNaiveBayes (version 1.1.2)

fastNaiveBayes.multinomial: Fast Naive Bayes Classifier with a Multinomial event model

Description

Extremely fast implementation of a Naive Bayes Classifier. This instance only uses the Multinomial event model for all columns.

Usage

fastNaiveBayes.multinomial(x, y, laplace = 0, sparse = FALSE, ...)

# S3 method for default
fastNaiveBayes.multinomial(x, y, laplace = 0, sparse = FALSE, ...)

Arguments

x

a numeric matrix of frequency counts. A sparse dgCMatrix is also accepted.

y

a factor of classes

laplace

A number used for Laplace smoothing. Default is 0

sparse

Use a sparse matrix? If TRUE, a sparse matrix will be constructed from x, which can give up to a 40% speed-up. It is also possible to pass a sparse dgCMatrix directly as x, which sets this parameter to TRUE.

...

Not used.

Value

A fitted object of class "fastNaiveBayes.multinomial". It has the following components:

probability_table

Posterior probabilities

priors

calculated prior probabilities for each class

names

names of features used to train this fastNaiveBayes

Details

A Naive Bayes classifier that assumes the feature variables are independent of each other given the class. The multinomial event model should be used when each feature is a frequency count, for example the number of times a word occurs in each document.
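As an illustration of the multinomial event model and of what the laplace argument does, the following base R sketch fits the textbook version of the model by hand. It is not the package's internal code: the toy data, the variable names (counts, prob_table, scores) and the exact smoothing formula are assumptions chosen for illustration.

```r
# Toy term-count matrix (made-up data): rows are documents, columns are words.
x <- matrix(c(2, 0, 1,
              3, 1, 0,
              0, 2, 4,
              0, 3, 2),
            nrow = 4, byrow = TRUE,
            dimnames = list(NULL, c("ball", "vote", "tax")))
y <- factor(c("sport", "sport", "politics", "politics"))
laplace <- 1

# Class priors: relative frequency of each class in y.
priors <- as.vector(table(y)) / length(y)
names(priors) <- levels(y)

# Per-class feature totals: sum the rows of x that belong to each class.
counts <- rowsum(x, group = y)  # classes in rows, features in columns

# Smoothed estimate of P(feature | class); each row sums to 1.
# Assumed textbook formula, not taken from the package source.
prob_table <- (counts + laplace) / (rowSums(counts) + laplace * ncol(x))

# Score a new document: log prior + sum over features of count * log P(feature | class).
new_doc <- c(ball = 1, vote = 0, tax = 0)
scores <- log(priors[rownames(prob_table)]) +
  as.vector(log(prob_table) %*% new_doc)
predicted <- names(which.max(scores))
predicted
```

With laplace = 0, any feature that never occurs in a class gets probability 0 in this sketch, so a single occurrence of it in a new document drives that class's score to -Inf. A positive laplace value avoids this.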

By setting sparse = TRUE the numeric matrix x will be converted to a sparse dgCMatrix. This can be considerably faster when most entries of x are 0.

It is also possible to supply a sparse dgCMatrix directly, which can be a lot faster when a fastNaiveBayes model is trained multiple times on the same matrix or on subsets of it, because the conversion is then done only once. See the examples for more details. Bear in mind that, depending on the data, converting to a sparse matrix can actually be slower than working with the dense matrix.
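The one-time conversion described above can be sketched with the Matrix package, which provides the dgCMatrix class. The data below are made up, and the variable names (x, sx) are only illustrative.

```r
library(Matrix)  # provides the dgCMatrix class

# Made-up count matrix in which the vast majority of entries are zero.
set.seed(1)
x <- matrix(0, nrow = 1000, ncol = 50)
idx <- sample(length(x), 500)
x[idx] <- rpois(500, lambda = 2) + 1
colnames(x) <- paste0("w", seq_len(ncol(x)))

# Convert once; the result uses compressed sparse column storage.
sx <- Matrix(x, sparse = TRUE)
class(sx)

# Compare the memory footprint of the dense and sparse representations.
object.size(x)
object.size(sx)
```

According to the argument description, a matrix such as sx can then be passed as x, so repeated fits on the same data do not pay the conversion cost each time. Whether this is faster in practice depends on how sparse the data are.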

See Also

predict.fastNaiveBayes.multinomial for the predict function for the fastNaiveBayes.multinomial class, fastNaiveBayes.mixed for the general fastNaiveBayes model, fastNaiveBayes.bernoulli for a Bernoulli-only model, and fastNaiveBayes.gaussian for a Gaussian-only model.

Examples

# NOT RUN {
rm(list = ls())
library(fastNaiveBayes)
cars <- mtcars
y <- as.factor(ifelse(cars$mpg > 25, "High", "Low"))
x <- cars[, 2:ncol(cars)]

dist <- fastNaiveBayes::fastNaiveBayes.detect_distribution(x, nrows = nrow(x))

# Multinomial only
vars <- c(dist$bernoulli, dist$multinomial)
newx <- x[, vars]

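# Fit the model with Laplace smoothing, then predict on the training data
mod <- fastNaiveBayes.multinomial(newx, y, laplace = 1)
pred <- predict(mod, newdata = newx)

# Misclassification rate on the training data
mean(pred != y)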
# }
