
SparkR (version 2.1.2)

spark.naiveBayes: Naive Bayes Models

Description

spark.naiveBayes fits a Bernoulli naive Bayes model against a SparkDataFrame. Users can call summary to print a summary of the fitted model, predict to make predictions on new data, and write.ml/read.ml to save/load fitted models. Only categorical data is supported.

Usage

spark.naiveBayes(data, formula, ...)

# S4 method for NaiveBayesModel
predict(object, newData)

# S4 method for NaiveBayesModel
summary(object)

# S4 method for SparkDataFrame,formula
spark.naiveBayes(data, formula, smoothing = 1)

# S4 method for NaiveBayesModel,character
write.ml(object, path, overwrite = FALSE)

Arguments

data

a SparkDataFrame of observations and labels for model fitting.

formula

a symbolic description of the model to be fitted. Currently only a few formula operators are supported, including '~', '.', ':', '+', and '-'.

...

additional argument(s) passed to the method. Currently only smoothing.

object

a naive Bayes model fitted by spark.naiveBayes.

newData

a SparkDataFrame for testing.

smoothing

additive (Laplace) smoothing parameter. Default is 1.

path

the directory where the model is saved.

overwrite

whether to overwrite the output path if it already exists. Default is FALSE, which means an exception is thrown if the output path exists.
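
As a rough illustration of the formula operators listed above, the following base R sketch shows which predictor terms a few formulas expand to. It uses base R's terms() only as an analogy; SparkR parses formulas with its own implementation, so treat the results as indicative rather than as SparkR's exact behavior. The helper name predictors is hypothetical.

```r
# Local copy of the dataset used in the Examples section.
d <- as.data.frame(UCBAdmissions)  # columns: Admit, Gender, Dept, Freq

# Hypothetical helper: list the predictor terms a formula expands to in base R.
predictors <- function(f) attr(terms(f, data = d), "term.labels")

# '+' adds terms one by one.
main_effects <- predictors(Admit ~ Gender + Dept)

# '.' stands for every column except the response; '-' removes a term.
all_but_freq <- predictors(Admit ~ . - Freq)

# ':' adds an interaction between two terms.
with_interaction <- predictors(Admit ~ Gender + Dept + Gender:Dept)

print(main_effects)
print(all_but_freq)
print(with_interaction)
```

In this sketch, Admit ~ . - Freq and Admit ~ Gender + Dept select the same predictors, which can be convenient when a data frame has many columns.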

Value

predict returns a SparkDataFrame containing predicted labels in a column named "prediction".

summary returns a list of summary information about the fitted model. The list includes apriori (the label distribution) and tables (conditional probabilities given the target label).

spark.naiveBayes returns a fitted naive Bayes model.
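
To make the ideas behind apriori, tables, and the smoothing argument more concrete, here is a minimal base R sketch that computes a label distribution and per-feature conditional probabilities with generic additive smoothing. It is not SparkR's internal code, and the layout of the real summary output may differ. The helper cond_table is hypothetical, and each row is assumed to count as one observation.

```r
# Illustrative only: generic naive Bayes bookkeeping in base R.
d <- as.data.frame(UCBAdmissions)

# Hypothetical helper: P(feature level | label) with additive smoothing.
cond_table <- function(feature, label, smoothing = 1) {
  counts <- table(label, feature)  # rows: labels, columns: feature levels
  (counts + smoothing) / (rowSums(counts) + smoothing * ncol(counts))
}

# Label distribution, treating each row as one observation (Freq is ignored).
apriori <- prop.table(table(d$Admit))

# Conditional probabilities of each feature level given the label.
tables <- list(
  Gender = cond_table(d$Gender, d$Admit, smoothing = 0),
  Dept   = cond_table(d$Dept, d$Admit, smoothing = 0)
)

# Toy data where level "b" never occurs with label "no":
# smoothing = 1 moves that estimate from 0 to 1/3.
toy_label   <- factor(c("yes", "yes", "yes", "no"))
toy_feature <- factor(c("a", "a", "b", "a"), levels = c("a", "b"))
toy <- cond_table(toy_feature, toy_label, smoothing = 1)

print(apriori)
print(tables$Gender)
print(toy)
```

With smoothing = 0, a feature level that never occurs with a label gets probability zero, which then zeroes out the whole product for that label; a positive smoothing value avoids this.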

See Also

e1071: https://cran.r-project.org/package=e1071

write.ml

Examples

# NOT RUN {
data <- as.data.frame(UCBAdmissions)
df <- createDataFrame(data)

# fit a Bernoulli naive Bayes model
model <- spark.naiveBayes(df, Admit ~ Gender + Dept, smoothing = 0)

# get the summary of the model
summary(model)

# make predictions
predictions <- predict(model, df)

# save and load the model
path <- "path/to/model"
write.ml(model, path)
savedModel <- read.ml(path)
summary(savedModel)
# }
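
Note that as.data.frame(UCBAdmissions) yields one row per Admit/Gender/Dept combination plus a Freq count, and the formula in the example does not use Freq. If the counts should influence the fit, one possible approach (an assumption, not something this page prescribes) is to expand the table in base R to one row per applicant before calling createDataFrame. The name expanded is hypothetical.

```r
# Frequency table as a data frame: 24 rows with a Freq column.
d <- as.data.frame(UCBAdmissions)

# Repeat each row Freq times and drop the Freq column afterwards.
expanded <- d[rep(seq_len(nrow(d)), times = d$Freq), c("Admit", "Gender", "Dept")]
rownames(expanded) <- NULL

# The label distribution now reflects the applicant counts.
print(prop.table(table(expanded$Admit)))
```

The expanded data frame could then be passed to createDataFrame in place of data in the example above.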
