predict.maxent: predicts the expected label of a document given a trained model.

Description

Predicts the expected labels and probability scores for a matrix of documents, given a trained model of class maxent-class generated by the maxent function.

Usage

predict(object, feature_matrix, ...)

Arguments

object

An object of class maxent-class, as returned by the maxent function.

feature_matrix

One of the following: a DocumentTermMatrix or TermDocumentMatrix (package tm), a Matrix (package Matrix), a matrix.csr (package SparseM, for example as generated by as.compressed.matrix), a data.frame, or a regular matrix.

...

Not used, but needed for compatibility with the generic predict method.

Value

Returns a matrix with the first column containing predicted labels, and the remaining columns containing probability scores for each unique label.
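
The layout described above can be unpacked with base R alone. The following is a minimal sketch that uses a hand-built stand-in for the returned matrix rather than real output: the column names, labels, and scores are made up for illustration. It assumes the matrix is of character type, since an R matrix holds a single type and label strings in the first column would force the scores to be stored as strings too.

```r
# Hypothetical stand-in for the matrix returned by predict(); the column
# names and values are invented for illustration only.
results <- cbind(
  labels   = c("sports", "politics", "sports"),
  sports   = c("0.91", "0.20", "0.65"),
  politics = c("0.09", "0.80", "0.35")
)

# First column: the predicted label for each document.
predicted <- results[, 1]

# Remaining columns: one probability score per unique label.
# Convert from character to numeric while keeping the column names.
score_cols <- results[, -1, drop = FALSE]
probs <- matrix(
  as.numeric(score_cols),
  nrow = nrow(score_cols),
  dimnames = list(NULL, colnames(score_cols))
)

# Sanity check: the scores in each row should sum to roughly 1.
row_totals <- rowSums(probs)

# The highest-scoring label per row, for comparison with the first column.
best <- colnames(probs)[max.col(probs, ties.method = "first")]
```

Keeping `drop = FALSE` when selecting the score columns preserves the matrix shape even when only a single document is predicted.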

References

Y. Tsuruoka. "A simple C++ library for maximum entropy classification." University of Tokyo Department of Computer Science (Tsujii Laboratory), 2011. URL http://www-tsujii.is.s.u-tokyo.ac.jp/~tsuruoka/maxent/.

Examples


# LOAD LIBRARY
library(maxent)

# READ THE DATA, PREPARE THE CORPUS, AND CREATE THE DOCUMENT-TERM MATRIX
# (named dtm rather than matrix to avoid masking base::matrix)
data <- read.csv(system.file("data/NYTimes.csv.gz", package = "maxent"))
corpus <- Corpus(VectorSource(data$Title[1:150]))
dtm <- DocumentTermMatrix(corpus)

# TRAIN ON THE FIRST 100 DOCUMENTS AND PREDICT THE REMAINING 50
# USING THE SPARSEM REPRESENTATION
sparse <- as.compressed.matrix(dtm)
model <- maxent(sparse[1:100, ], as.factor(data$Topic.Code)[1:100])
results <- predict(model, sparse[101:150, ])
