quanteda.textmodels (version 0.9.9)

textmodel_svm: Linear SVM classifier for texts

Description

Fit a fast linear SVM classifier for texts, using the LiblineaR package.

Usage

textmodel_svm(
  x,
  y,
  weight = c("uniform", "docfreq", "termfreq"),
  type = 1,
  ...
)

Value

an object of class textmodel_svm, a list containing:

  • x, y, weights, type: argument values from the call parameters

  • algorithm: character label of the algorithm used in the call to LiblineaR::LiblineaR()

  • classnames: levels of y

  • bias: the value of Bias returned from LiblineaR::LiblineaR()

  • svmlinfitted: the fitted model object passed from the call to LiblineaR::LiblineaR()

  • call: the model call

Arguments

x

the dfm on which the model will be fit. Does not need to contain only the training documents.

y

vector of training labels associated with each document in x. (These will be converted to factors if not already factors.) In the examples below, documents that are not part of the training set are labelled NA.

weight

weights for different classes for imbalanced training sets, passed to wi in LiblineaR::LiblineaR(). "uniform" uses the default; "docfreq" weights by the number of training examples in each class; "termfreq" weights by the relative sizes of the training classes in terms of their total lengths in tokens.

type

argument passed to the type argument in LiblineaR::LiblineaR(); default is 1 for L2-regularized L2-loss support vector classification (dual)

...

additional arguments passed to LiblineaR::LiblineaR()
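
To make the "docfreq" and "termfreq" options of weight more concrete, the following base R sketch computes the two quantities they are described in terms of: the share of training documents per class and the share of total tokens per class. The count matrix, the labels, and the names w_docfreq and w_termfreq are hypothetical, and this is not the package's internal code; how textmodel_svm() turns such proportions into the wi values it passes to LiblineaR::LiblineaR() is not shown here.

```r
# Toy document-feature counts (hypothetical data): 4 training documents, 3 features
counts <- matrix(
    c(2, 0, 1,
      1, 1, 0,
      0, 3, 2,
      4, 1, 1),
    nrow = 4, byrow = TRUE,
    dimnames = list(paste0("doc", 1:4), c("tax", "budget", "jobs"))
)
labels <- factor(c("Gov", "Gov", "Gov", "Opp"))

# "docfreq"-style quantity: share of training documents in each class
w_docfreq <- prop.table(table(labels))

# "termfreq"-style quantity: share of all token counts that fall in each class
tokens_per_class <- tapply(rowSums(counts), labels, sum)
w_termfreq <- tokens_per_class / sum(tokens_per_class)

print(w_docfreq)
print(w_termfreq)
```

In this toy set the two schemes disagree: "Gov" holds three of the four documents but a smaller share of the tokens, because the single "Opp" document is the longest one.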

References

R. E. Fan, K. W. Chang, C. J. Hsieh, X. R. Wang, and C. J. Lin. (2008) LIBLINEAR: A Library for Large Linear Classification. Journal of Machine Learning Research 9: 1871-1874. https://www.csie.ntu.edu.tw/~cjlin/liblinear/.

See Also

LiblineaR::LiblineaR(), predict.textmodel_svm()

Examples

# use party leaders for govt and opposition classes
library("quanteda")
library("quanteda.textmodels")  # provides textmodel_svm() and data_corpus_irishbudget2010
docvars(data_corpus_irishbudget2010, "govtopp") <-
    c(rep(NA, 4), "Gov", "Opp", NA, "Opp", NA, NA, NA, NA, NA, NA)
dfmat <- dfm(tokens(data_corpus_irishbudget2010))
tmod <- textmodel_svm(dfmat, y = dfmat$govtopp)
predict(tmod)

# multiclass problem - all party leaders
tmod2 <- textmodel_svm(dfmat,
    y = c(rep(NA, 3), "SF", "FF", "FG", NA, "LAB", NA, NA, "Green", rep(NA, 3)))
predict(tmod2)