quanteda.textmodels (version 0.9.9)

textmodel_affinity: Class affinity maximum likelihood text scaling model

Description

textmodel_affinity() implements the maximum likelihood supervised text scaling method described in Perry and Benoit (2017).

Usage

textmodel_affinity(
  x,
  y,
  exclude = NULL,
  smooth = 0.5,
  ref_smooth = 0.5,
  verbose = quanteda_options("verbose")
)

Value

A textmodel_affinity class list object, with elements:

  • smooth a numeric vector of length two for the smoothing parameters smooth and ref_smooth

  • x the input model matrix x

  • y the vector of class training labels y

  • p a feature × class sparse matrix of estimated class affinities

  • support logical vector indicating whether a feature was included in computing class affinities

  • call the model call

Arguments

x

the dfm or bootstrap_dfm object on which the model will be fit. Does not need to contain only the training documents, since the index of these will be matched automatically.

y

vector of training classes/scores associated with each document identified in x

exclude

a set of words to exclude from the model

smooth

a smoothing parameter for class affinities; defaults to 0.5 (Jeffreys prior). A plausible alternative would be 1.0 (Laplace prior).

ref_smooth

a smoothing parameter for token distributions; defaults to 0.5

verbose

logical; if TRUE, print diagnostic information during fitting.
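
The smoothing arguments above default to 0.5 (Jeffreys prior), with 1.0 (Laplace prior) named as an alternative. As a rough illustration of what such a pseudo-count does in general, the base R sketch below applies additive smoothing to a small vector of token counts. The counts and the helper smooth_probs() are hypothetical and are not part of quanteda.textmodels; how the package applies smooth and ref_smooth internally is not shown here.

```r
# Hypothetical token counts for three features in one reference class.
counts <- c(economy = 8, welfare = 2, defence = 0)

# Additive smoothing: add the same pseudo-count to every feature,
# then normalise so the probabilities sum to one.
# Illustrative helper only; not a function from quanteda.textmodels.
smooth_probs <- function(counts, smooth) {
  (counts + smooth) / sum(counts + smooth)
}

p_jeffreys <- smooth_probs(counts, 0.5)  # pseudo-count of 0.5
p_laplace  <- smooth_probs(counts, 1.0)  # pseudo-count of 1.0

# Compare the two sets of smoothed probabilities side by side.
round(rbind(jeffreys = p_jeffreys, laplace = p_laplace), 3)
```

In this sketch the larger pseudo-count moves more probability mass to the unseen feature (defence) and pulls the estimates closer to uniform, which is the usual trade-off when choosing between the two values.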

Author

Patrick Perry and Kenneth Benoit

References

Perry, P.O. & Benoit, K.R. (2017). Scaling Text with the Class Affinity Model. https://doi.org/10.48550/arXiv.1710.08963

See Also

predict.textmodel_affinity() for methods of applying a fitted textmodel_affinity() model object to predict quantities from (other) documents.

Examples

(af <- textmodel_affinity(quanteda::data_dfm_lbgexample, y = c("L", NA, NA, NA, "R", NA)))
predict(af)
predict(af, newdata = quanteda::data_dfm_lbgexample[6, ])

if (FALSE) {
# compute bootstrapped SEs
dfmat <- quanteda::bootstrap_dfm(data_corpus_dailnoconf1991, n = 10, remove_punct = TRUE)
textmodel_affinity(dfmat, y = c("Govt", "Opp", "Opp", rep(NA, 55)))
}
