exclusivity: Exclusivity

Description

Calculate an exclusivity metric for an STM model.

Usage

exclusivity(model, M = 10, frexw = 0.7)

Arguments

model

the STM object

M

the number of top words to consider per topic

frexw

the FREX weight, i.e. the weight given to exclusivity relative to word frequency

Value

a numeric vector containing exclusivity for each topic

Details

In Roberts et al. (2014) we proposed using the semantic coherence metric of Mimno et al. (2011) (see semanticCoherence) to help with topic model selection. We found that semantic coherence alone is relatively easy to achieve by having only a few topics, all of which are dominated by the most common words. We therefore also proposed an exclusivity measure.

Our exclusivity measure also incorporates some information on word frequency. It is based on the FREX labeling metric (calcfrex), with the weight set by default to 0.7 in favor of exclusivity.
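
To make the idea concrete, here is a minimal sketch of a FREX-style exclusivity score computed on a toy topic-word matrix. It is not the package source: the helper name frex_exclusivity_sketch and the toy matrix beta are hypothetical, and the sketch assumes that FREX is the weighted harmonic mean of each word's within-topic rank in exclusivity and in frequency, summed over the M most probable words of each topic.

```r
# Hypothetical helper, not part of stm: a FREX-style exclusivity score for a
# topics-by-words probability matrix `beta` (each row sums to 1).
frex_exclusivity_sketch <- function(beta, M = 10, frexw = 0.7) {
  tbeta <- t(beta)                       # words x topics
  V <- nrow(tbeta)
  M <- min(M, V)                         # cannot use more words than exist

  # Exclusivity: share of each word's total probability held by each topic
  excl <- tbeta / rowSums(tbeta)

  # Rank-based empirical CDFs within each topic
  ex <- apply(excl, 2, rank) / V         # exclusivity ranks
  fr <- apply(tbeta, 2, rank) / V        # frequency ranks

  # Weighted harmonic mean of the two ranks (assumed FREX form)
  frex <- 1 / (frexw / ex + (1 - frexw) / fr)

  # Sum FREX over the M most probable words of each topic
  vapply(seq_len(ncol(tbeta)), function(k) {
    top <- order(tbeta[, k], decreasing = TRUE)[seq_len(M)]
    sum(frex[top, k])
  }, numeric(1))
}

# Toy example: 2 topics over a 4-word vocabulary
beta <- rbind(c(0.70, 0.10, 0.10, 0.10),
              c(0.10, 0.10, 0.10, 0.70))
scores <- frex_exclusivity_sketch(beta, M = 2)
print(scores)
```

Because each word's FREX value lies in (0, 1], the per-topic score in this sketch is bounded above by M, so scores are only comparable across models evaluated with the same M.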

This function is currently marked with the keyword internal because it does not have much error checking.

References

Mimno, D., Wallach, H. M., Talley, E., Leenders, M., & McCallum, A. (2011, July). "Optimizing semantic coherence in topic models." In Proceedings of the Conference on Empirical Methods in Natural Language Processing (pp. 262-272). Association for Computational Linguistics.

Bischof and Airoldi (2012) "Summarizing topical content with word frequency and exclusivity" In Proceedings of the International Conference on Machine Learning.

Roberts, M., Stewart, B., Tingley, D., Lucas, C., Leder-Luis, J., Gadarian, S., Albertson, B., et al. (2014). "Structural topic models for open ended survey responses." American Journal of Political Science, 58(4), 1064-1082. http://goo.gl/0x0tHJ

Examples

# NOT RUN {
library(stm)
# gadarianFit is the example fitted model referenced in the original example
exclusivity(gadarianFit)
# }
