textFeatures: Extract text features for authorship analysis

Description

This function combines several of koRpus' methods to extract the 9-Feature Set for authorship detection (Brennan, Afroz & Greenstadt, 2011; Brennan & Greenstadt, 2009).

Usage

textFeatures(text, hyphen = NULL)

Arguments

text

An object of class kRp.tagged-class, kRp.txt.freq-class or kRp.a

hyphen

An object of class kRp.hyphen-class, if text has already been hyphenated. If text is a list and hyphen is not NULL, it must also be a

Value

A data.frame with nine columns, one for each feature of the 9-Feature Set.

References

Brennan, M., Afroz, S., & Greenstadt, R. (2011). Deceiving authorship detection. Presentation at 28th Chaos Communication Congress (28C3), Berlin, Germany.

Brennan, M. & Greenstadt, R. (2009). Practical Attacks Against Authorship Recognition Techniques. In Proceedings of the Twenty-First Conference on Innovative Applications of Artificial Intelligence (IAAI), Pasadena, CA.

Tweedie, F.J., Singh, S., & Holmes, D.I. (1996). Neural Network Applications in Stylometry: The Federalist Papers. Computers and the Humanities, 30, 1–10.

Examples

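library(koRpus)
# point koRpus at a local TreeTagger installation (adjust the path to your setup)
set.kRp.env(TT.cmd="manual", lang="en", TT.options=list(path="~/bin/treetagger", preset="en"))
# tag the text file, then extract the features from the tagged object
tagged.txt <- treetag("example_text.txt")
tagged.txt.features <- textFeatures(tagged.txt)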
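
The hyphen argument accepts a text that has already been hyphenated, which may be useful when the same text is analyzed more than once. The following is a minimal sketch, not taken from the original documentation: it assumes that koRpus' hyphen() function can be applied to the tagged object, that English language support is installed, and that TreeTagger and the file example_text.txt are available as in the example above. The name hyph.txt is only illustrative.

```r
library(koRpus)

# same setup as in the example above (TreeTagger path is installation specific)
set.kRp.env(TT.cmd="manual", lang="en", TT.options=list(path="~/bin/treetagger", preset="en"))
tagged.txt <- treetag("example_text.txt")

# hyphenate once; hyph.txt is a hypothetical variable name
hyph.txt <- hyphen(tagged.txt)

# hand the hyphenated object over via the hyphen argument
tagged.txt.features <- textFeatures(tagged.txt, hyphen=hyph.txt)
```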
