Compute recurrence quantification on text.
text_rqa(rsrc,typ = 'file',removeStopwords = F,embed = 1,tw = 1,limit = -1,shuffle = F)
rsrc: Location of the file or resource, or a string literal.
typ: A flag indicating the type of input resource: typ = "file" (a file name); typ = "url" (a URL; the file gets downloaded); typ = "string" or "raw_chars" (a literal string); typ = "tibble" (text already formatted as tidytext in a tibble).
removeStopwords: A boolean: TRUE removes stop words; FALSE retains them.
embed: The number of embedding dimensions for phase-space reconstruction, i.e., the lag intervals.
tw: The Theiler window parameter.
limit: A scalar indicating how much text should be considered for the analysis.
shuffle: A boolean: if TRUE, the order of the text is randomly shuffled for surrogate analyses.
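The idea behind a shuffled surrogate can be sketched in base R. This is only an illustration of the concept, not the package's implementation: the word vector and the variable names below are assumptions made for the example.

```r
# Illustrative base-R sketch of a shuffled surrogate; not the crqa internals.
set.seed(1)  # make the shuffle reproducible

# Hypothetical, already-tokenized text
words <- c("here", "is", "a", "raw", "raw", "raw", "string", "literally")

# Surrogate series: the same words in a random order
shuffled <- sample(words)

# Word frequencies are preserved; only the sequential structure is destroyed
same_counts <- identical(sort(words), sort(shuffled))
print(same_counts)
```

Comparing measures from the original series against measures from such a surrogate indicates how much of the recurrence structure depends on word order rather than on word frequency alone.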
Returns a list of measures extracted from the recurrence plot. If no recurrence is found, the values of the output measures will be either 0 or NA.
The percentage of recurrent points falling within the specified radius (range between 0 and 100)
Proportion of recurrent points forming diagonal line structures.
The total number of lines in the recurrence plot
The length of the longest diagonal line segment in the plot, excluding the main diagonal
The average length of diagonal line structures
Shannon information entropy of diagonal line lengths longer than the minimum length
Entropy measure normalized by the number of lines observed in the plot. Useful for comparing across contexts and conditions
Proportion of recurrent points forming vertical line structures
The average length of vertical line structures
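To make the diagonal-line measures above concrete, here is a minimal base-R sketch that computes a recurrence rate and a determinism-style proportion for a short word series. It is not the computation used by `crqa()`. The normalization by all cells of the plot, the removal of the main diagonal, the minimum line length of 2, and the helper `diag_line_lengths()` are all assumptions of this example.

```r
# Illustrative base-R sketch; the package may normalize or filter differently.
words <- c("here", "is", "a", "raw", "raw", "raw", "string", "literally")
n <- length(words)

# Recurrence matrix: TRUE where the same word occurs at positions i and j
rp <- outer(words, words, "==")

# Assumption: drop the main diagonal, where each word trivially matches itself
diag(rp) <- FALSE

# Recurrence rate as a percentage of all cells in the plot
rr <- 100 * sum(rp) / (n * n)

# Hypothetical helper: lengths of all runs of recurrent points on off-diagonals
diag_line_lengths <- function(m) {
  size <- nrow(m)
  lens <- integer(0)
  for (k in setdiff(-(size - 1):(size - 1), 0)) {
    d <- m[row(m) - col(m) == k]        # values along the k-th diagonal
    r <- rle(d)                         # run-length encode TRUE/FALSE runs
    lens <- c(lens, r$lengths[r$values])
  }
  lens
}

lens <- diag_line_lengths(rp)
lines <- lens[lens >= 2]                # assumed minimum line length of 2

det_prop <- sum(lines) / sum(rp)        # share of recurrent points on lines
n_lines  <- length(lines)               # number of diagonal lines
max_len  <- max(lines)                  # longest diagonal line
mean_len <- mean(lines)                 # average diagonal line length

print(c(rr = rr, det = det_prop, nlines = n_lines, maxL = max_len, L = mean_len))
```

In this toy series the only recurrences come from the three consecutive occurrences of "raw", which produce two diagonal lines of length 2 (one on each side of the main diagonal) plus two isolated points.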
A wrapper around the `crqa()` function that runs recurrence quantification analysis on text. It also calls `get_text_series()` to simplify the text, in case this was not done before the text was passed in.
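The kind of simplification meant here can be pictured as turning raw text into a series of word codes that a recurrence analysis can compare. The base-R sketch below is a hypothetical stand-in for that step; what `get_text_series()` actually does may differ, and the cleaning rules and variable names are assumptions of this example.

```r
# Illustrative base-R sketch of text simplification; not get_text_series() itself.
txt <- "Here is a raw raw raw string, literally"

# Lowercase and keep only letters and spaces (assumed cleaning rule)
clean <- gsub("[^a-z ]", "", tolower(txt))

# Split into words on runs of spaces
words <- strsplit(trimws(clean), " +")[[1]]

# Map each distinct word to an integer code, in order of first appearance
codes <- match(words, unique(words))

print(codes)
```

Repeated words receive the same code, so recurrence in the coded series corresponds to repetition of the same word in the text.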
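# Example (not run automatically); requires the crqa package
library(crqa)
txt <- "here is a raw raw raw string, literally"
# Treat the input as a literal string rather than a file name
res <- text_rqa(txt, typ = "string")
# Draw the recurrence plot stored in the result
plot_rp(res$RP)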