text_rqa: Recurrence quantification analysis on categorical series of text

Description

Compute recurrence quantification on text.

Usage

text_rqa(rsrc, typ = 'file', removeStopwords = F, embed = 1, tw = 1, limit = -1, shuffle = F)

Arguments

rsrc

Location of file or resource, or string literal

typ

A flag indicating the type of input resource: typ = "file" (a file name); typ = "url" (a URL, from which the file is downloaded); typ = "string" or "raw_chars" (a literal string); typ = "tibble" (text formatted as tidytext in a tibble)

removeStopwords

A boolean: if TRUE, stop words are removed; if FALSE, they are retained.

embed

The number of embedding dimensions for phase-space reconstruction, i.e., the lag intervals.

tw

The Theiler window parameter

limit

A scalar indicating how much text should be considered for the analysis

shuffle

A boolean: if TRUE, it randomly shuffles the order of the text for surrogate analyses.
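
As a rough illustration of what shuffling for a surrogate analysis means, the sketch below permutes the order of a word series in base R. It is not the code used inside `text_rqa()`; the variable names and the fixed seed are assumptions made for the example.

```r
# Illustrative sketch only; not the implementation used by text_rqa().
set.seed(1)  # fixed seed (an assumption) so the permutation is reproducible

words <- c("here", "is", "a", "raw", "raw", "raw", "string", "literally")

# A surrogate keeps every word but destroys the original ordering.
surrogate <- sample(words)

# Word frequencies are unchanged by the shuffle; only the sequence differs.
same_counts <- identical(sort(words), sort(surrogate))
```

Comparing measures from the original series with those from such a surrogate indicates how much of the recurrence structure depends on word order rather than on word frequency alone.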

Value

It returns a list with different measures extracted from the recurrence plot. If the measures cannot be computed, the values for the output arguments will be either 0 or NA.

RR

The percentage of recurrent points falling within the specified radius (range between 0 and 100)

DET

Proportion of recurrent points forming diagonal line structures.

NRLINE

The total number of lines in the recurrence plot

maxL

The length of the longest diagonal line segment in the plot, excluding the main diagonal

L

The average length of line structures

ENTR

Shannon information entropy of diagonal line lengths longer than the minimum length

rENTR

Entropy measure normalized by the number of lines observed in the plot. Handy to compare across contexts and conditions

LAM

Proportion of recurrent points forming vertical line structures

TT

The average length of vertical line structures
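
To make some of these measures concrete, the following base-R sketch builds a recurrence plot for a short categorical series and derives the recurrence percentage, DET, NRLINE and maxL by hand. It is a minimal illustration, not the computation performed by `crqa()`: the normalisation by all cells of the plot, the treatment of a Theiler window of 1 as removing only the main diagonal, the minimum line length of 2, and the helper `diag_line_lengths()` are all assumptions made for this example.

```r
# Illustrative sketch only; crqa() may normalise or threshold differently.
words <- c("here", "is", "a", "raw", "raw", "raw", "string", "literally")
n <- length(words)

# Recurrence plot: TRUE where the words at positions i and j are identical.
rp <- outer(words, words, "==")

# Assumption: a Theiler window of 1 removes only the main diagonal.
tw <- 1
rp[abs(row(rp) - col(rp)) < tw] <- FALSE

# Recurrence percentage over all cells of the plot (assumed normalisation).
rr <- 100 * sum(rp) / n^2   # 9.375 for this series

# Hypothetical helper: lengths of all runs of recurrent points on the
# off-diagonals (the main diagonal is skipped).
diag_line_lengths <- function(rp) {
  n <- nrow(rp)
  lens <- integer(0)
  for (k in setdiff(-(n - 1):(n - 1), 0)) {
    d <- rp[row(rp) - col(rp) == k]        # cells on the k-th diagonal
    runs <- rle(d)
    lens <- c(lens, runs$lengths[runs$values])
  }
  lens
}

min_len <- 2                                # assumed minimum line length
lens <- diag_line_lengths(rp)
lines <- lens[lens >= min_len]

det    <- 100 * sum(lines) / sum(rp)        # share of points on diagonal lines
nrline <- length(lines)                     # number of diagonal lines
max_l  <- max(lines)                        # longest diagonal line
```

Here the three consecutive occurrences of "raw" produce six recurrent points, four of which lie on two diagonal lines of length 2.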

Details

A wrapper to the `crqa()` function that runs recurrence quantification analysis on text. This function also calls `get_text_series()` to simplify the text in case such simplification was not done before inputting the text.
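
The kind of simplification described here can be pictured with a small base-R sketch that turns a raw string into a categorical series. This is not the implementation of `get_text_series()`; the cleaning steps (lowercasing, stripping punctuation, splitting on whitespace) and the integer coding are assumptions chosen for illustration.

```r
# Illustrative sketch only; get_text_series() may clean text differently.
txt <- "here is a raw raw raw string, literally"

# Lowercase and drop punctuation so that "string," and "string" match.
clean <- tolower(gsub("[[:punct:]]", "", txt))

# Split into words on runs of whitespace.
words <- strsplit(clean, "[[:space:]]+")[[1]]

# Map each distinct word to an integer code, in order of first appearance.
codes <- match(words, unique(words))   # 1 2 3 4 4 4 5 6
```

Identical words receive identical codes, which is what allows recurrence to be defined on the series as an exact match between states.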

Examples

# NOT RUN {
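library(crqa)
# A literal string is passed directly, so typ = "string"
txt = "here is a raw raw raw string, literally"
res = text_rqa(txt, typ = "string")
# Plot the recurrence plot stored in the result
plot_rp(res$RP)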


# }