SpeedReader (version 0.9.1)

ngram_sequence_matching: N-Gram Sequence Matching

Description

Finds the positions of the n-grams in each of two document versions that match an n-gram in the other version.
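
As a rough illustration of the idea (not the package's implementation), the base R sketch below builds word-level n-grams for two short texts and reports which n-gram positions in each text also occur in the other. The helper names `tokenize` and `make_ngrams`, the whitespace tokenization, and the sample sentences are all assumptions made for this example.

```r
# Assumed tokenization for illustration: split on runs of whitespace.
tokenize <- function(doc) strsplit(doc, "\\s+")[[1]]

# Build every contiguous n-gram from a vector of tokens.
make_ngrams <- function(tokens, n) {
  if (length(tokens) < n) return(character(0))
  starts <- seq_len(length(tokens) - n + 1)
  vapply(starts,
         function(i) paste(tokens[i:(i + n - 1)], collapse = " "),
         character(1))
}

doc_1 <- "the quick brown fox jumps over the lazy dog"
doc_2 <- "a quick brown fox leaps over the lazy cat"

ngrams_1 <- make_ngrams(tokenize(doc_1), 3)
ngrams_2 <- make_ngrams(tokenize(doc_2), 3)

# Positions of n-grams in each version that also appear in the other version.
matches_in_1 <- which(ngrams_1 %in% ngrams_2)
matches_in_2 <- which(ngrams_2 %in% ngrams_1)

print(matches_in_1)  # 2 6
print(matches_in_2)  # 2 6
```

Here the trigrams "quick brown fox" and "over the lazy" are shared, so positions 2 and 6 match in both versions.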

Usage

ngram_sequence_matching(document_1, document_2, ngram_size,
  use_hashmap = FALSE, tokenized_strings_provided = FALSE)

Arguments

document_1

A string (or a character vector) representing the earlier document version.

document_2

A string (or a character vector) representing the later document version.

ngram_size

The length (n) of the n-grams to be compared.

use_hashmap

Defaults to FALSE. If TRUE, then a hashmap is used for faster lookup and comparisons.

tokenized_strings_provided

Defaults to FALSE. If TRUE, document_1 and document_2 are expected to be pre-tokenized character vectors rather than single strings.
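
To make the use_hashmap option more concrete, here is a minimal sketch of hash-based n-gram lookup in base R, using an environment as the hash table. It only illustrates the general technique; how SpeedReader implements its hashmap internally is not stated on this page, and the sample n-gram vectors are made up.

```r
# Hypothetical n-grams from two document versions.
ngrams_1 <- c("a b", "b c", "c d")
ngrams_2 <- c("b c", "x y", "c d", "b c")

# Index the first set: each n-gram becomes a key in a hashed environment.
index <- new.env(hash = TRUE)
for (g in ngrams_1) assign(g, TRUE, envir = index)

# Test each n-gram of the second set with a keyed lookup in the index.
in_other <- vapply(ngrams_2, exists, logical(1),
                   envir = index, inherits = FALSE)

# Positions in the second set whose n-gram also occurs in the first set.
matches_in_2 <- which(unname(in_other))
print(matches_in_2)  # 1 3 4
```

Building the index costs one pass over the first set, after which each lookup does not require rescanning it, which is the usual reason a hash map speeds up this kind of comparison.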

Value

A list object.
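
Because the structure of the returned list is not described here, one hedged way to explore it is to call the function on two short strings and inspect the result with `str()`. The sketch below assumes the SpeedReader package is installed and that the function is exported under the name shown in Usage; the sample strings are invented, and the call is skipped if the package is unavailable.

```r
# Made-up document versions for illustration.
doc_v1 <- "the quick brown fox jumps over the lazy dog"
doc_v2 <- "a quick brown fox leaps over the lazy cat"

# Only run the call when SpeedReader is actually installed.
if (requireNamespace("SpeedReader", quietly = TRUE)) {
  result <- SpeedReader::ngram_sequence_matching(
    doc_v1, doc_v2,
    ngram_size = 3,
    use_hashmap = FALSE,
    tokenized_strings_provided = FALSE
  )
  # Inspect the names and types of the list elements.
  str(result)
}
```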