SpeedReader (version 0.9.1)

multi_dice_coefficient_matching: Multiple N-Gram Length Dice Coefficient Document Matching

Description

Calculates Dice coefficients between two documents, one for each of several N-Gram lengths.
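The following is a minimal base R sketch of the idea, not the package's implementation. It assumes the standard set-based Dice coefficient, 2 * |A and B in common| / (|A| + |B|), computed over word N-Grams. The helpers make_ngrams and dice_coefficient are hypothetical names introduced only for illustration, and pooling tokens across lines, lowercasing, and whitespace tokenization are assumptions that may differ from how SpeedReader handles them.

```r
# Hypothetical helper (not part of SpeedReader): build word n-grams
# from a character vector of lines.
make_ngrams <- function(lines, n) {
  # Assumption: lowercase, split on whitespace, pool tokens across lines.
  tokens <- unlist(strsplit(tolower(lines), "\\s+"))
  tokens <- tokens[nzchar(tokens)]
  if (length(tokens) < n) return(character(0))
  starts <- seq_len(length(tokens) - n + 1)
  vapply(starts,
         function(i) paste(tokens[i:(i + n - 1)], collapse = " "),
         character(1))
}

# Hypothetical helper: set-based Dice coefficient for one n-gram length.
# Duplicate n-grams are dropped with unique() before matching.
dice_coefficient <- function(doc_1, doc_2, n) {
  a <- unique(make_ngrams(doc_1, n))
  b <- unique(make_ngrams(doc_2, n))
  # Undefined when neither document yields any n-grams of this length.
  if (length(a) + length(b) == 0) return(NA_real_)
  2 * length(intersect(a, b)) / (length(a) + length(b))
}

doc_1 <- c("the quick brown fox", "jumps over the lazy dog")
doc_2 <- c("the quick brown cat", "jumps over the lazy dog")

# One Dice coefficient per n-gram length, collected in a data.frame.
sizes <- 1:3
result <- data.frame(
  ngram_size = sizes,
  dice = vapply(sizes,
                function(n) dice_coefficient(doc_1, doc_2, n),
                numeric(1))
)
print(result)
```

In this sketch the coefficient drops as the N-Gram length grows, because a single differing word breaks every longer N-Gram that contains it. Comparing coefficients across lengths in this way is presumably the motivation for computing several at once.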

Usage

multi_dice_coefficient_matching(document_1, document_2, ngram_sizes = c(1:50),
  remove_duplicates = TRUE)

Arguments

document_1

A vector of strings (one per line or one per sentence), or a list of vectors of tokens (one per line or one per sentence).

document_2

Same format as document_1; the document that document_1 is compared against.

ngram_sizes

A numeric vector of N-Gram lengths for use in calculating Dice coefficients.

remove_duplicates

Logical indicating whether duplicate N-Grams should be removed before matching. Defaults to TRUE.

Value

A data.frame with Dice coefficients based on different N-Gram lengths.