textTinyR (version 1.1.2)

JACCARD_DICE: Jaccard or Dice similarity for text documents

Description

Computes the Jaccard or Dice similarity between corresponding pairs of tokenized text documents.

Usage

JACCARD_DICE(token_list1 = NULL, token_list2 = NULL, method = "jaccard",
  threads = 1)

Arguments

token_list1

a list of tokenized text documents (it must have the same length as token_list2)

token_list2

a list of tokenized text documents (it must have the same length as token_list1)

method

a character string specifying the similarity metric, either 'jaccard' or 'dice'

threads

a numeric value specifying the number of cores to use when running in parallel

Value

a numeric vector with one value for each pair of documents

Details

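The function calculates either the Jaccard or the Dice distance between corresponding pairs of tokenized documents in the two lists, i.e. the first element of token_list1 is compared with the first element of token_list2, and so on.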
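
For reference, the textbook set-based definitions of the two coefficients can be sketched in base R as follows. This is only an illustration: jaccard_sim and dice_sim are hypothetical helper names that are not part of textTinyR, and the package's own implementation may differ (for example in how it handles duplicated tokens, or in returning a distance, 1 - similarity, rather than a similarity).

```r
# Hypothetical helpers for illustration only; not part of textTinyR.

# Jaccard similarity: |A intersect B| / |A union B| on the sets of unique tokens
jaccard_sim = function(a, b) {
  a = unique(a)
  b = unique(b)
  length(intersect(a, b)) / length(union(a, b))
}

# Dice similarity: 2 * |A intersect B| / (|A| + |B|) on the sets of unique tokens
dice_sim = function(a, b) {
  a = unique(a)
  b = unique(b)
  2 * length(intersect(a, b)) / (length(a) + length(b))
}

# two lists of equal length; element i of docs1 is compared with element i of docs2
docs1 = list(c('the', 'jaccard', 'index'), c('use', 'this', 'function'))
docs2 = list(c('the', 'dice', 'index'), c('for', 'two', 'lists'))

# mapply walks both lists in parallel and returns one value per pair
sim_jaccard = mapply(jaccard_sim, docs1, docs2)
sim_dice = mapply(dice_sim, docs1, docs2)
```

The first pair shares two of four distinct tokens, so its Jaccard similarity is 0.5 and its Dice similarity is 2/3; the second pair shares no tokens, so both values are 0. Two empty token vectors would give NaN under these definitions.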

Examples

# NOT RUN {
library(textTinyR)

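# two lists of tokenized documents; both lists must have the same length
lst1 = list(c('use', 'this', 'function', 'to'), c('either', 'compute', 'the', 'jaccard'))

lst2 = list(c('or', 'the', 'dice', 'distance'), c('for', 'two', 'same', 'sized', 'lists'))

# compare lst1[[1]] with lst2[[1]] and lst1[[2]] with lst2[[2]]
out = JACCARD_DICE(token_list1 = lst1, token_list2 = lst2, method = 'jaccard', threads = 1)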
# }