mlr3 (version 0.23.0)

mlr_measures_sim.jaccard: Jaccard Similarity Index

Description

Measure that quantifies the similarity of two or more sets.

Dictionary

This Measure can be instantiated via the dictionary mlr_measures or with the associated sugar function msr():

mlr_measures$get("sim.jaccard")
msr("sim.jaccard")

Meta Information

  • Type: "similarity"

  • Range: \([0, 1]\)

  • Minimize: FALSE

Details

For two sets \(A\) and \(B\), the Jaccard Index is defined as

$$ J(A, B) = \frac{|A \cap B|}{|A \cup B|}. $$

If more than two sets are provided, the mean of all pairwise scores is calculated.

This measure is undefined if two or more sets are empty: the union of two empty sets is itself empty, so the ratio for that pair becomes \(0/0\).
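The definition can be illustrated with a minimal base R sketch. This is not the mlr3 implementation: the helper names `jaccard_pair()` and `jaccard_mean()` are hypothetical, and returning `NA` for the undefined case is an assumption made here for illustration.

```r
# Jaccard index of two sets (hypothetical helper, base R only).
jaccard_pair <- function(a, b) {
  union_size <- length(union(a, b))
  if (union_size == 0L) {
    # Both sets are empty: 0/0 is undefined.
    # Returning NA is an assumption of this sketch.
    return(NA_real_)
  }
  length(intersect(a, b)) / union_size
}

# Mean of all pairwise Jaccard scores for a list of two or more sets.
jaccard_mean <- function(sets) {
  stopifnot(length(sets) >= 2L)
  pairs <- combn(length(sets), 2L)  # one column per pair of indices
  scores <- apply(pairs, 2L, function(idx) {
    jaccard_pair(sets[[idx[1L]]], sets[[idx[2L]]])
  })
  mean(scores)
}

sets <- list(c("a", "b", "c"), c("b", "c", "d"), c("a", "b"))
jaccard_pair(sets[[1]], sets[[2]])  # 2 shared / 4 in union = 0.5
jaccard_mean(sets)                  # (1/2 + 2/3 + 1/4) / 3 = 17/36
```

Because a single undefined pair yields `NA`, the mean in this sketch is also `NA` whenever two or more of the sets are empty, which mirrors the statement above.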

See Also

Dictionary of Measures: mlr_measures

as.data.table(mlr_measures) for a complete table of all (also dynamically created) Measure implementations.

Other similarity measures: mlr_measures_sim.phi