The simhash worker uses the keyword extraction worker to find keywords,
then applies the simhash algorithm to compute a simhash fingerprint from them. The dict, hmm, idf and stop_word
arguments should be provided when initializing the jiebaR worker.
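The two steps described above (weighted keywords, then a fingerprint) can be illustrated with a minimal base-R sketch of a Charikar-style simhash. This is not the implementation used by jiebaR: toy_hash, hash_bits, sketch_simhash and hamming are hypothetical helper names, the 31-bit polynomial hash is a stand-in chosen only because it is easy to write in base R, and the keywords and weights are made-up inputs rather than output of a keyword extraction worker.

```r
# Toy string hash (hypothetical, illustration only): 31-bit polynomial hash.
toy_hash <- function(word) {
  h <- 0
  for (code in utf8ToInt(word)) {
    h <- (h * 31 + code) %% 2^31  # doubles stay exact well below 2^53
  }
  h
}

# Expand a hash value into nbits bits (0/1), least significant bit first.
hash_bits <- function(h, nbits = 31L) {
  (h %/% 2^(0:(nbits - 1L))) %% 2
}

# Each keyword votes +weight on bits that are 1 and -weight on bits that are 0;
# the fingerprint keeps a 1 wherever the summed vote is positive.
sketch_simhash <- function(words, weights, nbits = 31L) {
  votes <- numeric(nbits)
  for (i in seq_along(words)) {
    bits <- hash_bits(toy_hash(words[i]), nbits)
    votes <- votes + ifelse(bits == 1, weights[i], -weights[i])
  }
  as.integer(votes > 0)
}

# Hamming distance: number of bit positions where two fingerprints differ.
hamming <- function(a, b) sum(a != b)

# Made-up keywords and weights, standing in for keyword extraction output.
a <- sketch_simhash(c("weather", "today", "nice"), c(3, 2, 1))
b <- sketch_simhash(c("weather", "today", "good"), c(3, 2, 1))

hamming(a, a)  # identical inputs give distance 0
hamming(a, b)  # inputs sharing heavy keywords tend to give a small distance
```

Because heavily weighted keywords dominate the vote on each bit, texts that share their most important keywords tend to produce fingerprints that differ in few bit positions.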
Usage
simhash(code, jiebar)
vector_simhash(code, jiebar)
Arguments
code
For simhash, a Chinese sentence or the path of a text file.
For vector_simhash, a character vector of segmented words.
jiebar
jiebaR Worker.
Details
The <= operator is defined as a shorthand for this function: write the worker on the left-hand side and the input on the right-hand side.
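As a hedged illustration of the calls listed under Usage, the sketch below assumes the jiebaR package is installed and that a simhash worker is created with worker("simhash"), relying on the package's default dict, hmm, idf and stop_word files. The topn value and the input strings are arbitrary choices for the example. The structure of the returned values is not described on this page, so no output is shown.

```r
library(jiebaR)  # assumes the jiebaR package is installed

# Create a simhash worker; topn (how many keywords are used) is an
# arbitrary choice for this example.
simhasher <- worker("simhash", topn = 2)

# Function form: a sentence (or the path of a text file) plus the worker.
res_fun <- simhash("hello world", simhasher)

# Operator form: worker on the left, input on the right.
res_op <- simhasher <= "hello world"

# Pre-segmented input: pass a character vector of words to vector_simhash().
res_vec <- vector_simhash(c("hello", "world"), simhasher)
```

The operator form is convenient in interactive sessions, while the function form makes the argument order (code first, worker second) explicit in scripts.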
References
M. S. Charikar. Similarity Estimation Techniques from Rounding Algorithms.