bm25

BM25 stands for Best Matching 25. It is a widely used function for ranking documents and is often preferred over TF-IDF scores.
Given a new document, it finds the most similar documents in a corpus, which makes it popular in information retrieval systems.
This implementation can use multiple cores for faster, parallel computation.
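
To make the ranking idea concrete, here is a minimal base R sketch of the standard Okapi BM25 score. It is an illustration only and is not taken from superml's source: the helper names `tokenize` and `bm25_scores`, the whitespace tokenizer, the smoothed IDF variant, and the defaults `k1 = 1.5` and `b = 0.75` are all assumptions, and the package may differ in any of them.

```r
# Illustrative Okapi BM25 in base R. NOT superml's implementation.
# tokenize() and bm25_scores() are hypothetical helpers; k1 and b are
# conventional free parameters whose defaults here are assumptions.
tokenize <- function(x) strsplit(trimws(tolower(x)), "\\s+")[[1]]

bm25_scores <- function(corpus, query, k1 = 1.5, b = 0.75) {
  docs    <- lapply(corpus, tokenize)
  n_docs  <- length(docs)
  doc_len <- vapply(docs, length, integer(1))
  avg_len <- mean(doc_len)
  terms   <- unique(tokenize(query))

  scores <- numeric(n_docs)
  for (term in terms) {
    # Term frequency of `term` in every document
    tf  <- vapply(docs, function(d) sum(d == term), numeric(1))
    # Number of documents that contain the term
    df  <- sum(tf > 0)
    # Smoothed inverse document frequency (always non-negative)
    idf <- log((n_docs - df + 0.5) / (df + 0.5) + 1)
    # Saturating term frequency with document-length normalisation
    scores <- scores + idf * tf * (k1 + 1) /
      (tf + k1 * (1 - b + b * doc_len / avg_len))
  }
  scores
}

# Toy corpus, made up for illustration
corpus <- c("white audi 2.5 car", "black shoes from office",
            "new mobile iphone 7", "audi tyres audi a3",
            "nice audi bmw toyota corolla")

scores <- bm25_scores(corpus, "white toyota corolla")
top2   <- order(scores, decreasing = TRUE)[1:2]
corpus[top2]
```

In this sketch the full score vector plays the role of a per-document similarity score, and sorting it to keep the first few indices mirrors a "top n most similar" lookup. Documents that share no term with the query score zero.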

The idea is to provide a standard interface
for users who build machine learning models in both R and Python.
This package provides a scikit-learn-style fit/predict interface for
training machine learning models in R.

Manish Saraswat

superml

Build Machine Learning Models Like Using Python's Scikit-Learn
Library in R

bm25 function

Format

<code><a rd-options="" href="/link/R6Class?package=superml&version=0.4.0" data-mini-rdoc="superml::R6Class">R6Class</a></code> object.

Usage

For usage details see Methods, Arguments and Examples sections.<pre>
bm25 = bm25$new(corpus, n_cores)
bm25$most_similar(input_document, topn)
bm25$compute(input_document)
</pre>

Methods

<dl class="dl-horizontal">
 <dt><code>$new()</code></dt><dd>Initialises an instance of the class. Pass the complete corpus of documents here.</dd>
 <dt><code>$most_similar()</code></dt><dd>Returns the <code>topn</code> documents from the corpus that are most similar to the input document.</dd>
 <dt><code>$compute()</code></dt><dd>Returns a similarity score for every document in the corpus, given an input sentence.</dd>
</dl>

Arguments

<dl class="dl-horizontal">
 <dt>corpus</dt><dd>a list containing sentences</dd>
 <dt>use_parallel</dt><dd>boolean value used to activate parallel computation, defaults to FALSE</dd>
</dl>
