Computes LexRank scores from pairwise sentence similarities using either the PageRank algorithm or degree centrality. The methods used to compute LexRank are discussed in "LexRank: Graph-based Lexical Centrality as Salience in Text Summarization."
lexRankFromSimil(s1, s2, simil, threshold = 0.2, n = 3,
returnTies = TRUE, usePageRank = TRUE, damping = 0.85,
continuous = FALSE)
s1: A character vector of sentence IDs corresponding to the s2 and simil arguments.
s2: A character vector of sentence IDs corresponding to the s1 and simil arguments.
simil: A numeric vector of similarity values giving the similarity between the sentences identified by the IDs in s1 and s2.
threshold: The minimum simil value a sentence pair must have to be represented in the graph on which LexRank is calculated.
n: The number of sentences to return as the extractive summary. The function returns the top n LexRanked sentences. See returnTies for handling ties in LexRank.
returnTies: TRUE or FALSE indicating whether to return more than n sentence IDs if there is a tie in LexRank. If TRUE, the number of returned sentences is not limited to n; every sentence with a top-n score is returned. If FALSE, the number of returned sentences will be <= n. Defaults to TRUE.
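The tie-handling rule described for returnTies can be sketched in base R as follows. This is an illustration only, not the package's own code: the helper name top_n_sketch and the scores are hypothetical, and it assumes "top n score" means a score at least as large as the n-th best score.

```r
# Hypothetical LexRank scores; names are sentence IDs.
scores <- c(d1_1 = 0.30, d1_2 = 0.25, d2_1 = 0.20, d2_2 = 0.20, d3_1 = 0.05)

# Hypothetical helper illustrating the two tie-handling modes.
top_n_sketch <- function(scores, n, return_ties = TRUE) {
  ranked <- sort(scores, decreasing = TRUE)
  if (return_ties) {
    # Keep every sentence whose score is at least the n-th best score,
    # so ties at the cutoff are all returned.
    cutoff <- ranked[min(n, length(ranked))]
    ranked[ranked >= cutoff]
  } else {
    # Hard limit: never more than n sentences.
    head(ranked, n)
  }
}

with_ties <- top_n_sketch(scores, n = 3, return_ties = TRUE)
without_ties <- top_n_sketch(scores, n = 3, return_ties = FALSE)
```

Here d2_1 and d2_2 share the third-best score, so with_ties holds four sentences while without_ties holds exactly three.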
usePageRank: TRUE or FALSE indicating whether to use the PageRank algorithm for ranking sentences. If FALSE, a sentence's unweighted degree centrality will be used as the rank. Defaults to TRUE.
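As a rough illustration of the degree-centrality option, the sketch below thresholds a set of sentence pairs and counts how many surviving edges touch each sentence. The similarity values are hypothetical, and the use of >= for the threshold comparison is an assumption based on the word "minimum" in the threshold description; the package's exact comparison may differ.

```r
# Hypothetical pairwise similarities between three sentences.
s1 <- c("d1_1", "d1_1", "d1_2")
s2 <- c("d1_2", "d2_1", "d2_1")
simil <- c(0.4, 0.03, 0.5)
threshold <- 0.2

# Keep only pairs that meet the threshold (assumed comparison: >=).
keep <- simil >= threshold

# Each kept pair is an undirected edge, so it adds one to the degree
# of both of its endpoints.
tab <- table(c(s1[keep], s2[keep]))
degree <- setNames(as.integer(tab), names(tab))
degree <- sort(degree, decreasing = TRUE)
```

With these values the pair d1_1/d2_1 falls below the threshold and is dropped, leaving d1_2 connected to both other sentences.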
damping: The damping factor to be passed to the PageRank algorithm. Ignored if usePageRank is FALSE.
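To show what the damping factor does, here is a minimal power-iteration sketch of PageRank on a small undirected sentence graph in base R. It is not necessarily how the package computes the scores; the function name page_rank_sketch, the tolerance, and the example graph are all assumptions made for illustration, and the sketch assumes every node has at least one edge.

```r
# Hypothetical helper: PageRank by power iteration on an adjacency matrix.
page_rank_sketch <- function(adj, damping = 0.85, tol = 1e-10, max_iter = 1000) {
  n <- nrow(adj)
  out <- colSums(adj)
  stopifnot(all(out > 0))  # sketch assumes no isolated nodes
  # Column-normalise: P[i, j] is the probability of moving from j to i.
  P <- sweep(adj, 2, out, "/")
  r <- rep(1 / n, n)
  for (i in seq_len(max_iter)) {
    # With probability `damping` follow an edge, otherwise jump uniformly.
    r_new <- (1 - damping) / n + damping * as.vector(P %*% r)
    converged <- sum(abs(r_new - r)) < tol
    r <- r_new
    if (converged) break
  }
  setNames(r, rownames(adj))
}

# Path graph d1_1 - d1_2 - d2_1 (edges that survived thresholding).
ids <- c("d1_1", "d1_2", "d2_1")
adj <- matrix(0, 3, 3, dimnames = list(ids, ids))
adj["d1_1", "d1_2"] <- adj["d1_2", "d1_1"] <- 1
adj["d1_2", "d2_1"] <- adj["d2_1", "d1_2"] <- 1

pr <- page_rank_sketch(adj, damping = 0.85)
```

A damping value closer to 1 lets the graph structure dominate the scores, while a smaller value pulls every sentence toward the uniform score 1/n.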
continuous: TRUE or FALSE indicating whether to use continuous LexRank. Only applies if usePageRank==TRUE. If TRUE, threshold will be ignored and LexRank will be computed using a weighted graph representation of the sentences. Defaults to FALSE.
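The weighted graph used by continuous LexRank can be pictured as a similarity matrix whose columns are normalised into transition probabilities, with no thresholding step. The base R sketch below builds such a matrix from the values in the Examples section; the variable names W and P are illustrative, and the normalisation shown is an assumption about the general technique rather than a description of the package's internals.

```r
s1 <- c("d1_1", "d1_1", "d1_2")
s2 <- c("d1_2", "d2_1", "d2_1")
simil <- c(0.01, 0.03, 0.5)

# Symmetric weighted adjacency matrix: every pair is kept, and the
# similarity itself is the edge weight.
ids <- sort(unique(c(s1, s2)))
W <- matrix(0, length(ids), length(ids), dimnames = list(ids, ids))
W[cbind(s1, s2)] <- simil
W[cbind(s2, s1)] <- simil

# Column-normalise so each column sums to 1: a random walk leaving a
# sentence prefers its more similar neighbours.
P <- sweep(W, 2, colSums(W), "/")
```

In this example a walk leaving d1_1 moves to d2_1 three times as often as to d1_2, because the similarities are 0.03 and 0.01 respectively.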
A 2-column data frame with columns sentenceId and value. sentenceId contains the IDs of the top n sentences in descending order by value. value contains the PageRank score (if usePageRank==TRUE) or degree centrality (if usePageRank==FALSE).
http://www.cs.cmu.edu/afs/cs/project/jair/pub/volume22/erkan04a-html/erkan04a.html
# NOT RUN {
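library(lexRankr)

# Three sentence pairs; only the d1_2/d2_1 pair exceeds the default threshold of 0.2
lexRankFromSimil(s1 = c("d1_1", "d1_1", "d1_2"),
                 s2 = c("d1_2", "d2_1", "d2_1"),
                 simil = c(0.01, 0.03, 0.5))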
# }