This is the underlying function for textstat_dist and textstat_simil, but it returns a TsparseMatrix.
textstat_proxy(x, selection = NULL, margin = c("documents",
"features"), method = c("cosine", "correlation", "jaccard", "ejaccard",
"dice", "edice", "hamman", "simple matching", "faith", "euclidean",
"chisquared", "hamming", "kullback", "manhattan", "maximum", "canberra",
"minkowski"), p = 2, min_proxy = NULL, rank = NULL)
x: a dfm object
selection: a valid index for document or feature names from x, to be selected for comparison
margin: identifies the margin of the dfm on which similarity or distance will be computed: "documents" for documents or "features" for word/term features.
method: the similarity or distance measure to be used; see Details.
p: the power of the Minkowski distance.
min_proxy: the minimum proximity value to be recorded.
rank: an integer value specifying that only the top-n highest proximity values are to be recorded.
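The sketch below may help make the p, min_proxy and rank arguments concrete. It does not call textstat_proxy; it uses a plain base-R matrix as a stand-in for a dfm, and the helpers minkowski(), apply_min_proxy() and apply_rank() are hypothetical names introduced only for illustration. It assumes that min_proxy drops values below the threshold and that rank keeps the n largest values in each column. The actual function may differ in details such as tie handling and sparse storage.

```r
# Toy document-feature counts: rows are documents, columns are features.
# (Hypothetical data; a base-R matrix stands in for a dfm.)
m <- matrix(c(2, 1, 0,
              1, 1, 0,
              1, 0, 3),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("d1", "d2", "d3"), c("a", "b", "c")))

# Cosine similarity between documents, i.e. margin = "documents".
norms <- sqrt(rowSums(m^2))
cos_sim <- (m %*% t(m)) / (norms %o% norms)

# Minkowski distance with power p: p = 2 is Euclidean, p = 1 is Manhattan.
minkowski <- function(x, y, p = 2) sum(abs(x - y)^p)^(1 / p)
minkowski(m["d1", ], m["d2", ], p = 2)  # 1
minkowski(m["d1", ], m["d3", ], p = 1)  # 5

# Assumed min_proxy behaviour: values below the threshold are not recorded
# (set to zero here, which a sparse matrix would simply not store).
apply_min_proxy <- function(s, min_proxy) {
  s[s < min_proxy] <- 0
  s
}

# Assumed rank behaviour: keep only the n largest values in each column.
apply_rank <- function(s, n) {
  apply(s, 2, function(col) {
    col[rank(-col, ties.method = "first") > n] <- 0
    col
  })
}

thresholded <- apply_min_proxy(cos_sim, min_proxy = 0.5)
top1 <- apply_rank(cos_sim, n = 1)
```

Both filters shrink the number of non-zero cells, which is presumably why the result is returned as a sparse matrix rather than a dense one.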