This is the underlying function for textstat_dist and
textstat_simil, but it returns a TsparseMatrix.
textstat_proxy(x, y = NULL, margin = c("documents", "features"),
method = c("cosine", "correlation", "jaccard", "ejaccard", "dice",
"edice", "hamman", "simple matching", "euclidean", "chisquared",
"hamming", "kullback", "manhattan", "maximum", "canberra", "minkowski"),
p = 2, min_proxy = NULL, rank = NULL, use_na = FALSE)

x, y: dfm objects; y is an optional target matrix matching
x in the margin on which the similarity or distance will be computed.
If y is provided, proximity between the documents or
features in x and those in y is computed.
margin: identifies the margin of the dfm on which similarity or
distance will be computed: "documents" for documents or
"features" for word/term features.
method: character; the method identifying the similarity or distance measure to be used; see Details.
p: the power of the Minkowski distance.
min_proxy: the minimum proximity value to be recorded.
rank: an integer specifying the top-n proximity values to be recorded.
use_na: if TRUE, return NA for proximity to empty
vectors. Note that use of NA makes the proximity matrices denser.
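
A minimal usage sketch, not taken from the package's own examples. It assumes that textstat_proxy is exported by quanteda in the release documented here and by quanteda.textstats in newer releases; the three toy documents d1, d2 and d3 are made up for illustration.

```r
library(quanteda)

# Assumption: newer releases ship textstat_proxy() in quanteda.textstats,
# while the release documented here exports it from quanteda itself.
textstat_proxy <- if (requireNamespace("quanteda.textstats", quietly = TRUE)) {
  quanteda.textstats::textstat_proxy
} else {
  quanteda::textstat_proxy
}

# Hypothetical toy corpus: d1 and d2 share two of four features, d3 shares none.
toks <- tokens(c(d1 = "a b c d", d2 = "a b e f", d3 = "x y z"))
x <- dfm(toks)

# Cosine similarity between documents; the result is a sparse matrix.
sim <- textstat_proxy(x, margin = "documents", method = "cosine")
sim_dense <- as.matrix(sim)
print(sim_dense)

# Same measure, asking for only values of at least 0.6 to be recorded.
sim_min <- textstat_proxy(x, margin = "documents", method = "cosine",
                          min_proxy = 0.6)
print(sim_min)
```

Because d1 and d2 each contain four distinct features and share two of them, their cosine similarity is 2 / (2 * 2) = 0.5. Converting with as.matrix gives an ordinary dense matrix indexed by document name.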