eval.similarity.correlation: Evaluate DSM on Correlation with Similarity Ratings (wordspace)

Description

Performs evaluation by comparing the distances (or similarities) computed by a DSM with (typically human) word similarity ratings. Well-known examples are the noun pair ratings collected by Rubenstein & Goodenough (1965; RG65) and Finkelstein et al. (2002; WordSim353).

The quality of the DSM predictions is measured by Spearman rank correlation \(\rho\).

Usage

eval.similarity.correlation(task, M, dist.fnc=pair.distances,
                            details=FALSE, format=NA, taskname=NA,
                            word1.name="word1", word2.name="word2", score.name="score",
                            ...)

Value

The default short report (details=FALSE) is a data frame with a single row and the following columns:

rho: (absolute value of) Spearman rank correlation coefficient \(\rho\)
p.value: p-value indicating evidence for a significant correlation
missing: number of pairs not included in the DSM
r: (absolute value of) Pearson correlation coefficient \(r\)
r.lower: lower bound of confidence interval for Pearson correlation
r.upper: upper bound of confidence interval for Pearson correlation

The detailed report (details=TRUE) is a copy of the original task data with two additional columns:

distance: distance calculated by the DSM for each word pair, possibly transformed (numeric)
missing: whether word pair is missing from the DSM (logical)

In addition, the short report is appended to the data frame as an attribute "eval.result", and the optional taskname value as attribute "taskname". The data frame is marked as an object of class eval.similarity.correlation, for which suitable print and plot methods are defined.
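As a hedged illustration of the detailed report, the following sketch assumes that the wordspace package and its RG65 and DSM_Vectors data sets (the ones used in the Examples section) are installed; the variable name det is chosen here for illustration only.

```r
# Assumption: wordspace with its RG65 and DSM_Vectors data sets is installed
if (requireNamespace("wordspace", quietly = TRUE)) {
  library(wordspace)
  det <- eval.similarity.correlation(RG65, DSM_Vectors, details = TRUE)
  print(summary(det$distance))     # DSM distance for each word pair
  print(table(det$missing))        # how many pairs are missing from the DSM
  print(attr(det, "eval.result"))  # short report stored as an attribute
}
```

The guard with requireNamespace only keeps the sketch from failing where the package is unavailable; it is not part of the documented interface.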

Arguments

task: a data frame containing word pairs (usually in columns word1 and word2) with similarity ratings (usually in column score); any other columns will be ignored
M: a scored DSM matrix, passed to dist.fnc
dist.fnc: a callback function used to compute distances or similarities between word pairs. It will be invoked with character vectors containing the components of the word pairs as first and second arguments, the DSM matrix M as third argument, plus any additional arguments (...) passed to eval.similarity.correlation. The return value must be a numeric vector of appropriate length. If one of the words in a pair is not represented in the DSM, the corresponding distance value should be set to Inf (or -Inf in the case of similarities).
details: if TRUE, a detailed report with information on each task item is returned (see Value for details)
format: if the task definition specifies POS-disambiguated lemmas in CWB/Penn format, they can automatically be transformed into a different notation convention; see convert.lemma for details
taskname: optional row label for the short report (details=FALSE)
...: any further arguments are passed to dist.fnc and can be used e.g. to select a distance measure
word1.name: the name of the column of task containing the first word of each pair
word2.name: the name of the column of task containing the second word of each pair
score.name: the name of the column of task containing the corresponding similarity ratings
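The callback contract described for dist.fnc can be sketched in base R as follows. The function cosine.sim and the toy matrix M.toy are hypothetical names made up for this illustration, and the sketch assumes that M is a dense matrix with words as row names; it is not the package's own pair.distances implementation.

```r
# Hypothetical callback: cosine similarity between row vectors of M.
# Since it returns similarities, pairs with an unknown word get -Inf.
cosine.sim <- function(w1, w2, M, ...) {
  known <- w1 %in% rownames(M) & w2 %in% rownames(M)
  res <- rep(-Inf, length(w1))
  if (any(known)) {
    A <- M[w1[known], , drop = FALSE]
    B <- M[w2[known], , drop = FALSE]
    res[known] <- rowSums(A * B) / sqrt(rowSums(A^2) * rowSums(B^2))
  }
  res
}

# Toy data, made up for illustration ("bus" has no row in the matrix)
M.toy <- rbind(cat = c(1, 0, 1), dog = c(1, 1, 0), car = c(0, 2, 2))
sims <- cosine.sim(c("cat", "cat", "cat"), c("dog", "car", "bus"), M.toy)
print(sims)
```

Such a function would be supplied as dist.fnc=cosine.sim; if the task data frame uses other column names, they can be declared with word1.name, word2.name and score.name.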

Author

Stephanie Evert (https://purl.org/stephanie.evert)

Details

DSM distances are computed for all word pairs and compared with similarity ratings from the gold standard. As an evaluation criterion, Spearman rank correlation between the DSM and gold standard scores is computed. The function also reports a confidence interval for Pearson correlation; for this interval to be meaningful, the distances may first need a suitable transformation that ensures a near-linear relationship with the ratings.

NB: Since the correlation between similarity ratings and DSM distances will usually be negative, the evaluation report omits minus signs on the correlation coefficients.
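The two correlation measures and the sign convention can be illustrated with base R alone. The numbers below are made up for this sketch, and the code only mirrors the kind of statistics described above; it makes no claim about how the function computes them internally.

```r
# Toy data (made up): similarity ratings and DSM distances for six word pairs
score    <- c(3.9, 3.5, 2.8, 1.6, 0.9, 0.4)
distance <- c(0.21, 0.35, 0.30, 0.62, 0.80, 0.95)

sp <- cor.test(score, distance, method = "spearman")
pe <- cor.test(score, distance, method = "pearson")

rho <- unname(sp$estimate)  # negative: high ratings go with small distances
r   <- unname(pe$estimate)

print(c(rho = rho, abs.rho = abs(rho), p.value = sp$p.value))
print(c(r = r, abs.r = abs(r)))
print(pe$conf.int)          # confidence interval for Pearson correlation
```

Taking the absolute value, as the report does, turns the negative coefficients into positive quality scores.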

With the default dist.fnc, the distance values can optionally be transformed through an arbitrary function specified in the transform argument (see pair.distances for details). Examples include transform=log (esp. for neighbour rank as a distance measure) and transform=function (x) 1/(1+x) (in order to transform distances into similarities). Note that Spearman rank correlation is not affected by any monotonic transformation, so the main evaluation results will remain unchanged.
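The invariance claim can be checked with a small base R sketch on made-up data. Note that a decreasing transformation such as 1/(1+x) flips the sign of the coefficient, so it is the absolute value, which is what the report shows, that stays the same.

```r
# Toy data (made up for illustration)
score    <- c(3.9, 3.5, 2.8, 1.6, 0.9, 0.4)
distance <- c(0.21, 0.35, 0.30, 0.62, 0.80, 0.95)

rho.raw <- cor(score, distance, method = "spearman")
rho.log <- cor(score, log(distance), method = "spearman")         # increasing transform
rho.sim <- cor(score, 1 / (1 + distance), method = "spearman")    # decreasing transform

print(c(raw = rho.raw, log = rho.log, sim = rho.sim))

# Pearson correlation, by contrast, generally changes under such transformations
print(c(raw = cor(score, distance), log = cor(score, log(distance))))
```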

If one or both words of a pair are not found in the DSM, the distance is set to a fixed value 10% above the maximum of all other DSM distances, or 10% below the minimum in the case of similarity values. This is done in order to avoid numerical and visualization problems with Inf values; the particular value used does not affect the rank correlation coefficient.
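A minimal base R sketch of this replacement follows. The helper fill.missing is hypothetical, and it reads "10% above the maximum" as 1.1 times the largest finite distance, which is an assumption about the exact formula; the point of the sketch is only that any surrogate beyond the maximum yields the same rank correlation.

```r
# Toy data (made up): two pairs are missing from the DSM (distance = Inf)
score    <- c(3.9, 3.5, 2.8, 1.6, 0.9, 0.4)
distance <- c(0.21, 0.35, Inf, 0.62, 0.80, Inf)

# Hypothetical helper: replace Inf by a multiple of the largest finite distance
# (assumes positive distances; the package's exact formula may differ)
fill.missing <- function(d, factor) {
  miss <- is.infinite(d)
  d[miss] <- factor * max(d[!miss])
  d
}

d.a <- fill.missing(distance, 1.1)  # 10% above the maximum
d.b <- fill.missing(distance, 5)    # a much larger surrogate value

rho.a <- cor(score, d.a, method = "spearman")
rho.b <- cor(score, d.b, method = "spearman")
print(c(rho.a = rho.a, rho.b = rho.b))
```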

With the default dist.fnc callback, additional arguments method and p can be used to select a distance measure (see dist.matrix for details); rank=TRUE can be specified in order to use neighbour rank as a measure of semantic distance.
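For instance, the following sketch compares the default distance measure with Manhattan distance. It assumes that the wordspace package and its RG65 and DSM_Vectors data sets are installed, and that "manhattan" is among the methods accepted by dist.matrix; the variable names and task labels are made up for illustration.

```r
# Assumption: wordspace with its RG65 and DSM_Vectors data sets is installed
if (requireNamespace("wordspace", quietly = TRUE)) {
  library(wordspace)
  res.def <- eval.similarity.correlation(RG65, DSM_Vectors,
                                         taskname = "RG65 (default)")
  res.man <- eval.similarity.correlation(RG65, DSM_Vectors, method = "manhattan",
                                         taskname = "RG65 (manhattan)")
  print(rbind(res.def, res.man))  # one row per configuration
}
```

Giving each run its own taskname makes the rows of the combined short report easy to tell apart.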

References

Finkelstein, Lev, Gabrilovich, Evgeniy, Matias, Yossi, Rivlin, Ehud, Solan, Zach, Wolfman, Gadi, and Ruppin, Eytan (2002). Placing search in context: The concept revisited. ACM Transactions on Information Systems, 20(1), 116--131.

Rubenstein, Herbert and Goodenough, John B. (1965). Contextual correlates of synonymy. Communications of the ACM, 8(10), 627--633.

Examples

eval.similarity.correlation(RG65, DSM_Vectors)

if (FALSE) {
plot(eval.similarity.correlation(RG65, DSM_Vectors, details=TRUE))
}
