Returns semantic neighborhood with semantic neighborhood size and density
SND(x,n=NA,threshold=3.5,tvectors=tvectors)
A list of three elements:
neighbors: A named numeric vector of all identified neighbors, with the names being these neighbors and the values their similarity to x
n_size: The number of neighbors as a numeric
SND: The semantic neighborhood density (SND) as a numeric
x: a character vector of length 1, or a numeric vector of length ncol(tvectors), i.e. with the same dimensionality as the semantic space
n: if specified as a numeric, determines the size of the neighborhood as the n nearest words to x. If n=NA (default), the semantic neighborhood is determined according to a similarity threshold (see threshold)
threshold: specifies the similarity threshold that determines whether a word is counted as a neighbor of x, following the method by Buchanan et al. (2001) (see Description below)
tvectors: the semantic space in which the computation is to be done (a numeric matrix where every row is a word vector)
Fritz Guenther
There are two principal approaches to determine the semantic neighborhood of a target word:
Set the size of the semantic neighborhood a priori to a fixed value n (e.g., Marelli & Baroni, 2015). The n closest words to the target word are counted as its semantic neighbors. The semantic neighborhood size is then necessarily n; the semantic neighborhood density is the mean similarity between these neighbors and the target word (see also plausibility).
Determine the semantic neighborhood based on a similarity threshold; all words whose similarity to the target word exceeds this threshold are counted as its semantic neighbors (e.g., Buchanan, Westbury, & Burgess, 2001). First, the similarity between the target word and all words in the semantic space is computed. These similarities are then transformed into z-scores, and the threshold is applied to these z-scores. Traditionally, the threshold is set to z = 3.5 (e.g., Buchanan, Westbury, & Burgess, 2001).
If a single target word is used as x, this target word itself (which always has a similarity of 1 to itself) is excluded from these computations so that it cannot be counted as its own neighbor.
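The two approaches can be illustrated with a minimal base R sketch. This is not the LSAfun implementation: the names toy_space, cosine_to_all and snd_sketch are hypothetical, the data are random, and the sketch assumes that the target word is removed before the z-transformation and that the density is the mean similarity of the neighbors under both approaches.

```r
# Toy semantic space: 50 random word vectors with 5 dimensions
# (hypothetical data, only for illustration)
set.seed(1)
toy_space <- matrix(rnorm(50 * 5), nrow = 50,
                    dimnames = list(paste0("w", 1:50), NULL))

# Cosine similarity between a vector v and every row of a matrix m
cosine_to_all <- function(v, m) {
  (m %*% v)[, 1] / (sqrt(rowSums(m^2)) * sqrt(sum(v^2)))
}

# Sketch of both approaches; target must be a row name of m
snd_sketch <- function(target, m, n = NA, threshold = 3.5) {
  sims <- cosine_to_all(m[target, ], m)
  # Exclude the target itself so it cannot be its own neighbor
  sims <- sims[names(sims) != target]
  if (is.na(n)) {
    # Threshold approach: z-transform, keep words above the threshold
    z <- (sims - mean(sims)) / sd(sims)
    neighbors <- sort(sims[z > threshold], decreasing = TRUE)
  } else {
    # Fixed-size approach: keep the n most similar words
    neighbors <- sort(sims, decreasing = TRUE)[seq_len(n)]
  }
  # SND is NaN if no word passes the threshold
  list(neighbors = neighbors,
       n_size = length(neighbors),
       SND = mean(neighbors))
}

res_n <- snd_sketch("w1", toy_space, n = 5)          # fixed neighborhood size
res_t <- snd_sketch("w1", toy_space, threshold = 1)  # z-score threshold
```

With the fixed-size approach n_size is always n, whereas with the threshold approach it depends on how many z-scores exceed the threshold and can be zero.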
Buchanan, L., Westbury, C., & Burgess, C. (2001). Characterizing semantic space: Neighborhood effects in word recognition. Psychonomic Bulletin & Review, 8, 531-544.
Marelli, M., & Baroni, M. (2015). Affixation in semantic space: Modeling morpheme meanings with compositional distributional semantics. Psychological Review, 122, 485-515.
cosine, plot_neighbors, compose
data(wonderland)
SND("cheshire",n=20,tvectors=wonderland)
SND("alice",threshold=2,tvectors=wonderland)