LSAfun (version 0.8.1)

centroid_analysis: Centroid Analysis

Description

Performs a centroid analysis for a set of words

Usage

centroid_analysis(responses, targets = NULL, split = " ", unique.responses = FALSE,
                  reference.list = NULL, verbose = FALSE, rank.responses = FALSE,
                  tvectors = tvectors)

Value

An object of class centroid_analysis. This object is a list consisting of:

$centroid

The centroid of the response vectors

$cosines

The cosine similarity between the response centroid and each target vector

$ranks.target

The rank of the response centroid in the neighborhood of each target vector, with reference to reference.list

$ranks.centroid

The rank of each target in the neighborhood of the response centroid, with reference to reference.list
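
As a rough illustration of what a neighborhood rank means here, the sketch below ranks two targets by their cosine similarity to a response centroid, using only base R. It is one plausible reading of the description of $ranks.centroid, not the package's own code: the toy matrix, the centroid values, and the helper cosine_sim() are all made up for illustration, and how the package handles ties or excluded responses is not covered.

```r
# Toy semantic space: one row per word (values are invented for illustration)
toy_space <- rbind(
  alice = c(1, 1, 1),
  hare  = c(0, 1, 0),
  king  = c(1, 0, 0),
  queen = c(0, 0, 1)
)

# Hypothetical response centroid (not computed from real responses)
centroid <- c(2, 3, 1)

# Hypothetical helper: cosine similarity between two numeric vectors
cosine_sim <- function(a, b) sum(a * b) / sqrt(sum(a^2) * sum(b^2))

# Candidate neighbors: here, every word in the toy space
reference_list <- rownames(toy_space)

# Similarity of every candidate neighbor to the centroid
sims <- apply(toy_space[reference_list, , drop = FALSE], 1, cosine_sim, b = centroid)

# Order candidates from most to least similar, then look up each target's position
ordered_neighbors <- names(sort(sims, decreasing = TRUE))
targets <- c("alice", "hare")
ranks_centroid <- match(targets, ordered_neighbors)
names(ranks_centroid) <- targets
```

Under these assumptions, a rank of 1 means the target is the nearest candidate to the centroid among the words in reference_list.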

Arguments

responses

a character vector specifying multiple single words

targets

(optional) a character vector specifying one or more single words

split

a character vector defining the character used to split the input strings into individual words (white space by default)

unique.responses

If TRUE, duplicated words in responses are discarded when computing the centroid. FALSE by default, so multiple instances of the same word will be included.

reference.list

(optional) A list of words in reference to which the neighborhood ranks are computed: only entries in reference.list will be considered as possible neighbors. Only relevant when target words are provided in targets. If reference.list = NULL (default), then rownames(tvectors) (all words in the semantic space) will be considered when computing ranks.

verbose

If TRUE (default: FALSE), a message will appear that specifies for which target the neighborhood ranks are currently being computed

rank.responses

If FALSE (default), the response words themselves will not be considered as possible neighbors when computing the neighborhood ranks.

tvectors

the semantic space in which the computation is to be done (a numeric matrix where every row is a word vector)
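
The effect of split and unique.responses on the centroid can be sketched in base R as follows. This is an illustration under stated assumptions rather than the function's internals: the toy matrix and the input strings are invented, and splitting is approximated with strsplit().

```r
# Toy semantic space: one row per word (values are invented for illustration)
toy_space <- rbind(
  mouse  = c(1, 0),
  rabbit = c(0, 1),
  cat    = c(1, 1)
)

# Hypothetical input in which one string contains two words
responses <- c("mouse rabbit", "cat", "mouse")

# Split the input strings into individual words on white space
words <- unlist(strsplit(responses, split = " ", fixed = TRUE))

# Analogue of unique.responses = FALSE: "mouse" counts twice
centroid_all <- colMeans(toy_space[words, , drop = FALSE])

# Analogue of unique.responses = TRUE: duplicates are dropped first
centroid_unique <- colMeans(toy_space[unique(words), , drop = FALSE])
```

With duplicates kept, the repeated word pulls the centroid toward its own vector; with duplicates removed, every distinct word contributes equally.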

Author

Fritz Guenther, Aliona Petrenco

Details

The centroid analysis computes the average vector for a set of words. The intended use case is that these words are responses to a given concept; the centroid then serves as the estimated vector representation of that concept.
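
The averaging step described above can be sketched in a few lines of base R. This is a minimal sketch, not the package's implementation: the toy matrix holds made-up values, and cosine_sim() is a hypothetical helper defined here only for the example.

```r
# Toy semantic space: one row per word (values are invented for illustration)
toy_space <- rbind(
  mouse  = c(1, 0, 0),
  rabbit = c(0, 1, 0),
  cat    = c(1, 1, 0),
  alice  = c(1, 1, 1)
)

responses <- c("mouse", "rabbit", "cat")

# Centroid: element-wise mean of the response vectors
centroid <- colMeans(toy_space[responses, , drop = FALSE])

# Hypothetical helper: cosine similarity between two numeric vectors
cosine_sim <- function(a, b) sum(a * b) / sqrt(sum(a^2) * sum(b^2))

# Similarity between the estimated concept vector and a target word
cos_alice <- cosine_sim(centroid, toy_space["alice", ])
```

Because cosine similarity ignores vector length, using the sum of the response vectors instead of their mean would give the same similarity values.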

References

Pugacheva, V., & Günther, F. (2024). Lexical choice and word formation in a taboo game paradigm. Journal of Memory and Language, 135, 104477.

See Also

cosine, Cosine, neighbors

Examples

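library(LSAfun)   # provides centroid_analysis() and the wonderland data set
data(wonderland)  # load the example semantic space
centroid_analysis(responses = c("mouse", "rabbit", "cat", "king", "queen"),
                  targets = c("alice", "hare"),
                  tvectors = wonderland)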
