PsychWordVec

Word Embedding Research Framework for Psychological Science.

An integrative toolbox for word embedding research that provides:

  1. A collection of pre-trained static word vectors in the .RData compressed format.
  2. A group of functions to process, analyze, and visualize word vectors.
  3. A range of tests to examine conceptual associations, including the Word Embedding Association Test (Caliskan et al., 2017) and the Relative Norm Distance (Garg et al., 2018), with permutation tests of significance.
  4. A set of training methods to locally train (static) word vectors from text corpora, including Word2Vec (Mikolov et al., 2013), GloVe (Pennington et al., 2014), and FastText (Bojanowski et al., 2017).
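
To make the association tests in item 3 more concrete, here is a minimal base-R sketch of the WEAT effect size as defined by Caliskan et al. (2017). It does not call PsychWordVec and is not the package's implementation; the helper names (cos_sim, assoc, weat_effect_size) and the toy 2-dimensional vectors are assumptions made up for illustration, and test_WEAT may compute its statistics differently.

```r
# Cosine similarity between two numeric vectors.
cos_sim <- function(u, v) sum(u * v) / sqrt(sum(u^2) * sum(v^2))

# Association of one word vector w with attribute sets A and B:
# mean cosine similarity to A minus mean cosine similarity to B.
# Rows of A and B are word vectors.
assoc <- function(w, A, B) {
  mean(apply(A, 1, cos_sim, v = w)) - mean(apply(B, 1, cos_sim, v = w))
}

# WEAT effect size: difference in mean association between target
# sets X and Y, divided by the SD of associations over all targets.
weat_effect_size <- function(X, Y, A, B) {
  sx <- apply(X, 1, assoc, A = A, B = B)
  sy <- apply(Y, 1, assoc, A = A, B = B)
  (mean(sx) - mean(sy)) / sd(c(sx, sy))
}

# Hypothetical toy vectors (not real embeddings): X and A lean toward
# the first axis, Y and B toward the second.
X <- rbind(c(1.0, 0.1), c(0.9, 0.2))
Y <- rbind(c(0.1, 1.0), c(0.2, 0.9))
A <- rbind(c(1.00, 0.00), c(0.95, 0.05))
B <- rbind(c(0.00, 1.00), c(0.05, 0.95))

d <- weat_effect_size(X, Y, A, B)
print(d)
```

A positive value indicates that the X targets are more strongly associated with A (relative to B) than the Y targets are; swapping X and Y flips the sign.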

⚠️ All users should update the package to version ≥ 0.3.2. Older versions may be slow and have other problems.

Author

Han-Wu-Shuang (Bruce) Bao 包寒吴霜

Install

install.packages('PsychWordVec')

Monthly Downloads

425

Version

2025.3

License

GPL-3

Maintainer

Han-Wu-Shuang Bao

Last Published

March 30th, 2025

Functions in PsychWordVec (2025.3)

plot_wordvec_tSNE

Visualize word vectors with dimensionality reduced using t-SNE.

normalize

Normalize all word vectors to the unit length 1.

plot_similarity

Visualize cosine similarity of word pairs.

tab_similarity

Tabulate cosine similarity/distance of word pairs.

reexports

Objects exported from other packages

plot_wordvec

Visualize word vectors.

sum_wordvec

Calculate the sum vector of multiple words.

pair_similarity

Compute a matrix of cosine similarity/distance of word pairs.

orth_procrustes

Orthogonal Procrustes rotation for matrix alignment.

plot_network

Visualize a (partial correlation) network graph of words.

tokenize

Tokenize raw text for training word embeddings.

train_wordvec

Train static word embeddings using the Word2Vec, GloVe, or FastText algorithm.

test_RND

Relative Norm Distance (RND) analysis.

test_WEAT

Word Embedding Association Test (WEAT) and Single-Category WEAT.

dict_expand

Expand a dictionary from the most similar words.

as_embed

Word vectors data class: wordvec and embed.

data_wordvec_subset

Extract a subset of word vectors data (with S3 methods).

cosine_similarity

Cosine similarity/distance between two vectors.

most_similar

Find the Top-N most similar words.

data_wordvec_load

Load word vectors data (wordvec or embed) from ".RData" file.

data_transform

Transform plain text of word vectors into wordvec (data.table) or embed (matrix), saved in a compressed ".RData" file.

demodata

Demo data (pre-trained using word2vec on Google News; 8000 vocab, 300 dims).

dict_reliability

Reliability analysis and PCA of a dictionary.

get_wordvec

Extract word vector(s).
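
Several of the functions listed above (normalize, cosine_similarity, pair_similarity, most_similar) rest on the same arithmetic. The following base-R sketch shows that arithmetic on a tiny made-up embedding matrix. It deliberately avoids calling PsychWordVec, so the helper names (normalize_rows, top_n_similar) and the toy values are assumptions for illustration only, not the package's API or data.

```r
# Hypothetical toy embedding: one row per word, three dimensions.
emb <- rbind(
  king  = c(0.90, 0.80, 0.10),
  queen = c(0.85, 0.90, 0.15),
  apple = c(0.10, 0.20, 0.95),
  pear  = c(0.15, 0.10, 0.90)
)

# Scale every row to unit length. Dividing a matrix by a vector of
# length nrow(M) recycles down the columns, so row i is divided by
# its own norm.
normalize_rows <- function(M) M / sqrt(rowSums(M^2))

# After normalization, cosine similarity is a plain dot product, so
# the full pairwise similarity matrix is one matrix product.
emb_n <- normalize_rows(emb)
sim   <- emb_n %*% t(emb_n)

# Cosine distance taken here as 1 - cosine similarity (an assumption
# about the convention, which is the common one).
dist_cos <- 1 - sim

# Top-n most similar words to a given word, excluding the word itself.
top_n_similar <- function(M, word, n = 3) {
  Mn   <- normalize_rows(M)
  sims <- drop(Mn %*% Mn[word, ])
  sims <- sims[names(sims) != word]
  head(sort(sims, decreasing = TRUE), n)
}

print(round(sim, 3))
print(top_n_similar(emb, "king", n = 2))
```

Normalizing once up front is the usual design choice when many similarity queries follow, because each query then reduces to a dot product rather than repeating the norm computation.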