A function to find (semi-)distinct words in a list of term vectors.
distinct_words(word_vector_list, threshold = 1)
word_vector_list: A list of character vectors in which we wish to find distinctive words.
threshold: An integer > 0 giving the number of occurrences a word must exceed in the other term vectors before it stops counting as distinct. Defaults to threshold = 1, meaning that all words appearing 1 or fewer times in the other term vectors are removed from those vectors before they are compared against the current vector. This yields pseudo-distinct words: a word that appears only threshold or fewer times in most term vectors, but many times in one vector in particular, is not discarded and is still reported as distinctive for that vector.
A list of distinct word vectors, one for each term vector in word_vector_list.
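The thresholding described above can be illustrated with a minimal sketch in base R. This is not the package's implementation: distinct_words_sketch is a hypothetical name, and the sketch assumes that occurrences are counted across all of the other term vectors pooled together, which the description does not state explicitly.

```r
# Hypothetical re-implementation for illustration only; not the package source.
# Assumption: word counts are pooled over all *other* term vectors.
distinct_words_sketch <- function(word_vector_list, threshold = 1) {
  lapply(seq_along(word_vector_list), function(i) {
    current <- word_vector_list[[i]]
    # Pool every other vector and count how often each word occurs there.
    others <- as.character(unlist(word_vector_list[-i], use.names = FALSE))
    counts <- table(others)
    # Words occurring more than `threshold` times elsewhere count as shared.
    shared <- names(counts)[counts > threshold]
    # Keep the unique words of the current vector that are not shared.
    setdiff(current, shared)
  })
}

docs <- list(
  c("apple", "banana", "apple", "cherry"),
  c("banana", "date", "banana"),
  c("cherry", "egg", "banana")
)

result <- distinct_words_sketch(docs, threshold = 1)
# result[[1]] is c("apple", "cherry")
# result[[2]] is "date"
# result[[3]] is c("cherry", "egg")
```

Under these assumptions "cherry" is reported for both the first and the third vector: relative to either one it occurs only once in the remaining vectors, so it falls at or below the threshold and is ignored during the comparison. "banana" occurs more than once elsewhere in every case and is never reported.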