a.k.a. Word-Word matrices or Co-occurrence vectors

Process

  1. Requires a large volume of text data
  2. Apply basic preprocessing steps: tokenization, lemmatization, etc.
  3. Count the number of times word u appears with word v (see the sketch after this list)
  4. The meaning of a word u is its vector of counts (called its word vector)
    • meaning(u) = [count(u, v1), count(u, v2), …]

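A minimal sketch of steps 1–4, assuming the corpus has already been tokenized (and optionally lemmatized) into lists of tokens; the function and variable names are illustrative, not from the notes:

```python
from collections import Counter, defaultdict

def cooccurrence_counts(sentences, k=2):
    """Count how often each target word u co-occurs with each context word v
    inside a window of ±k tokens."""
    counts = defaultdict(Counter)              # counts[u][v] = count(u, v)
    for tokens in sentences:
        for i, u in enumerate(tokens):
            window = tokens[max(0, i - k):i] + tokens[i + 1:i + 1 + k]
            for v in window:
                counts[u][v] += 1
    return counts

corpus = [["the", "cat", "sat", "on", "the", "mat"],
          ["the", "dog", "sat", "on", "the", "rug"]]
counts = cooccurrence_counts(corpus, k=2)
print(counts["sat"])    # the: 4, on: 2, cat: 1, dog: 1
```
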
We get:

  • A matrix X of size n × m, where n = |V| (number of target words) and m = |Vc| (number of context words)
    • usually a square matrix (the same vocabulary serves as both targets and contexts)
  • A context window of ±k words (to the left and right of the target); see the matrix sketch below
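A small follow-up sketch (reusing the hypothetical `cooccurrence_counts` helper above) that packs the counts into the matrix X; when the same vocabulary serves as both targets and contexts, X is square, as noted above:

```python
import numpy as np

def build_matrix(counts):
    targets = sorted(counts)                                        # rows: target words, n = |V|
    contexts = sorted({v for ctr in counts.values() for v in ctr})  # columns: context words, m = |Vc|
    X = np.zeros((len(targets), len(contexts)))
    for i, u in enumerate(targets):
        for j, v in enumerate(contexts):
            X[i, j] = counts[u][v]
    return X, targets, contexts

X, targets, contexts = build_matrix(counts)
print(X.shape)    # (n, m); square here, since every word occurs as both target and context
```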

Pros

  • compute similarities between words using cosine similarity (sketch below)
  • visualize words
  • dimensions are meaningful (each one corresponds to a context word), which supports Explainable AI
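For example, cosine similarity between two rows of X (a sketch on top of the hypothetical matrix built above):

```python
import numpy as np

def cosine(u_vec, v_vec):
    return float(u_vec @ v_vec / (np.linalg.norm(u_vec) * np.linalg.norm(v_vec)))

i, j = targets.index("cat"), targets.index("dog")
print(cosine(X[i], X[j]))    # 1.0 in the toy corpus: 'cat' and 'dog' share the same contexts
```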

Cons

  • raw co-occurrence counts are skewed by very frequent but uninformative words, which motivates the weighting scheme and distance discount below
  • the resulting vectors are high-dimensional and sparse

Weighting scheme
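The notes do not name a specific weighting scheme; PPMI (positive pointwise mutual information) is a common choice for word-word matrices and is shown here only as an illustrative assumption:

```python
import numpy as np

def ppmi(X, eps=1e-12):
    total = X.sum()
    p_uv = X / total                          # joint probability estimates
    p_u = p_uv.sum(axis=1, keepdims=True)     # marginal over context words
    p_v = p_uv.sum(axis=0, keepdims=True)     # marginal over target words
    pmi = np.log2((p_uv + eps) / (p_u * p_v + eps))
    return np.maximum(pmi, 0.0)               # keep only positive associations

X_ppmi = ppmi(X)
```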

Distance Discount
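A common form of distance discounting (used, for example, in GloVe's counting step) adds 1/d instead of 1 for a context word that is d positions away from the target; the sketch below assumes that convention, since the notes only name the idea:

```python
from collections import Counter, defaultdict

def cooccurrence_counts_discounted(sentences, k=2):
    """Like cooccurrence_counts above, but a context word d positions away adds 1/d."""
    counts = defaultdict(Counter)
    for tokens in sentences:
        for i, u in enumerate(tokens):
            for d in range(1, k + 1):
                for j in (i - d, i + d):          # d positions to the left / right
                    if 0 <= j < len(tokens):
                        counts[u][tokens[j]] += 1.0 / d
    return counts
```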