• unsupervised Topic Modelling algorithm
  • Generative statistical model
  • Assumptions
    • each document is a mixture of topics
    • each word occurrence in a document is attributable to one of that document's topics (the same word can still appear in several topics)
  • number of topics k must be specified in advance
  • Output
    • list of words (ranked by probability) associated with a topic of interest
    • probability that a word of interest is associated with a topic
    • proportion of a document that pertains to a topic
    • does not output a name or label for a topic; the topic has to be interpreted from its top-ranked words
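
The assumptions and outputs listed above can be illustrated with a small sketch. It assumes the model being described is an LDA-style topic model and fits it with collapsed Gibbs sampling, which is one common inference method and not necessarily the one intended here. The names `fit_topic_model` and `top_words`, the toy documents, and the values of `alpha`, `beta`, and `iters` are all hypothetical choices made for illustration.

```python
import random


def fit_topic_model(docs, k, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Illustrative collapsed Gibbs sampler for an LDA-style model.

    docs: list of tokenised documents (lists of strings)
    k:    number of topics, fixed in advance
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    doc_topic = [[0] * k for _ in docs]             # tokens of doc d assigned to topic t
    topic_word = [[0] * n_words for _ in range(k)]  # times word w is assigned to topic t
    topic_total = [0] * k                           # total tokens assigned to topic t
    assign = []                                     # current topic of every token

    # Start from a random topic for every word occurrence.
    for d, doc in enumerate(docs):
        topics = []
        for w in doc:
            t = rng.randrange(k)
            topics.append(t)
            doc_topic[d][t] += 1
            topic_word[t][word_id[w]] += 1
            topic_total[t] += 1
        assign.append(topics)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid, t = word_id[w], assign[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][t] -= 1
                topic_word[t][wid] -= 1
                topic_total[t] -= 1
                # Resample its topic given all other assignments.
                weights = [
                    (doc_topic[d][j] + alpha)
                    * (topic_word[j][wid] + beta)
                    / (topic_total[j] + n_words * beta)
                    for j in range(k)
                ]
                t = rng.choices(range(k), weights=weights)[0]
                assign[d][i] = t
                doc_topic[d][t] += 1
                topic_word[t][wid] += 1
                topic_total[t] += 1

    # phi[t][w]: probability of word w under topic t
    phi = [
        [(topic_word[t][w] + beta) / (topic_total[t] + n_words * beta)
         for w in range(n_words)]
        for t in range(k)
    ]
    # theta[d][t]: proportion of document d that pertains to topic t
    theta = [
        [(doc_topic[d][t] + alpha) / (len(doc) + k * alpha) for t in range(k)]
        for d, doc in enumerate(docs)
    ]
    return vocab, phi, theta


def top_words(vocab, phi, topic, n=3):
    """Words of one topic ranked by probability, highest first."""
    ranked = sorted(range(len(vocab)), key=lambda w: phi[topic][w], reverse=True)
    return [(vocab[w], phi[topic][w]) for w in ranked[:n]]


# Toy corpus, invented for this example.
docs = [
    "cat dog pet vet cat".split(),
    "dog pet food cat".split(),
    "stock market trade price".split(),
    "market price stock bank".split(),
]
vocab, phi, theta = fit_topic_model(docs, k=2)

print(top_words(vocab, phi, topic=0))  # ranked word list for one topic
print(phi[1][vocab.index("market")])   # probability of one word under one topic
print(theta[0])                        # topic proportions of the first document
```

The three printed values correspond to the three outputs listed above. Topics are identified only by an index (0 or 1), so any human-readable label has to be inferred from the top-ranked words.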