• Clusters are the hills in the normalized density function of distribution of the feature space.
    • The distribution of the feature space can have varying densities → more points in a space makes it more dense
    • normalized density function → returns the density
  • peak or mode of the hill is the cluster center.
  • hill climbing function → every point is assigned to the hill peak it climbs to.

Algorithm

For some n-dimensional feature space

  • set the mean of every point as the point itself.
  • For every point, compute a mean within a window of size w
    • simple mean or weighted mean → closer the points are to the sample point, higher the weight
  • shift the window to the new mean
  • repeat till convergence
    • mean doesn’t change more than some threshold
    • convergence → hill peak or mode
  • points that converge to the same mode are in the same cluster

Pros

  • number of clusters do not need to be specified
  • robust to outliers (they form their own tiny clusters)

Cons

  • computationally expensive (mean for each point)
  • sensitive to window size w