Given a set of points with a notion of distance between the points, we want to group the points into some number of clusters such that:

Typically:

image.png

Note that clustering is an unsupervised ML method:

Clustering is a hard problem due to dimensionality (”the curse of dimensionality”).

We must quantify the distance between objects.