Many problems involve finding a “similar set”:
e.g. Near Neighbors
e.g. Scene Completion
Given an incomplete image (i.e. portion is missing), can we “repair” it with similar images?
Want to find images with similar features
e.g. Pages with similar words
e.g. Customers who purchased similar products
Given high-dimensional datapoints $x_1, x_2, ..., x_n$ and some distance function $d(x_1, x_2)$:
Recall that we can find identical items in a collection of $n$ via hash table
Locality Sensitive Hashing (LSH)
Suppose that for objects $X, X'$, we have that $d(X, X') = c$.
Given $X, h(X)$: