KD-trees are one approach for efficient nearest neighbor search. But as we mentioned, there are other approaches we could consider. So let's look at one of these, called locality sensitive hashing. Why might we want to consider an approach other than KD-trees? Well, KD-trees are really cool. They're a very intuitive way to think about storing data, and as we saw, they can lead to some significant efficiencies in performing nearest neighbor search. But there are a few issues. One is that KD-trees are not the simplest things to implement: you have to construct this tree structure, and that can be pretty challenging. The other, which we also saw, is that as the dimensionality of our data increases, nearest neighbor search with KD-trees can become really, really inefficient.

So let's talk about this a little bit more. What happens in high dimensions is that, basically, every point is far away from every other point. This is something we mentioned before. If we think about our KD-tree partitioning, remember that in a high-dimensional space the partitions form hypercubes. Then if we have our query point, it's unlikely there will be any point within the query's own hypercube, or even close by. So we search around, find some point, and that becomes our nearest neighbor so far in the algorithm we specified. And remember what we do next: we draw a radius to that nearest neighbor so far, and we think of, now in high dimensions, this big hypersphere. We can only prune a bin if the distance to that bin is greater than the distance to the nearest neighbor so far. But what happens in high dimensions is that this big sphere, the radius to our nearest neighbor so far, intersects lots and lots of these hypercubes in at least one dimension. So we can't prune them, and what ends up happening is that we have to search many, many partitions. You can show that, under some conditions, you actually need to search over 2 to the d different hypercubes, or little partitions, which completely defeats the purpose of the efficiencies gained by KD-trees, unless n, the number of observations, is enormously large, much, much greater than 2 to the d.

So what can we do? Well, one thing we can think about doing is moving away from performing exact nearest neighbor search. This is something we discussed when we talked about KD-trees: we showed how KD-trees can be used for approximate nearest neighbor search, and that's going to be the focus of this section. We're going to assume that we don't fully trust our distance metric as being exactly the way we want to characterize distances. And often, in a lot of applications, you just want to find a neighbor that's pretty close to the given query point; it doesn't necessarily have to be exactly the nearest neighbor. For a lot of retrieval tasks, that will provide enough performance for the given task. But here, in contrast to what we did for approximate nearest neighbor search with KD-trees, we're going to look at methods where we can provide a probabilistic guarantee on the approximation we're making.
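To make the high-dimensional behavior concrete, here is a minimal sketch (not from the lecture; the point counts and dimensions are arbitrary choices for illustration) showing how, as the dimension d grows, the distance from a query to its nearest neighbor approaches the distance to its farthest neighbor. When that ratio is close to 1, the pruning sphere around the "nearest neighbor so far" overlaps almost every partition, which is exactly why the KD-tree search degrades toward examining on the order of 2 to the d bins.

```python
# Sketch: distance concentration in high dimensions.
# As d grows, nearest and farthest distances from a random query
# become nearly equal, so the KD-tree pruning radius overlaps
# almost every hypercube partition.
import numpy as np

rng = np.random.default_rng(0)
n = 1000  # number of stored points (arbitrary, for illustration)

for d in [2, 10, 100, 1000]:
    X = rng.random((n, d))        # points in the unit hypercube
    q = rng.random(d)             # a random query point
    dists = np.linalg.norm(X - q, axis=1)
    ratio = dists.min() / dists.max()
    print(f"d = {d:4d}   nearest/farthest distance ratio = {ratio:.3f}")
```

Running this, the ratio climbs toward 1 as d increases, which mirrors the "every point is far away from every other point" observation above.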
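As a preview of the kind of method we'll look at, here is a minimal sketch of one common locality sensitive hashing variant, random-hyperplane hashing. This is an assumption about the flavor of LSH, not the lecture's own construction (which hasn't been described yet): each random hyperplane contributes one bit of a hash key, and at query time we only search the bin the query hashes to, trading exactness for speed.

```python
# Sketch of one LSH variant (random-hyperplane hashing), for intuition only.
import numpy as np
from collections import defaultdict

class RandomHyperplaneLSH:
    def __init__(self, dim, num_bits=10, seed=0):
        rng = np.random.default_rng(seed)
        # Each random hyperplane contributes one bit of the hash key.
        self.planes = rng.normal(size=(num_bits, dim))
        self.table = defaultdict(list)

    def _key(self, x):
        # Bit i is 1 if x lies on the positive side of hyperplane i.
        return tuple((self.planes @ x > 0).astype(int))

    def index(self, X):
        self.X = np.asarray(X)
        for i, x in enumerate(self.X):
            self.table[self._key(x)].append(i)

    def query(self, q):
        # Approximate search: only the query's own bin is examined,
        # so the true nearest neighbor may be missed if it hashes elsewhere.
        candidates = self.table.get(self._key(q), [])
        if not candidates:
            return None
        d = np.linalg.norm(self.X[candidates] - q, axis=1)
        return candidates[int(np.argmin(d))]

# Usage example with made-up data.
rng = np.random.default_rng(1)
X = rng.normal(size=(5000, 50))
lsh = RandomHyperplaneLSH(dim=50)
lsh.index(X)
print("approximate nearest neighbor index:", lsh.query(rng.normal(size=50)))
```

The appeal, as discussed above, is that constructions like this come with a probabilistic guarantee: nearby points are likely to land in the same bin, so the returned neighbor is close with high probability even though it may not be the exact nearest neighbor.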