[MUSIC] Let's now discuss how KD-trees can aid in efficiently performing nearest neighbor search. To start with, we have our query point, shown with this green circle here. The first thing we're going to do is traverse the tree all the way down to the leaf node that contains that query point. What that means is, we're going to check this point against the condition at each one of these internal nodes until we get to a leaf node. So to begin with, we check the first condition. Remember, it was a split on x1: whether or not x1 is greater than or less than 0.5. In this case, x1 for our query point is greater than 0.5. So we say yes, and we traverse down this side of the tree. Then we check the next condition, which is whether x2 is greater than some value; I'll just call it v, so let's call this point here v. And here we see that, yes, x2 is greater than v. So again we have a yes, and we traverse down this way. We keep doing that until we arrive at a leaf node. Okay, once we're at this leaf node, the first thing we do is start searching for nearest neighbors, because our hope is that this box, local to our query point, is a likely place to find the nearest neighbor. So we compute the distances to each point contained in this box. This node has a list of points: the query point falls here, along with these two other points. And we compute those distances; that's what's stated right here on this slide. Now that we've computed these distances, we record the distance to the nearest neighbor found so far. Let's call this r, the distance to the nearest neighbor found so far. Then a natural question is: is that it? Are we done? Does the nearest neighbor live within the blue box that we just searched?
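The descent-to-a-leaf step described above can be sketched in code. This is a minimal illustration, not the course's implementation: the `Node` class, the tiny hand-built tree, and all names here are hypothetical, assuming 2-D points, axis-aligned splits like the x1 > 0.5 split on the slide, and leaves that store their points directly.

```python
import math

class Node:
    def __init__(self, axis=None, threshold=None, left=None, right=None, points=None):
        self.axis = axis            # split dimension: 0 for x1, 1 for x2 (internal nodes)
        self.threshold = threshold  # split value at this internal node
        self.left = left            # subtree with point[axis] <= threshold
        self.right = right          # subtree with point[axis] > threshold
        self.points = points        # points stored here (leaf nodes only)

def find_leaf(node, query):
    """Walk from the root down to the leaf whose box contains the query point,
    checking the split condition at each internal node."""
    while node.points is None:
        node = node.right if query[node.axis] > node.threshold else node.left
    return node

def nearest_in_leaf(leaf, query):
    """Return the closest point in the leaf and the distance r to it --
    the nearest neighbor found so far."""
    best, r = None, math.inf
    for p in leaf.points:
        d = math.dist(p, query)  # Euclidean distance (Python 3.8+)
        if d < r:
            best, r = p, d
    return best, r

# Tiny hand-built tree mirroring the lecture's first split on x1 at 0.5.
tree = Node(axis=0, threshold=0.5,
            left=Node(points=[(0.2, 0.3), (0.1, 0.8)]),
            right=Node(axis=1, threshold=0.5,
                       left=Node(points=[(0.7, 0.2)]),
                       right=Node(points=[(0.6, 0.9), (0.9, 0.8)])))

leaf = find_leaf(tree, (0.8, 0.7))           # yes on x1 > 0.5, yes on x2 > 0.5
best, r = nearest_in_leaf(leaf, (0.8, 0.7))  # nearest neighbor found so far
```

Note that `best` and `r` are only candidates at this point; as the rest of the walkthrough shows, the true nearest neighbor may live in a neighboring box.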
No, not necessarily, right? That should be obvious now that we've drawn this fuchsia circle: we see that there are other points, closer to this green point, that live in other neighboring boxes. So what we're going to do is continue searching. We go from this node that we already examined, so that one's done, and we backtrack: we go back up the tree and start checking out other branches. Okay, so then we go down to a leaf node. We checked this one, and now we've traversed here to this leaf. Now we compute distances to all of these points and compare them to the distance to our nearest neighbor found so far. And in this case, the distance to the nearest of these points is smaller than what we had before. So this becomes our new nearest neighbor, and this becomes our new distance to the nearest neighbor. And then we continue searching. We've now checked this node, we've checked this node, and we're going to go and check this other node. But what this blue box here shows is the tight bounding box for that node. Remember, it's the box that contains all the points within that node. And what we see is that the distance to that box is greater than the distance to our nearest neighbor found so far. So what this implies is that we're going to prune this branch. We're not even going to compute the distance to any point in this leaf node, because we know that no point in there can be closer than the nearest neighbor we've found so far. Okay, so now we've checked this node, we've checked this node, we've pruned this node. We go up and start going down this other branch. What happens here?
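The pruning test at the heart of this step is worth spelling out. Here is a small sketch, under the assumption that each node stores its tight bounding box as per-dimension `(mins, maxs)` tuples; the function names are my own, not from the course.

```python
import math

def dist_to_box(query, mins, maxs):
    """Distance from the query to the closest point of an axis-aligned
    bounding box; zero if the query already lies inside the box."""
    total = 0.0
    for q, lo, hi in zip(query, mins, maxs):
        if q < lo:
            total += (lo - q) ** 2
        elif q > hi:
            total += (q - hi) ** 2
        # if lo <= q <= hi, this dimension contributes nothing
    return math.sqrt(total)

def can_prune(query, mins, maxs, r):
    """True when no point inside the box can possibly beat the current
    distance r to the nearest neighbor found so far."""
    return dist_to_box(query, mins, maxs) > r
```

Because `dist_to_box` is a lower bound on the distance to every point in the box, a single comparison against r lets us skip the whole subtree without computing any per-point distances.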
Again, the distance to this box exceeds the distance to our nearest neighbor, so we prune. Then we go up and start searching the other branch, but all the remaining boxes also exceed the distance to our nearest neighbor. So what we see is that we've been able to prune quite a lot of the search space. Just go back and look at all these points: in the brute-force method, we would have had to compute distances to all of them. But here we only had to compute distances to a small number of data points to find our nearest neighbor. So this is where KD-trees are so useful in performing efficient nearest neighbor search: when you have lots and lots of observations, especially if you're going to be performing lots and lots of queries. Because there is some computational cost to building the KD-tree, but if you're going to be using it again and again, it can provide significant savings. [MUSIC]
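Putting the whole walkthrough together, the descend, check, backtrack, and prune steps can be combined into one recursive search. This is a sketch under my own assumptions (a hypothetical dict-based node layout where every node stores its tight bounding box), and the counters are there only to make the savings over brute force visible.

```python
import math

def dist_to_box(query, mins, maxs):
    """Lower bound: distance from the query to an axis-aligned box (0 if inside)."""
    return math.sqrt(sum((lo - q) ** 2 if q < lo else (q - hi) ** 2 if q > hi else 0.0
                         for q, lo, hi in zip(query, mins, maxs)))

def nn_search(node, query, best=None, r=math.inf, stats=None):
    """Backtracking nearest neighbor search with bounding-box pruning."""
    if stats is None:
        stats = {"distances": 0, "pruned": 0}
    if dist_to_box(query, node["mins"], node["maxs"]) > r:
        stats["pruned"] += 1             # whole subtree skipped
        return best, r, stats
    if "points" in node:                 # leaf: compute actual distances
        for p in node["points"]:
            stats["distances"] += 1
            d = math.dist(p, query)
            if d < r:
                best, r = p, d
        return best, r, stats
    # Visit the child whose region contains the query first, then backtrack
    # to the other child (which may then be pruned by the test above).
    if query[node["axis"]] > node["threshold"]:
        near, far = node["right"], node["left"]
    else:
        near, far = node["left"], node["right"]
    best, r, stats = nn_search(near, query, best, r, stats)
    best, r, stats = nn_search(far, query, best, r, stats)
    return best, r, stats

# Hand-built tree with tight bounding boxes, split on x1 = 0.5 then x2 = 0.5.
tree = {"axis": 0, "threshold": 0.5, "mins": (0.1, 0.1), "maxs": (0.9, 0.9),
        "left": {"mins": (0.1, 0.2), "maxs": (0.3, 0.4),
                 "points": [(0.1, 0.2), (0.3, 0.4)]},
        "right": {"axis": 1, "threshold": 0.5, "mins": (0.6, 0.1), "maxs": (0.9, 0.9),
                  "left": {"mins": (0.7, 0.1), "maxs": (0.8, 0.2),
                           "points": [(0.7, 0.2), (0.8, 0.1)]},
                  "right": {"mins": (0.6, 0.8), "maxs": (0.9, 0.9),
                            "points": [(0.6, 0.9), (0.9, 0.8)]}}}

best, r, stats = nn_search(tree, (0.85, 0.75))
# Only one leaf's distances are computed; the other two leaves are pruned,
# whereas brute force would compute distances to all six points.
```

The query-side child is searched first on purpose: finding a small r early makes the bounding-box test reject the far child more often, which is exactly the pruning behavior the lecture illustrates.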