[MUSIC] Okay, so that's one way to retrieve a document of interest. Just take all articles out there, scan over them, and find the one that's most similar according to the metric that we define. But another thing we might be interested in doing is clustering documents that are related, so for example. You might have a whole bunch of articles about sports or world news or different things like this, and if we can structure our corpus in this way, if a person is reading an article that's about sports, then we can very quickly search over all the other articles about sports, instead of looking at every article that's out there in the entire corpus. But the challenge here is the fact that these articles aren't going to have labels. It's not gonna be like the New York Times, where you go and somebody says, this is an education article. Okay, and all we have are articles, and what we like to do, or discover these underlying groups of articles. Okay, so the goal is to discover these groups or clusters of related articles, and like I mentioned, one might represent a set of articles like sports, and another one a set of articles about world news. For the time being, let's actually assume that someone provides us with labels, so, somebody goes through and reads every article or at least a large sub-set of articles in our corpus and labels them and says, okay, these articles, these are all about sports. And these ones, these are about world news. And these about entertainment. And these ones are about science. So we have some set of articles that have labels associated with them. Well, in this case, when we have our query article and we like to sign it to a cluster, this ends up being just a multi-class classification problem. Because the question is just, here is my query article, I don't know the label associated with it, and I have a bunch of label document. I have the labels world news, science, sports, entertainment, technology, and I just wanna classify which class this article belongs to, okay? So this is the question. It's just a multi class classification problem. So if that were the case, that would be an example of a supervise learning problem. [MUSIC]