1 00:00:00,648 --> 00:00:03,804 [MUSIC] 2 00:00:03,804 --> 00:00:07,844 So let's start by describing this task in a little bit more detail. 3 00:00:07,844 --> 00:00:11,861 Okay, so let's say we're sitting here, reading an article. 4 00:00:15,019 --> 00:00:17,202 Here we are, reading our article. 5 00:00:17,202 --> 00:00:19,811 And actually it's Carlos who's reading this article, 6 00:00:19,811 --> 00:00:21,720 because this is an article about soccer. 7 00:00:23,870 --> 00:00:28,670 Or, as he would call it, football, if he was wearing his Argentina jersey, 8 00:00:29,670 --> 00:00:35,470 or footsie ball if he was wearing his Brazil jersey. And 9 00:00:35,470 --> 00:00:37,925 clearly, I'm not pronouncing either of those words correctly. 10 00:00:37,925 --> 00:00:39,970 Carlos can correct me later. 11 00:00:39,970 --> 00:00:43,365 But the point is that he likes this article, and what we'd like 12 00:00:43,365 --> 00:00:47,505 to do is retrieve another article that he might be interested in reading. 13 00:00:49,363 --> 00:00:50,746 But the question is, how do we do this? 14 00:00:50,746 --> 00:00:54,123 There are lots and lots of articles out there, and I can't expect him to go and 15 00:00:54,123 --> 00:00:57,890 read each of them and say yes, he's interested or no, he's not. 16 00:00:57,890 --> 00:01:01,913 So we'd like to think of a way to automatically retrieve a document that 17 00:01:01,913 --> 00:01:03,619 might be of interest to him. 18 00:01:03,619 --> 00:01:07,612 The questions here are, first, how do we measure similarity between articles? 19 00:01:07,612 --> 00:01:11,851 We need to have that in order to say that this article is similar to the one he's 20 00:01:11,851 --> 00:01:15,040 reading now and might also be of interest to him. 21 00:01:15,040 --> 00:01:18,010 Or that, here's a large set of articles that are very different and 22 00:01:18,010 --> 00:01:20,490 probably are not of interest to him.
23 00:01:20,490 --> 00:01:24,800 And then the second question is, how are we gonna search over the articles that 24 00:01:24,800 --> 00:01:28,146 exist out there and retrieve the next article to recommend? 25 00:01:28,146 --> 00:01:31,699 [MUSIC]