Okay, let's step back and reflect on what we've learned so far. We've got a basic idea of how documents are stored on the web and how they can be retrieved using text indexes. But we still haven't really understood how big the problem is. We don't know, for example, how many documents there are on the web. We don't know how best to order the documents when they are returned from a search. And we don't really understand what any of this has to do with intelligence, memory, and the kinds of things we're trying to understand about predictive intelligence. But bear with me for a while, and we'll get to these subjects very soon.

So, now that we know what an index is and how to create one, how many web pages do you think are actually indexed by a search engine like Google or Bing, for that matter? Is it two to five billion? 30 to 40 billion? 200 to 300 billion? Or in the trillions? Now, the number of possible URLs of web pages is probably in the trillions, so that's not really the answer; we can rule that one out immediately. Well, what about the rest? Remember, we're not asking how many web pages there are, but how many web pages are indexed by a search engine. Think about it. You can actually find this out by doing a few experiments with your browser.

So, how did you do? The correct answer, according to me, is between 30 and 40 billion. And this is how I reason: if you search for a common word such as "a", "in", or "the" on Google, you get around 20 to 25 billion results. Assuming that not all pages are in English, my guess is between 30 and 40 billion.

Now, let's return to our problem of arranging the results of a search query. As we have seen, the web has a lot of pages, billions and billions. What if the result set is very large? For example, when we searched for "a" or "the" on Google, we got 20 to 25 billion results.
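The back-of-the-envelope reasoning above can be written out as a tiny calculation. The specific numbers here are assumptions for illustration: 22.5 billion is just the midpoint of the observed 20 to 25 billion result counts, and the 60% English share of indexed pages is a rough guess, not a measured figure.

```python
# Back-of-the-envelope estimate of index size, following the lecture's reasoning:
# a search for a very common word matches essentially every English page, so
# scaling up by the (assumed) fraction of English pages estimates the whole index.

hits_for_common_word = 22.5e9   # midpoint of the observed 20-25 billion results
english_fraction = 0.6          # assumed share of English pages (illustrative)

estimated_index_size = hits_for_common_word / english_fraction
print(f"Estimated pages indexed: {estimated_index_size / 1e9:.1f} billion")
```

With these assumed numbers the estimate comes out around 37 billion, consistent with the 30 to 40 billion answer above.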
If you have q terms in your query, such as "the quick brown fox", you'll still get many hundreds of thousands of results, and assembling these results in the way we discussed earlier, in terms of which results match best, is still pretty costly if the number of results is very large.

Then there's another problem. Suppose we search for something like "Clinton plays India cards". Well, when I did this a little while back, I got results of two types. One was about Hillary Clinton visiting India but Islamabad not being on the cards, or, more recently, Mitt Romney playing the Hillary Clinton card, or something like that, about American politics. And another set of results was about a company called Clinton Cards, which was acquired, which shut down, which closed shop, and anything to do with that topic.

So, similarity from a search index is quite different from the importance of a web page. "Clinton cards India" may actually have many, many results which talk about the second topic, but clearly the first topic is more popular and more important. How does Google figure this out?
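To see why assembling multi-term results gets costly, here is a minimal sketch of intersecting the index's posting lists for a query like "the quick brown fox". The posting lists below are toy data I've made up; real lists for common words run to billions of document IDs, and this merge has to walk the full length of each list.

```python
# Minimal sketch of conjunctive query processing over an inverted index.
# Each term maps to a sorted list of document IDs (its posting list); a
# multi-term query is answered by intersecting those lists.

def intersect(p1, p2):
    """Merge-intersect two sorted docID lists in O(len(p1) + len(p2)) time."""
    i, j, out = 0, 0, []
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            out.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            i += 1
        else:
            j += 1
    return out

# Toy index: sorted docID lists for each query term (illustrative only).
postings = {
    "quick": [1, 4, 7, 9, 12],
    "brown": [2, 4, 9, 11, 12],
    "fox":   [4, 6, 9, 12, 15],
}

# Intersect shortest list first to keep intermediate results small.
terms = sorted(postings, key=lambda t: len(postings[t]))
result = postings[terms[0]]
for t in terms[1:]:
    result = intersect(result, postings[t])

print(result)  # docIDs containing all three terms
```

Even with this shortest-first trick, when every posting list is huge the cost of producing and then ranking the matching set remains large, which is exactly the problem raised above.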