Now let's see how we could create a text index for a bunch of documents, such as the three documents we had earlier. We go through each document one by one, reading each word in turn. When we read the word "the", we insert an entry against "the" in the final structure that we want to create. Similarly, when we read the word "quick", we enter A.com, the ID of this document, against "quick". Similarly for "brown", for "over", and for the other words in this document. Moving on to the next document, we do the same thing, entering each word with B.com this time instead of A.com, because now we are processing the second document. Similarly for the third document. But this time, since "the" is already present in the index, we merely append the entry C.com after A.com against "the", rather than create a new entry. Similarly for "lazy" and "bird". Finally, for "worm", we enter a new entry with just C.com against it.

How would we write a program to do this? Now, we'll go a little slowly while describing this program, because I'm not assuming everybody is familiar with programming in Python or with object-oriented programming, and even if you are, I'm assuming that you need a bit of a refresher. So we'll create a class, index, which is the data structure that we are going to create. It will have a function that creates the index: given a list of documents D, this function will populate our index in this manner.

So what we'll do is go through each document in D. In Python, what this means is that if D is a list, "for d in D" will iterate over each document in D, one by one. And for each such document small d, we'll go through each word w in that document. Once again, we are assuming that the document small d in this list of documents D is itself a list, a list of words this time. So we'll go through the document small d, word by word.
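To make the two nested loops concrete, here is a minimal sketch in Python. The `Document` helper, its `id` attribute, and the word lists are assumptions made for illustration; the contents of the three documents from earlier are not reproduced here, so these lists are made up.

```python
class Document(list):
    """A list of words that also carries an id (hypothetical helper)."""

    def __init__(self, doc_id, words):
        super().__init__(words)
        self.id = doc_id


# Made-up contents standing in for the three documents.
D = [
    Document("A.com", ["the", "quick", "brown", "fox"]),
    Document("B.com", ["lazy", "bird", "sings"]),
    Document("C.com", ["the", "lazy", "bird", "worm"]),
]

visited = []
for d in D:        # each document in the list D, one by one
    for w in d:    # each word in the document d
        visited.append((w, d.id))  # the word and the id of its document

print(visited[:3])
```

Each pair in `visited` is one step of the walk-through: a word together with the ID that would be entered against it.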
For each word, we call another function, lookup, also on the same data structure, index, that we are creating, to check whether this word w is already in the index. Moreover, if the word w is in the index, then we assume that the lookup function will return i, which gives us the position in the list where w is actually present, so we can access w directly. For example, if we're looking at the word "quick", then it will return i = 6 (counting zero, one, two, three, four, five, six), so that we can directly access this particular part of the index structure. If w is not in the index structure, then we assume that this function lookup returns a negative number.

If that is the case, then we have to add w to the index afresh, just like we added "worm" at the end. In that case, the function add, which does this, will return the position where w was added; so, for example, here it would return the position eight. Once we have the position eight, we can append to the list against "worm" the ID of the document where the word w was found, which is d.id. Note that we don't append d directly, because d itself is a list of words. We don't want to append the entire document here; we want to append only the name of the document, that is, the ID of the web page or the URL of the document. Going back, there's an else part, of course: if we did find w in the first place, we would simply append the ID at that position i, without having to add a fresh entry into the index.

So that's simple. If we're able to write all these functions, lookup, add, and append, then we can go through this list of documents, where each document is a list of words, and create this index structure very simply. We'll probably do some such assignment later on in the course. But at that time we will do it on many machines using parallel computing, the way it's actually done at Google and other large search engines.
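The program described above might be sketched as follows. This is one possible reading, not the course's own code: the method name `create`, the `entries` layout (a list of `[word, [ids]]` pairs), the `Document` helper, and the sample word lists are all assumptions. Because the documents are made up, the positions returned will not match the six and eight mentioned in the lecture.

```python
class Document(list):
    """A list of words that also carries an id (hypothetical helper)."""

    def __init__(self, doc_id, words):
        super().__init__(words)
        self.id = doc_id


class Index:
    def __init__(self):
        # Assumed layout: each entry is [word, [ids of documents containing it]].
        self.entries = []

    def lookup(self, w):
        """Return the position of w in the index, or a negative number if absent."""
        for i, entry in enumerate(self.entries):
            if entry[0] == w:
                return i
        return -1

    def add(self, w):
        """Add w as a fresh entry and return the position where it was added."""
        self.entries.append([w, []])
        return len(self.entries) - 1

    def append(self, i, doc_id):
        """Append a document id to the list held at position i."""
        self.entries[i][1].append(doc_id)

    def create(self, D):
        """Populate the index from D, a list of documents (each a list of words)."""
        for d in D:
            for w in d:
                i = self.lookup(w)
                if i < 0:            # w is not in the index yet
                    i = self.add(w)  # add it and remember its position
                self.append(i, d.id)  # append only the id, never d itself


# Made-up contents standing in for the three documents.
D = [
    Document("A.com", ["the", "quick", "brown", "fox"]),
    Document("B.com", ["lazy", "bird", "sings"]),
    Document("C.com", ["the", "lazy", "bird", "worm"]),
]

idx = Index()
idx.create(D)
for word, ids in idx.entries:
    print(word, ids)
```

Note that `lookup` scans the whole list for every word, which keeps the sketch close to the description but is slow for large collections; a dictionary keyed by word would be the usual Python alternative. A word that occurs twice in one document would also have that document's ID appended twice, since nothing here checks for duplicates.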