Let's return to technology now and ask about search in a different, private context, such as searching one's own desktop, one's own email, and other private data sets. Clearly, indexing the way we described it at the very beginning of this course will work fine. But what about relevance? Normally we don't have links between different documents on our desktop, or hyperlinked emails, so we can't directly [inaudible] track, and we need to use other associations. For example, we need to link documents that talk about the same people or the same places. Or we might use relevance feedback, tracking our own behavior to see which documents we actually use in response to a bunch of search results, very similar to how PageRank is being improved by our own use of search every day.

But there are even more problems with private data. Most of the time, each document has multiple versions: different formats of the same document, like PowerPoint and PDF, and many versions of the same document as it undergoes editing. So detecting duplicates and handling them appropriately is very important. Lastly, is search the only paradigm for finding stuff? This takes us to areas such as topic mining, activity mining, and contextual suggestions. We'll return to some of these advanced topics very soon.

But before that, let's make things even more difficult and talk about databases, which are used in large enterprises, and enterprise search using such databases as well as a lot of unstructured, textual data. Enterprise search poses all the challenges of private search that we discussed on the previous chart, and more. For example, the results of a search could depend on the context in which somebody is performing that search. And people play multiple roles in an organization: sometimes I'm acting as a researcher, sometimes as a teacher, sometimes as an executive, and so on.

Next, how do you classify large sets of documents? Each one of us faces challenges classifying our own documents on our desktops. The problem becomes even more complicated when you have to classify documents used by hundreds or thousands of people. What kind of classification works? Should it be done manually, by a central team? Or can it be done automatically? Can you have many different classifications, depending on how you want to view a whole bunch of documents?

What about security? Not everybody is allowed to access every document, or every piece of data, in an organization. Some things are secret, and some highly secret.

And lastly, what about structured data, the kind that's found in databases? Unfortunately, SQL is not the answer. For example, text inside structured records is not easily searched using SQL, as we'll explain shortly. Next, linking unstructured documents to structured records is also important, and not easily possible. Finally, just searching structured records and getting a list of related records grouped together as objects is a huge challenge, which has simply not been satisfactorily resolved yet, and that's what we'll talk about in our next example.
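To make the point about SQL and text a little more concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name `tickets`, its columns, the sample rows, and the helper functions `like_search` and `keyword_search` are all hypothetical, made up for illustration. It shows that plain SQL pattern matching finds only literal substrings: a phrase query misses a record that uses the same words in a different order, and there is no notion of ranking by relevance.

```python
import sqlite3

def like_search(conn, phrase):
    """Return ids of rows whose notes contain `phrase` as a literal substring."""
    rows = conn.execute(
        "SELECT id FROM tickets WHERE notes LIKE ? ORDER BY id",
        (f"%{phrase}%",),
    ).fetchall()
    return [r[0] for r in rows]

def keyword_search(conn, words):
    """Naive workaround: require every word to appear somewhere in notes.

    Assumes `words` is non-empty and contains no LIKE wildcards (% or _).
    """
    clause = " AND ".join("notes LIKE ?" for _ in words)
    rows = conn.execute(
        f"SELECT id FROM tickets WHERE {clause} ORDER BY id",
        [f"%{w}%" for w in words],
    ).fetchall()
    return [r[0] for r in rows]

# Hypothetical structured records with a free-text column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tickets (id INTEGER PRIMARY KEY, customer TEXT, notes TEXT)"
)
conn.executemany(
    "INSERT INTO tickets VALUES (?, ?, ?)",
    [
        (1, "Acme", "Printer driver crash after update"),
        (2, "Globex", "Crash in the driver for the new printer"),
        (3, "Initech", "Invoice printed twice"),
    ],
)

print(like_search(conn, "printer driver"))          # phrase as a substring
print(keyword_search(conn, ["printer", "driver"]))  # all words, any order
```

With these sample rows, the phrase query finds only ticket 1, while the per-word query also finds ticket 2. Even then, every matching row is returned unranked, and each `LIKE '%...%'` condition generally forces a scan of the text column, which is part of why a text index is usually kept alongside the database.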
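Going back to the earlier point about multiple versions of the same document, one common way to detect near-duplicates is to compare overlapping word sequences ("shingles") using Jaccard similarity. This is a sketch of that general technique, not necessarily the method used in this course; the file names, the sample text, the shingle size `k`, and the threshold are all assumptions chosen for illustration.

```python
import re

def shingles(text, k=3):
    """Return the set of k-word shingles of `text` (lowercased, punctuation dropped)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def near_duplicates(docs, threshold=0.4, k=3):
    """Return pairs of document names whose shingle sets overlap at least `threshold`."""
    names = sorted(docs)
    sets = {name: shingles(docs[name], k) for name in names}
    pairs = []
    for i, x in enumerate(names):
        for y in names[i + 1:]:
            if jaccard(sets[x], sets[y]) >= threshold:
                pairs.append((x, y))
    return pairs

# Hypothetical desktop files: two versions of one report, plus an unrelated memo.
docs = {
    "report_v1.txt": "Quarterly sales grew in the northern region this year",
    "report_v2.txt": "Quarterly sales grew strongly in the northern region this year",
    "memo.txt": "The office will be closed on Friday for maintenance",
}

print(near_duplicates(docs))
```

Here the two report versions share 5 of their 10 distinct shingles, so they are flagged as a pair, while the memo shares none with either. Comparing every pair of documents is quadratic in the number of documents, so this brute-force form is only reasonable for small collections.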