When we pick up a newspaper, we are looking for the maximum information content. So, naturally, more surprising events make for better news. And in passing, as you read the news, you and I glance at some advertisements, and the paper makes some money. Of course, the questions to ask are when to place an ad and where to place it: which newspapers, against which stories? Unfortunately, print newspapers just don't have that freedom. Sometimes the interesting news is on the sports page, whereas your most expensive ad is right up front.
To understand how a better advertising model, in particular online advertising, is related to information theory, let's first see what Shannon did with his concept of information being related to surprise. Shannon was concerned with communication along a noisy channel, like a telephone line or radio waves over the air. Today he'd be concerned about communicating on the internet, but of course the internet wasn't there in his time. The model he used had a transmitted signal at one end of the channel, which could be a sequence of messages or, today, a sequence of bits, received at the other end as another sequence of messages. There's an important concept that he defined, called the mutual information between the transmitted signal and the received signal. And the purpose of the channel, these days the cell phone network or the internet, is to maximize the mutual information between what is transmitted and what is received. After all, you really want to hear what your friend is saying at the other end of the line.
Now let's model the advertising problem in the language of communication along a noisy channel. In the advertising context, we approach the newspaper or web pages with some intent, and with some degree of attention and interest. And the advertiser would like to measure which pages we are interested in and when we have an intention to buy.
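To make the channel picture a little more concrete, here is a minimal sketch in Python. It assumes the simplest possible setup, which is not something the lecture specifies: the transmitted signal is a single bit (say, "intends to buy" or not), and the channel flips that bit with some probability before it is observed. The function names `mutual_information` and `binary_symmetric_joint`, and the two flip probabilities, are made up for illustration.

```python
import math

def mutual_information(joint):
    """Mutual information I(X;Y) in bits from a joint probability table.

    joint[i][j] is P(X = i, Y = j); the entries are assumed to sum to 1.
    """
    px = [sum(row) for row in joint]          # marginal distribution of X
    py = [sum(col) for col in zip(*joint)]    # marginal distribution of Y
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:                         # a zero-probability cell contributes nothing
                mi += p * math.log2(p / (px[i] * py[j]))
    return mi

def binary_symmetric_joint(flip_prob, p_one=0.5):
    """Joint table for a bit X observed as Y after being flipped with flip_prob."""
    p_zero = 1.0 - p_one
    return [
        [p_zero * (1 - flip_prob), p_zero * flip_prob],
        [p_one * flip_prob, p_one * (1 - flip_prob)],
    ]

# Illustrative noise levels only; they are not measurements of any real channel.
clean = mutual_information(binary_symmetric_joint(0.05))
noisy = mutual_information(binary_symmetric_joint(0.45))
print(f"low-noise channel:  {clean:.3f} bits")
print(f"high-noise channel: {noisy:.3f} bits")
```

With no flips at all the received bit tells you everything about the transmitted one (1 bit of mutual information), and with a flip probability of one half it tells you nothing (0 bits). The noisier the observation, the less the receiver learns about what was sent.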
Unfortunately, for traditional print or media advertising, the only signals available to an advertiser are the actual sales transactions at the end of a quarter and the advertising money spent, and an after-the-fact correlation of the two is all that one is really able to do. This is highly error-prone, and so the mutual information between our actual intent and attention and these lagging measures, sales transactions or advertising spend that come much later, is very low. In other words, it's very difficult to predict which paper to advertise in, and on which page, based simply on these signals.
The online world, of course, changes everything, because the signals we have are the clicks made by users, the queries they make when they search, and the actual content that they are reading. The mutual information is therefore much higher. Let's see how.
A brief overview of online advertising first. In the case of search advertising, advertisers bid for keywords in Google's online keyword auction, which runs continuously. The highest bidder's ads are placed against the matching searches, so if I bid for shoes and you searched for shoes and mine was the highest bid, then my ad would come up against your query, and so on. The actual auction mechanism is a bit more complex than that, but we won't go into that right now.
The result of this advertising model is that the mutual information between the advertising dollars spent and the sales that they translate into is improved considerably compared with traditional media, and this is fundamentally responsible for the success of online advertising.
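As a rough illustration of the "highest bid wins the matching search" idea described above, here is a small Python sketch. It is deliberately the simplified rule from the lecture, not the actual auction mechanism, which the lecture notes is more complex. The function name `pick_ad`, the advertiser names, and the bid amounts are all hypothetical.

```python
def pick_ad(query, bids):
    """Return the advertiser with the highest bid on any keyword in the query.

    bids maps keyword -> {advertiser: bid amount}.
    Returns None when no keyword in the query has a bid.
    """
    words = set(query.lower().split())
    best = None  # (amount, advertiser) of the highest matching bid seen so far
    for keyword, offers in bids.items():
        if keyword in words:
            for advertiser, amount in offers.items():
                if best is None or amount > best[0]:
                    best = (amount, advertiser)
    return None if best is None else best[1]

# Hypothetical bids, invented purely for illustration.
bids = {
    "shoes": {"advertiser_a": 1.50, "advertiser_b": 0.90},
    "laptops": {"advertiser_c": 2.10},
}

print(pick_ad("running shoes", bids))     # advertiser_a
print(pick_ad("garden furniture", bids))  # None
```

The point of the sketch is only that the ad is chosen at the moment of the query, using the query itself as the signal, rather than being fixed in advance on a particular page.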