1 00:00:00,000 --> 00:00:06,009 When we pick up a newspaper, we are looking for the maximum information 2 00:00:06,009 --> 00:00:10,031 content. So, naturally, more surprising events make 3 00:00:10,031 --> 00:00:15,012 for better news. And in passing, as you read the news, you 4 00:00:15,012 --> 00:00:20,087 and I glance at some advertisements, and the paper makes some money. 5 00:00:21,062 --> 00:00:29,058 Of course, the questions the newspaper needs to ask are when to place an ad and 6 00:00:29,058 --> 00:00:35,079 where to place an ad: which newspapers, against which stories? 7 00:00:35,079 --> 00:00:41,850 Unfortunately, print newspapers just don't have that freedom. 8 00:00:41,850 --> 00:00:49,332 Sometimes the interesting news is on the sports page, whereas your most expensive 9 00:00:49,332 --> 00:00:54,496 ad is right up front. To understand how a better advertising 10 00:00:54,496 --> 00:01:02,210 model, in particular online advertising, is related to information theory, let's 11 00:01:02,210 --> 00:01:09,323 first see what Shannon did with his concept of information being related to 12 00:01:09,323 --> 00:01:14,671 surprise. Shannon was concerned with communication 13 00:01:14,671 --> 00:01:22,287 along a noisy channel, like a telephone line or radio waves through the air. 14 00:01:22,287 --> 00:01:27,530 Today he'd be concerned about communicating on the internet, 15 00:01:27,530 --> 00:01:32,358 but of course the internet wasn't there in his time. 16 00:01:32,358 --> 00:01:41,323 The model that he used had a transmitted signal at one end of the channel. 17 00:01:41,323 --> 00:01:49,645 It could be a sequence of messages or, today, a sequence of bits, received on the 18 00:01:49,645 --> 00:01:54,959 other side as another sequence of messages. 19 00:01:54,959 --> 00:02:02,993 There's an important concept that he defined called the mutual information 20 00:02:02,993 --> 00:02:07,427 between the transmitted signal and the received signal.
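To make the idea of mutual information between a transmitted and a received signal concrete, here is a minimal Python sketch. It is not from the lecture: it assumes a toy channel that sends one uniformly random bit and flips it with some probability, and the names `mutual_information` and `binary_channel` are made up for illustration. It uses the standard definition I(X;Y) = sum over x, y of p(x,y) log2( p(x,y) / (p(x) p(y)) ).

```python
import math


def mutual_information(joint):
    """Mutual information I(X;Y) in bits, given a joint table joint[x][y]."""
    # Marginal distributions of the transmitted (rows) and received (columns) signal.
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:  # terms with zero probability contribute nothing
                mi += p * math.log2(p / (px[i] * py[j]))
    return mi


def binary_channel(flip_prob):
    """Joint distribution of (sent bit, received bit) for a uniform input bit
    that the channel flips with probability flip_prob (an assumed toy model)."""
    keep = 0.5 * (1 - flip_prob)
    flip = 0.5 * flip_prob
    return [[keep, flip], [flip, keep]]


clean = mutual_information(binary_channel(0.0))    # noiseless channel
slight = mutual_information(binary_channel(0.1))   # mildly noisy channel
useless = mutual_information(binary_channel(0.5))  # output independent of input
```

Under these assumptions the noiseless channel carries one full bit per transmitted bit, the channel that flips half the time carries nothing, and the mildly noisy channel falls somewhere in between.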
21 00:02:07,427 --> 00:02:14,497 And the purpose of the channel, these days the cell phone network or the internet, is 22 00:02:14,497 --> 00:02:21,494 to maximize the mutual information between what is transmitted and what is received. 23 00:02:21,494 --> 00:02:28,345 So obviously, you really want to hear what your friend is saying at the other end of 24 00:02:28,345 --> 00:02:34,316 the line. Now let's model the advertising problem in 25 00:02:34,316 --> 00:02:39,899 the language of communication along a noisy channel. 26 00:02:39,899 --> 00:02:49,093 In the advertising context, we approach the newspaper or web pages with some intent 27 00:02:49,093 --> 00:02:58,050 and some degree of attention and interest. And the advertiser would like to measure 28 00:02:59,006 --> 00:03:05,036 which pages we are interested in and when we have an intention to buy. 29 00:03:05,074 --> 00:03:14,039 Unfortunately, for normal print or media advertising, the only signals available to 30 00:03:14,039 --> 00:03:22,030 an advertiser are the actual sales transactions at the end of a quarter and 31 00:03:22,030 --> 00:03:30,736 the advertising budget that one spends, and after-the-fact correlation of the two is all 32 00:03:30,736 --> 00:03:42,180 that one is really able to do. This is highly error-prone, and so the mutual 33 00:03:42,180 --> 00:03:51,129 information between our actual intent and attention and the lag measures, which come 34 00:03:51,129 --> 00:04:00,487 much later, of sales transactions or advertising spend incurred is very low. 35 00:04:00,487 --> 00:04:09,292 In other words, it's very difficult to predict which paper to advertise in 36 00:04:09,292 --> 00:04:16,336 and on which page to advertise simply based on these signals. 37 00:04:16,336 --> 00:04:20,970 The online world of course changes everything.
38 00:04:20,970 --> 00:04:27,348 That's because the signals we have are the clicks made by users, the queries made when they 39 00:04:27,348 --> 00:04:31,887 search, and the actual content that they are reading. 40 00:04:31,887 --> 00:04:36,745 The mutual information therefore is much higher. 41 00:04:36,745 --> 00:04:43,517 Let's see how. A brief overview of online advertising 42 00:04:43,517 --> 00:04:48,219 first. In the case of search advertising, 43 00:04:48,219 --> 00:04:55,378 advertisers bid for keywords in Google's online keyword auction, which is happening 44 00:04:55,378 --> 00:05:02,021 continuously, all the time. The highest bidder's ads are placed against 45 00:05:02,021 --> 00:05:08,076 the matching searches, so if I bid for shoes and you searched for shoes and mine 46 00:05:08,076 --> 00:05:14,082 was the highest bid, then my ad would come up against your query, and so on. 47 00:05:14,082 --> 00:05:21,053 The actual auction mechanism is a bit more complex than that, but we won't go into 48 00:05:21,053 --> 00:05:27,090 that right now. The result of this advertising model is 49 00:05:27,090 --> 00:05:35,047 that the mutual information between the advertising dollars spent and the sales 50 00:05:35,047 --> 00:05:42,075 that they translate into is improved considerably over traditional media, and 51 00:05:42,075 --> 00:05:49,010 this is fundamentally responsible for the success of online advertising.
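The simplified rule described above, where the highest bid on a keyword matching the search wins the ad slot, can be sketched in a few lines of Python. This is only an illustration of that simplification, not the real auction, which the lecture notes is more complex. The function `place_ad`, the advertiser names, and the bid amounts are all hypothetical.

```python
def place_ad(bids, query):
    """Return the advertiser whose bid is highest among keywords that appear
    in the query, or None if no keyword matches.

    bids is a list of (advertiser, keyword, amount) tuples (assumed format).
    """
    words = set(query.lower().split())
    # Keep only the bids whose keyword occurs as a word in the search query.
    matching = [bid for bid in bids if bid[1].lower() in words]
    if not matching:
        return None
    # Simplified rule: the single highest bid wins the placement.
    winner = max(matching, key=lambda bid: bid[2])
    return winner[0]


# Hypothetical bids, made up for illustration.
bids = [
    ("me", "shoes", 1.50),
    ("rival", "shoes", 1.20),
    ("hatter", "hats", 3.00),
]

shoes_winner = place_ad(bids, "shoes")
hats_winner = place_ad(bids, "red hats")
no_winner = place_ad(bids, "socks")
```

Here a search for "shoes" would show the ad from "me", since that bid is the highest on the matching keyword, while a search with no matching keyword shows no ad at all. The point of the sketch is that the query itself is the signal used to place the ad, which is what raises the mutual information compared with print.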