1 00:00:00,078 --> 00:00:04,444 [MUSIC] 2 00:00:04,444 --> 00:00:07,334 There are many systems for restaurant reviews out there but 3 00:00:07,334 --> 00:00:11,118 today we're gonna talk about a pretty new and exciting one that we can build. 4 00:00:11,118 --> 00:00:15,960 So it's an important day for 5 00:00:15,960 --> 00:00:21,100 us and we're trying to book a Japanese restaurant, and 6 00:00:21,100 --> 00:00:26,120 we love Japanese food and want a really cool place to have some Japanese food. 7 00:00:26,120 --> 00:00:29,670 But this is Seattle that we live in, and Seattle is really awesome for 8 00:00:29,670 --> 00:00:33,526 Japanese food, so there are many, many places for sushi, and in fact, 9 00:00:33,526 --> 00:00:38,500 there are many that have really good ratings, say four stars. 10 00:00:38,500 --> 00:00:42,620 So perhaps when I'm thinking about a restaurant to go to 11 00:00:42,620 --> 00:00:46,150 I'm not just thinking about the overall stars of the restaurant, but 12 00:00:46,150 --> 00:00:50,440 I'm thinking about how is it in terms of the food, in terms of the ambiance? 13 00:00:50,440 --> 00:00:53,090 In particular how is it in terms of the sushi, 14 00:00:53,090 --> 00:00:54,580 which is the thing that I really care about. 15 00:00:54,580 --> 00:00:58,863 I really love sushi and I want really, really the best, freshest fish, 16 00:00:58,863 --> 00:01:01,290 and the most awesome, innovative sushi I can get. 17 00:01:03,370 --> 00:01:07,360 So when you talk about a positive review for a restaurant, and here's a restaurant 18 00:01:07,360 --> 00:01:12,670 that we love in Seattle, a sample review might go something like this. 19 00:01:14,640 --> 00:01:17,600 So watching the chefs create an incredible, 20 00:01:17,600 --> 00:01:20,810 edible piece of art, was really a unique experience. 21 00:01:20,810 --> 00:01:26,060 So the experience is really positive if we think about that aspect of the restaurant. 
22 00:01:26,060 --> 00:01:30,040 But a review might want to say, well, my wife tried the ramen, and 23 00:01:30,040 --> 00:01:31,810 it was pretty forgettable. 24 00:01:31,810 --> 00:01:33,990 That means the ramen, not so good. 25 00:01:35,560 --> 00:01:38,740 Now, the review continues and says, 26 00:01:38,740 --> 00:01:43,760 all the sushi was delicious, easily the best sushi in Seattle. 27 00:01:43,760 --> 00:01:47,960 So the sushi was really two thumbs up. 28 00:01:47,960 --> 00:01:49,780 Now, I don't care about the ramen. 29 00:01:49,780 --> 00:01:51,410 I'm not going to this place for ramen. 30 00:01:51,410 --> 00:01:53,460 I don't care that this person really didn't like the ramen, 31 00:01:53,460 --> 00:01:55,120 or his wife really didn't like the ramen. 32 00:01:55,120 --> 00:01:59,040 What I really care about is good experience and amazing sushi. 33 00:01:59,040 --> 00:02:00,270 So this review is just for me. 34 00:02:01,780 --> 00:02:04,080 So when I'm thinking about restaurant reviews, 35 00:02:04,080 --> 00:02:08,100 I wanna understand the aspects of the restaurant, whether they're positive or 36 00:02:08,100 --> 00:02:13,270 negative, and really think about which ones of those really affect my interests. 37 00:02:13,270 --> 00:02:18,010 So if I look at all the restaurant reviews, I want to feed them to this new, 38 00:02:18,010 --> 00:02:22,440 really exciting, new type of restaurant recommendation system that's gonna tell me 39 00:02:22,440 --> 00:02:25,730 the experience was good, four stars. 40 00:02:25,730 --> 00:02:29,790 The ramen was a so-so, but who cares? 41 00:02:29,790 --> 00:02:34,560 But the sushi, the thing that I really care about, that was five stars, and 42 00:02:34,560 --> 00:02:37,860 not just that, it's gonna give me some interesting feedback, so for 43 00:02:37,860 --> 00:02:42,530 example, this is easily the best sushi in Seattle. 44 00:02:42,530 --> 00:02:43,440 Now that's where I wanna go. 
45 00:02:44,790 --> 00:02:48,890 So how do we build such an intelligent restaurant review system? 46 00:02:48,890 --> 00:02:51,280 Well we're going to start with all the reviews and 47 00:02:51,280 --> 00:02:53,580 we're going to break them up into sentences. 48 00:02:53,580 --> 00:02:56,600 So each review is composed of multiple sentences and 49 00:02:56,600 --> 00:03:00,510 some sentences cover different aspects of the restaurant. 50 00:03:00,510 --> 00:03:05,240 So for example, a sentence that says easily the best sushi in Seattle, 51 00:03:05,240 --> 00:03:08,520 I wanna feed that sentence through a sentiment classifier. 52 00:03:08,520 --> 00:03:12,286 So the sentiment classifier is gonna say is this sentence positive, or 53 00:03:12,286 --> 00:03:13,814 is it a negative sentence? 54 00:03:15,717 --> 00:03:19,309 Now we have a sentiment classifier, which can take a sentence and 55 00:03:19,309 --> 00:03:21,450 say is it positive or negative? 56 00:03:21,450 --> 00:03:25,088 So how do we build this new kind of cool restaurant review experience? 57 00:03:25,088 --> 00:03:27,600 So we're gonna take all the reviews and 58 00:03:27,600 --> 00:03:29,980 break them into sentences, as we discussed, but 59 00:03:29,980 --> 00:03:33,970 then we're gonna select the sentences that talk about the aspect I care about. 60 00:03:33,970 --> 00:03:36,600 So which ones of these are really about sushi? 61 00:03:36,600 --> 00:03:38,538 So not all of them, but a subset of those. 62 00:03:38,538 --> 00:03:42,610 Then I feed those sentences through the sentiment classifier, 63 00:03:42,610 --> 00:03:46,250 each one of them, and then I average the results. 64 00:03:46,250 --> 00:03:50,330 In this case all of the sentences were positive and 65 00:03:50,330 --> 00:03:56,390 so I'm gonna say this is a five star restaurant review in terms of the sushi. 
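The steps just described (break reviews into sentences, keep the ones about sushi, classify each, average) can be sketched roughly as follows. This is a minimal illustration and not the system from the lecture: the word lists, the keyword-based aspect match, the function names, and the mapping from the average to a 1-to-5 star scale are all assumptions, and the toy `is_positive` rule merely stands in for a trained sentiment classifier.

```python
import re

# Hypothetical word lists standing in for a trained sentiment classifier.
POSITIVE_WORDS = {"best", "delicious", "amazing", "fresh", "incredible"}
NEGATIVE_WORDS = {"forgettable", "bland", "worst", "stale"}


def split_sentences(review):
    """Break a review into sentences at '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", review)
    return [p.strip() for p in parts if p.strip()]


def mentions_aspect(sentence, aspect):
    """Assumed aspect selection: a simple case-insensitive keyword match."""
    return aspect.lower() in sentence.lower()


def is_positive(sentence):
    """Toy classifier: positive if positive words outnumber negative ones."""
    words = re.findall(r"[a-z']+", sentence.lower())
    pos = sum(w in POSITIVE_WORDS for w in words)
    neg = sum(w in NEGATIVE_WORDS for w in words)
    return pos > neg


def aspect_stars(reviews, aspect):
    """Average the per-sentence predictions for one aspect and map to stars.

    Returns None when no sentence mentions the aspect. The linear mapping
    (all negative = 1 star, all positive = 5 stars) is an assumption.
    """
    sentences = [
        s
        for review in reviews
        for s in split_sentences(review)
        if mentions_aspect(s, aspect)
    ]
    if not sentences:
        return None
    fraction_positive = sum(is_positive(s) for s in sentences) / len(sentences)
    return 1 + 4 * fraction_positive


reviews = [
    "Watching the chefs create an incredible, edible piece of art was "
    "really a unique experience. My wife tried the ramen, and it was "
    "pretty forgettable. All the sushi was delicious, easily the best "
    "sushi in Seattle."
]
print(aspect_stars(reviews, "sushi"))
```

With the sample review from earlier, the only sushi sentence is classified as positive, so the sushi aspect averages out to the top of the scale, while the ramen sentence lands at the bottom.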
66 00:03:56,390 --> 00:04:00,720 Not just that, I can look at the sentences about sushi and 67 00:04:00,720 --> 00:04:02,509 look at which ones are most positive. 68 00:04:03,570 --> 00:04:08,440 So for example, this one might be positive, but easily the best 69 00:04:08,440 --> 00:04:12,820 restaurant in Seattle maybe is the most positive, and I can display those. 70 00:04:12,820 --> 00:04:15,470 I can also display the most negative ones. 71 00:04:15,470 --> 00:04:19,488 In this case, there's nothing negative to say about the sushi. 72 00:04:19,488 --> 00:04:23,239 [MUSIC]
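Picking out the most positive and most negative sentences about an aspect can be sketched as a simple ranking by sentiment score. Again this is only an illustration under stated assumptions: `sentiment_score` is a hypothetical word-count stand-in for the confidence a real classifier would output, and the word lists and function names are made up for the example.

```python
import re

# Hypothetical word lists; a real system would use a trained classifier's score.
POSITIVE_WORDS = {"best", "delicious", "amazing", "fresh", "incredible"}
NEGATIVE_WORDS = {"forgettable", "bland", "worst", "stale"}


def sentiment_score(sentence):
    """Toy score: count of positive words minus count of negative words."""
    words = re.findall(r"[a-z']+", sentence.lower())
    pos = sum(w in POSITIVE_WORDS for w in words)
    neg = sum(w in NEGATIVE_WORDS for w in words)
    return pos - neg


def most_positive(sentences, k=1):
    """Return up to k sentences with the highest scores above zero."""
    scored = [(sentiment_score(s), s) for s in sentences]
    positives = [pair for pair in scored if pair[0] > 0]
    positives.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in positives[:k]]


def most_negative(sentences, k=1):
    """Return up to k sentences with the lowest scores below zero."""
    scored = [(sentiment_score(s), s) for s in sentences]
    negatives = [pair for pair in scored if pair[0] < 0]
    negatives.sort(key=lambda pair: pair[0])
    return [s for _, s in negatives[:k]]


sushi_sentences = [
    "The sushi was fresh.",
    "All the sushi was delicious, easily the best sushi in Seattle.",
]
print(most_positive(sushi_sentences))
print(most_negative(sushi_sentences))
```

Filtering by the sign of the score before ranking means that when nothing negative was said about the aspect, as with the sushi here, the negative list simply comes back empty.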