1 00:00:00,383 --> 00:00:04,463 [MUSIC] 2 00:00:04,463 --> 00:00:05,358 Now, so far, 3 00:00:05,358 --> 00:00:11,030 we've really just talked about how do you build a machine learning model using data? 4 00:00:11,030 --> 00:00:12,850 How do you measure its quality? 5 00:00:12,850 --> 00:00:14,950 And how do you understand whether it's working or not? 6 00:00:14,950 --> 00:00:17,140 But that's not enough to build an intelligent application. 7 00:00:17,140 --> 00:00:20,704 When you build an intelligent application, you want to take all that you learn and 8 00:00:20,704 --> 00:00:22,690 then really put it in front of, 9 00:00:22,690 --> 00:00:26,960 say, the customers of your company who are going to your website to buy a product. 10 00:00:26,960 --> 00:00:29,130 So what does the process actually look like? 11 00:00:29,130 --> 00:00:32,285 There are many different aspects associated with it. 12 00:00:32,285 --> 00:00:35,505 This is what we call model deployment, where we're taking the model and 13 00:00:35,505 --> 00:00:38,605 we allow it to serve predictions in real time. 14 00:00:38,605 --> 00:00:40,085 We'll talk more about that. 15 00:00:40,085 --> 00:00:44,782 Then we'll talk about how do we evaluate that model to make sure what we did 16 00:00:44,782 --> 00:00:49,222 offline as we trained the model is still good in the long run as we're using it. 17 00:00:49,222 --> 00:00:52,302 Then we have to think about all the management pieces. 18 00:00:52,302 --> 00:00:56,002 How do we make sure the model is still good, how do we replace it when the model 19 00:00:56,002 --> 00:00:59,146 improves, and how do we react to measurements that we make, 20 00:00:59,146 --> 00:01:03,056 which are actually the metrics that we monitor over time to try to understand 21 00:01:03,056 --> 00:01:06,556 whether our models are still good or it's time to relearn them from data. 22 00:01:06,556 --> 00:01:09,326 Now all these pieces are really coming together.
23 00:01:09,326 --> 00:01:14,216 They're not just individual, separated ones, but interacting pieces that we need 24 00:01:14,216 --> 00:01:18,796 to understand deeply and create a cycle of improvement of the models that we have. 25 00:01:18,796 --> 00:01:19,976 So let's start with the first one. 26 00:01:19,976 --> 00:01:22,210 Let's start with the deployment piece. 27 00:01:22,210 --> 00:01:24,030 Now consider the following setup. 28 00:01:24,030 --> 00:01:27,600 You're building a really cool new kind of product recommending system, and 29 00:01:27,600 --> 00:01:31,810 you're using a bunch of data with millions of product reviews and users. 30 00:01:31,810 --> 00:01:35,915 And you want to take the model that you learned on your laptop, on your desktop, or 31 00:01:35,915 --> 00:01:37,785 in a big cluster of machines, 32 00:01:37,785 --> 00:01:41,585 and deploy it on the website that really interacts with your users. 33 00:01:41,585 --> 00:01:42,965 What does that look like? 34 00:01:42,965 --> 00:01:46,233 Well, that whole system starts with some historical data. 35 00:01:46,233 --> 00:01:50,035 Some data that you've collected about the users of your system, 36 00:01:50,035 --> 00:01:52,205 the reviews they've written, what preferences they have. 37 00:01:52,205 --> 00:01:57,010 And we take that data and we feed it to train a model like we saw in 38 00:01:57,010 --> 00:02:02,710 the recommender systems module that Emily taught. That process 39 00:02:02,710 --> 00:02:07,020 with historical data is usually done in what's called a batch offline setting. 40 00:02:07,020 --> 00:02:10,820 You learn a model, say on your desktop machine or in a cluster, and 41 00:02:10,820 --> 00:02:14,340 then take the model and deploy it, for example, in the cloud, so 42 00:02:14,340 --> 00:02:16,970 that you can do what's called serving predictions. 43 00:02:16,970 --> 00:02:20,570 This is the online, real-time part of the system.
44 00:02:20,570 --> 00:02:23,650 So, for example, if you have a website, if this is what we're serving for, 45 00:02:23,650 --> 00:02:26,610 the website is going to give me information about my users, 46 00:02:26,610 --> 00:02:28,870 what they're doing right now, what pages they looked at, 47 00:02:28,870 --> 00:02:30,920 what products they're thinking about buying. 48 00:02:30,920 --> 00:02:34,820 And then from that, we're going to serve predictions in real time to say, oh, 49 00:02:34,820 --> 00:02:37,700 check out this giraffe chew toy. 50 00:02:37,700 --> 00:02:38,790 Remember that? 51 00:02:38,790 --> 00:02:42,600 Check that out, as this might be something that you want to buy right now. 52 00:02:42,600 --> 00:02:48,250 Now the user sees that giraffe offering and they may or may not buy the giraffe. 53 00:02:48,250 --> 00:02:50,740 That's the kind of feedback we're going to get back from the system. 54 00:02:50,740 --> 00:02:52,930 Did the user buy it or did they not buy it? 55 00:02:52,930 --> 00:02:55,660 That's going to influence both our real-time decisions 56 00:02:55,660 --> 00:02:59,940 and go back to the historical data and really be collected in the long run so 57 00:02:59,940 --> 00:03:04,410 that we can improve our model as we collect more and more data from the world. 58 00:03:04,410 --> 00:03:09,300 So that's what a holistic deployment system for machine learning might look like. 59 00:03:09,300 --> 00:03:13,010 [MUSIC]
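The loop described in this segment (train offline on historical data, serve predictions online, collect feedback, retrain) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the system used in the course: the function names `train_model` and `serve_prediction`, the tuple layout of the history, and the simple counting "model" are all hypothetical stand-ins for a real recommender and a real serving stack.

```python
from collections import Counter, defaultdict


def train_model(history):
    """Batch, offline step: for each viewed page, count which offered
    products were actually bought. The history rows are assumed to be
    (page_viewed, product_offered, bought) tuples (a hypothetical layout)."""
    model = defaultdict(Counter)
    for viewed, offered, bought in history:
        if bought:
            model[viewed][offered] += 1
    return model


def serve_prediction(model, viewed, default="giraffe chew toy"):
    """Online, real-time step: offer the product most often bought by users
    who viewed the same page, or a default offer if nothing is known."""
    counts = model.get(viewed)
    if not counts:
        return default
    return counts.most_common(1)[0][0]


# Historical data collected from the website (made-up illustrative rows).
history = [
    ("dog bed", "giraffe chew toy", True),
    ("dog bed", "cat scratcher", False),
    ("cat tree", "cat scratcher", True),
]

# 1. Train offline on the historical data.
model = train_model(history)

# 2. Serve a prediction in real time for a user viewing a page.
offer = serve_prediction(model, "dog bed")

# 3. Feedback: the user bought the offer, so record it as new history.
history.append(("dog bed", offer, True))

# 4. Retrain later on the larger history to close the loop.
model = train_model(history)
```

In a real deployment the training step would run on a desktop or cluster and the serving step would run behind the website, for example in the cloud; here both live in one script only to make the data flow visible.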