[MUSIC] Now, so far, we've really just talked about how you build a machine learning model using data, how you measure its quality, and how you understand whether it's working or not. But that's not enough to build an intelligent application. When you build an intelligent application, you want to take everything you've learned and really put it in front of, say, the customers of your company who are going to your website to buy a product. So what does that process actually look like? There are many different aspects associated with it. The first is what we call model deployment, where we take the model and allow it to serve predictions in real time. We'll talk more about that. Then we'll talk about how we evaluate that model, to make sure that what we did offline, as we trained the model, is still good in the long run as we're using it. Then we have to think about all the management pieces: how do we make sure the model is still good, how do we replace it when the model improves, and how do we react to the measurements we make, which are the metrics that we monitor over time to understand whether our models are still good or whether it's time to relearn them from data. Now, all these pieces really come together. They're not just individual, separate parts, but interacting pieces that we need to understand deeply in order to create a cycle of improvement for the models that we have.

So let's start with the first one, the deployment piece. Consider the following setup. You're building a really cool new product recommender system, and you're using a bunch of data with millions of product reviews and users. You want to take the model that you learned on your laptop, your desktop, or a big cluster of machines, and deploy it on the website that actually interacts with your users. What does that look like? Well, the whole system starts with some historical data: data you've collected about the users of your system, the reviews they've written, what preferences they have. We take that data and feed it in to train a model, like we saw in the recommender systems module that Emily taught. That process with historical data is usually done in what's called a batch, offline setting. You learn a model, say on your desktop machine or in a cluster, and then take the model and deploy it, for example, in the cloud, so that you can do what's called serving predictions. This is the online, real-time part of the system. So, for example, if we're serving a website, the website is going to give me information about my users: what they're doing right now, what pages they've looked at, what products they're thinking about buying. And then from that, we're going to serve predictions in real time to say, oh, check out this giraffe chew toy. Remember that? Check it out, as this might be something you want to buy right now.

Now the user sees that giraffe offering, and they may or may not buy the giraffe. That's the kind of feedback we're going to get back from the system: did the user buy it, or did they not buy it? That's going to influence our real-time decisions, and it also goes back into the historical data, collected over the long run, so that we can improve our model as we gather more and more data from the world. So that's what a holistic deployment system for machine learning might look like. [MUSIC]
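To make the batch/online split and the feedback loop concrete, here is a minimal sketch in plain Python. It is purely illustrative: the toy co-purchase "model", the function names (train, recommend, record_feedback, conversion_rate), the in-memory feedback log, and the retraining threshold are all assumptions for this example, not the actual system or tools used in the course.

```python
from collections import Counter, defaultdict

# --- Batch / offline side: train a toy recommender on historical data ---
def train(purchase_history):
    """Count co-purchases: 'people who bought X also bought Y'. (Toy stand-in for a real model.)"""
    co_counts = defaultdict(Counter)
    for products in purchase_history:          # one list of purchased products per user
        for a in products:
            for b in products:
                if a != b:
                    co_counts[a][b] += 1
    return co_counts                           # this object plays the role of the deployed model

# --- Online side: serve predictions in real time from the deployed model ---
def recommend(model, current_product, k=3):
    """Given what the user is looking at right now, return the top-k suggestions."""
    return [item for item, _ in model[current_product].most_common(k)]

# --- Feedback loop: log what the user actually did, for monitoring and future retraining ---
feedback_log = []

def record_feedback(user_id, shown, bought):
    feedback_log.append({"user": user_id, "shown": shown, "bought": bought})

def conversion_rate(log, window=1000):
    """A simple metric to monitor over time: fraction of recent recommendations that led to a purchase."""
    recent = log[-window:]
    return sum(entry["bought"] for entry in recent) / max(len(recent), 1)

# Usage sketch
historical_data = [["giraffe_chew_toy", "dog_leash"],
                   ["giraffe_chew_toy", "dog_bed", "dog_leash"]]
model = train(historical_data)                                 # offline, batch training
suggestions = recommend(model, "giraffe_chew_toy")             # online, real-time serving
record_feedback(user_id=42, shown=suggestions, bought=False)   # feedback flows back into the data

BASELINE = 0.05  # hypothetical conversion rate measured when the model was first deployed
if conversion_rate(feedback_log) < 0.8 * BASELINE:
    print("Conversion dropped; schedule offline retraining on the updated historical data.")
```

The point of the sketch is the shape of the cycle, not the model itself: training happens offline on accumulated history, serving happens online from whatever model is currently deployed, and the logged feedback both drives monitoring metrics and becomes the historical data for the next round of training.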