[MUSIC] Now, so far, we've really just talked about how you build a machine learning model using data, how you measure its quality, and how you understand whether it's working or not. But that's not enough to build an intelligent application. When you build an intelligent application, you want to take everything you've learned and really put it in front of, say, the customers of your company who are going to your website to buy a product. So what does that process actually look like? There are many different aspects associated with it. The first is what we call model deployment, where we take the model and allow it to serve predictions in real time. We'll talk more about that. Then we'll talk about how we evaluate that model, to make sure that what we did offline, as we trained the model, is still good in the long run as we're using it. Then we have to think about all the management pieces: how do we make sure the model is still good, how do we replace it when the model improves, and how do we react to the measurements we make, which are the metrics that we monitor over time to understand whether our models are still good or whether it's time to relearn them from data. Now, all these pieces really come together. They're not just individual, separate parts, but interacting pieces that we need to understand deeply in order to create a cycle of improvement for the models that we have.

So let's start with the first one, the deployment piece. Consider the following setup. You're building a really cool new product recommender system, and you're using a bunch of data with millions of product reviews and users. You want to take the model that you learned on your laptop, your desktop, or a big cluster of machines, and deploy it on the website that actually interacts with your users. What does that look like? Well, the whole system starts with some historical data: data you've collected about the users of your system, the reviews they've written, what preferences they have. We take that data and feed it in to train a model, like we saw in the recommender systems module that Emily taught. That process with historical data is usually done in what's called a batch, offline setting. You learn a model, say on your desktop machine or in a cluster, and then take the model and deploy it, for example, in the cloud, so that you can do what's called serving predictions. This is the online, real-time part of the system. So, for example, if we're serving a website, the website is going to give me information about my users: what they're doing right now, what pages they've looked at, what products they're thinking about buying. And then from that, we're going to serve predictions in real time to say, oh, check out this giraffe chew toy. Remember that? Check it out, as this might be something you want to buy right now.

Now the user sees that giraffe offering, and they may or may not buy the giraffe. That's the kind of feedback we're going to get back from the system: did the user buy it, or did they not buy it? That's going to influence our real-time decisions, and it also goes back into the historical data, collected over the long run, so that we can improve our model as we gather more and more data from the world. So that's what a holistic deployment system for machine learning might look like. [MUSIC]
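To make the batch/online split and the feedback loop concrete, here is a minimal sketch in plain Python. It is purely illustrative: the toy co-purchase "model", the function names (train, recommend, record_feedback, conversion_rate), the in-memory feedback log, and the retraining threshold are all assumptions for this example, not the actual system or tools used in the course.

```python
from collections import Counter, defaultdict

# --- Batch / offline side: train a toy recommender on historical data ---
def train(purchase_history):
    """Count co-purchases: 'people who bought X also bought Y'. (Toy stand-in for a real model.)"""
    co_counts = defaultdict(Counter)
    for products in purchase_history:          # one list of purchased products per user
        for a in products:
            for b in products:
                if a != b:
                    co_counts[a][b] += 1
    return co_counts                           # this object plays the role of the deployed model

# --- Online side: serve predictions in real time from the deployed model ---
def recommend(model, current_product, k=3):
    """Given what the user is looking at right now, return the top-k suggestions."""
    return [item for item, _ in model[current_product].most_common(k)]

# --- Feedback loop: log what the user actually did, for monitoring and future retraining ---
feedback_log = []

def record_feedback(user_id, shown, bought):
    feedback_log.append({"user": user_id, "shown": shown, "bought": bought})

def conversion_rate(log, window=1000):
    """A simple metric to monitor over time: fraction of recent recommendations that led to a purchase."""
    recent = log[-window:]
    return sum(entry["bought"] for entry in recent) / max(len(recent), 1)

# Usage sketch
historical_data = [["giraffe_chew_toy", "dog_leash"],
                   ["giraffe_chew_toy", "dog_bed", "dog_leash"]]
model = train(historical_data)                                 # offline, batch training
suggestions = recommend(model, "giraffe_chew_toy")             # online, real-time serving
record_feedback(user_id=42, shown=suggestions, bought=False)   # feedback flows back into the data

BASELINE = 0.05  # hypothetical conversion rate measured when the model was first deployed
if conversion_rate(feedback_log) < 0.8 * BASELINE:
    print("Conversion dropped; schedule offline retraining on the updated historical data.")
```

The point of the sketch is the shape of the cycle, not the model itself: training happens offline on accumulated history, serving happens online from whatever model is currently deployed, and the logged feedback both drives monitoring metrics and becomes the historical data for the next round of training.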