1
00:00:00,383 --> 00:00:04,463
[MUSIC]

2
00:00:04,463 --> 00:00:05,358
Now, so far,

3
00:00:05,358 --> 00:00:11,030
we've really just talked about how do you
build a machine learning model using data?

4
00:00:11,030 --> 00:00:12,850
How do you measure its quality?

5
00:00:12,850 --> 00:00:14,950
And how do you understand
whether it's working or not?

6
00:00:14,950 --> 00:00:17,140
But that's not enough to build
an intelligent application.

7
00:00:17,140 --> 00:00:20,704
When you build an intelligent application,
you want to take all that you learn and

8
00:00:20,704 --> 00:00:22,690
then really put it in front of,

9
00:00:22,690 --> 00:00:26,960
say the customers of your company who are
going to your website to buy a product.

10
00:00:26,960 --> 00:00:29,130
So what does the process
actually look like?

11
00:00:29,130 --> 00:00:32,285
There are many different
aspects associated with it.

12
00:00:32,285 --> 00:00:35,505
This is what we call model deployment,
where we're taking the model and

13
00:00:35,505 --> 00:00:38,605
we allow it to serve
predictions in real time.

14
00:00:38,605 --> 00:00:40,085
We'll talk more about that.

15
00:00:40,085 --> 00:00:44,782
Then we'll talk about how do we evaluate
that model to make sure what we did and

16
00:00:44,782 --> 00:00:49,222
offline as we train the model is still
good in the long run as we're using it.

17
00:00:49,222 --> 00:00:52,302
Then we have to think about
all the management pieces.

18
00:00:52,302 --> 00:00:56,002
How do make sure the model is still good,
how do we replace it when the model

19
00:00:56,002 --> 00:00:59,146
improves, and how do we react
to measurements that we make,

20
00:00:59,146 --> 00:01:03,056
which are actually the metrics that we
monitor over time to try to understand

21
00:01:03,056 --> 00:01:06,556
whether our models are still good or
it's time to relearn them from data.

22
00:01:06,556 --> 00:01:09,326
Now all these pieces
are really coming together.

23
00:01:09,326 --> 00:01:14,216
They're not just individual, separated
ones, but interacting pieces that we need

24
00:01:14,216 --> 00:01:18,796
to understand deeply and create a cycle of
improvement of the models that we have.

25
00:01:18,796 --> 00:01:19,976
So let's start with the first one.

26
00:01:19,976 --> 00:01:22,210
Let's start with deployment piece.

27
00:01:22,210 --> 00:01:24,030
Now consider the following setup.

28
00:01:24,030 --> 00:01:27,600
You're building a really cool new kind
of product recommending system, and

29
00:01:27,600 --> 00:01:31,810
you're using a bunch of data with
millions of product reviews and users.

30
00:01:31,810 --> 00:01:35,915
And you want to take the model that you
learned on your laptop, in your desktop or

31
00:01:35,915 --> 00:01:37,785
in the big cluster of machines.

32
00:01:37,785 --> 00:01:41,585
And deploy it on the website that
really interacts with your users.

33
00:01:41,585 --> 00:01:42,965
What does that look like?

34
00:01:42,965 --> 00:01:46,233
Well that whole system starts
with some historical data.

35
00:01:46,233 --> 00:01:50,035
Some data that you've collected
about the users of your system,

36
00:01:50,035 --> 00:01:52,205
the reviews they've written,
what preferences they have.

37
00:01:52,205 --> 00:01:57,010
And we take that data and
we feed it to train a model like we saw in

38
00:01:57,010 --> 00:02:02,710
the recommender systems module
that Emily taught, that process

39
00:02:02,710 --> 00:02:07,020
with historical data is usually done in
what's called a batch offline setting.

40
00:02:07,020 --> 00:02:10,820
You learn a model, say on your
desktop machine or in a cluster, and

41
00:02:10,820 --> 00:02:14,340
then take the model and deploy it,
for example, in the cloud, so

42
00:02:14,340 --> 00:02:16,970
that you can do what's
called serving predictions.

43
00:02:16,970 --> 00:02:20,570
This is the online,
real time part of the system.

44
00:02:20,570 --> 00:02:23,650
So, for example, if you have a website,
if this is what we're serving for,

45
00:02:23,650 --> 00:02:26,610
the website is going to give
me information about my users,

46
00:02:26,610 --> 00:02:28,870
what they're doing right now,
what pages they looked,

47
00:02:28,870 --> 00:02:30,920
what products they're
thinking about buying.

48
00:02:30,920 --> 00:02:34,820
And then from that, we're going to serve
predictions in real time to say oh,

49
00:02:34,820 --> 00:02:37,700
check out this giraffe chew toy.

50
00:02:37,700 --> 00:02:38,790
Remember that?

51
00:02:38,790 --> 00:02:42,600
Check that out as this might be something
that you want to buy right now.

52
00:02:42,600 --> 00:02:48,250
Now the user sees that giraffe offering
and they may or may not buy the giraffe.

53
00:02:48,250 --> 00:02:50,740
That's the kind of feedback we're
going to get back from the system.

54
00:02:50,740 --> 00:02:52,930
Did the user buy it or
did they not buy it?

55
00:02:52,930 --> 00:02:55,660
That's going to influence both,
our real time decisions.

56
00:02:55,660 --> 00:02:59,940
And go back to historical data and
really be collected in the long run so

57
00:02:59,940 --> 00:03:04,410
that we can improve our model as we
collect more and more data from the world.

58
00:03:04,410 --> 00:03:09,300
So that's what holistic deployment system
for machine learning might look like.

59
00:03:09,300 --> 00:03:13,010
[MUSIC]