1
00:00:00,000 --> 00:00:03,075
[inaudible] teams often find it exciting to surpass

2
00:00:03,075 --> 00:00:07,305
human-level performance on the specific recreational classification task.

3
00:00:07,305 --> 00:00:12,355
Let's talk over some of the things you see if you try to accomplish this yourself.

4
00:00:12,355 --> 00:00:15,570
We've discussed before how machine learning progress gets

5
00:00:15,570 --> 00:00:19,895
harder as you approach or even surpass human-level performance.

6
00:00:19,895 --> 00:00:23,480
Let's talk over one more example of why that's the case.

7
00:00:23,480 --> 00:00:26,820
Let's say you have a problem where a team of humans discussing and

8
00:00:26,820 --> 00:00:30,105
debating achieves 0.5% error,

9
00:00:30,105 --> 00:00:32,465
a single human 1% error,

10
00:00:32,465 --> 00:00:38,785
and you have an algorithm of 0.6% training error and 0.8% dev error.

11
00:00:38,785 --> 00:00:40,440
So in this case,

12
00:00:40,440 --> 00:00:46,093
what is the avoidable bias?

13
00:00:46,093 --> 00:00:50,460
So this one is relatively easier to answer,

14
00:00:50,460 --> 00:00:53,420
0.5% is your estimate of base error,

15
00:00:53,420 --> 00:00:54,990
so your avoidable bias is,

16
00:00:54,990 --> 00:00:57,360
you're not going to use this 1% number as reference,

17
00:00:57,360 --> 00:00:58,625
you can use this difference,

18
00:00:58,625 --> 00:01:06,360
so maybe you estimate your avoidable bias is at least 0.1% and your variance as 0.2%.

19
00:01:06,360 --> 00:01:13,805
So there's maybe more to do to reduce your variance than your avoidable bias perhaps.

20
00:01:13,805 --> 00:01:16,795
But now let's take a harder example, let's say,

21
00:01:16,795 --> 00:01:20,730
a team of humans and single human performance, the same as before,

22
00:01:20,730 --> 00:01:24,300
but your algorithm gets 0.3% training error,

23
00:01:24,300 --> 00:01:28,635
and 0.4% dev error.

24
00:01:28,635 --> 00:01:31,405
Now, what is the avoidable bias?

25
00:01:31,405 --> 00:01:34,425
It's now actually much harder to answer that.

26
00:01:34,425 --> 00:01:36,940
Is the fact that your training error,

27
00:01:36,940 --> 00:01:41,415
0.3%, does this mean you've over-fitted by 0.2%,

28
00:01:41,415 --> 00:01:44,505
or is base error, actually 0.1%,

29
00:01:44,505 --> 00:01:46,740
or maybe is base error 0.2%,

30
00:01:46,740 --> 00:01:49,455
or maybe base error is 0.3%?

31
00:01:49,455 --> 00:01:51,564
You don't really know,

32
00:01:51,564 --> 00:01:56,935
but based on the information given in this example,

33
00:01:56,935 --> 00:02:01,150
you actually don't have enough information

34
00:02:01,150 --> 00:02:05,845
to tell if you should focus on reducing bias or reducing variance in your algorithm.

35
00:02:05,845 --> 00:02:10,810
So that slows down the efficiency where you should make progress.

36
00:02:10,810 --> 00:02:15,005
Moreover, if your error is already better than

37
00:02:15,005 --> 00:02:20,020
even a team of humans looking at and discussing and debating the right label,

38
00:02:20,020 --> 00:02:25,150
for an example, then it's just also harder to rely on human intuition to

39
00:02:25,150 --> 00:02:27,939
tell your algorithm what are ways that your algorithm could

40
00:02:27,939 --> 00:02:30,970
still improve the performance?

41
00:02:30,970 --> 00:02:32,275
So in this example,

42
00:02:32,275 --> 00:02:35,950
once you've surpassed this 0.5% threshold,

43
00:02:35,950 --> 00:02:38,920
your options, your ways of making progress on

44
00:02:38,920 --> 00:02:43,540
the machine learning problem are just less clear.

45
00:02:43,540 --> 00:02:45,207
It doesn't mean you can't make progress,

46
00:02:45,207 --> 00:02:48,655
you might still be able to make significant progress,

47
00:02:48,655 --> 00:02:51,130
but some of the tools you have for

48
00:02:51,130 --> 00:02:55,720
pointing you in a clear direction just don't work as well.

49
00:02:55,720 --> 00:02:58,690
Now, there are many problems where machine learning

50
00:02:58,690 --> 00:03:02,365
significantly surpasses human-level performance.

51
00:03:02,365 --> 00:03:03,970
For example, I think,

52
00:03:03,970 --> 00:03:08,515
online advertising, estimating how likely someone is to click on that.

53
00:03:08,515 --> 00:03:12,685
Probably, learning algorithms do that much better today than any human could,

54
00:03:12,685 --> 00:03:14,505
or making product recommendations,

55
00:03:14,505 --> 00:03:17,690
recommending movies or books to you.

56
00:03:17,690 --> 00:03:20,290
I think that web sites today can do that much

57
00:03:20,290 --> 00:03:23,510
better than maybe even your closest friends can.

58
00:03:23,510 --> 00:03:26,800
All logistics predicting how long will take you to drive from A to

59
00:03:26,800 --> 00:03:30,580
B or predicting how long to take a delivery vehicle to drive from A to B,

60
00:03:30,580 --> 00:03:34,795
or trying to predict whether someone will repay a loan,

61
00:03:34,795 --> 00:03:39,155
and therefore, whether or not you should approve a loan offer.

62
00:03:39,155 --> 00:03:42,430
All of these are problems where I think today machine

63
00:03:42,430 --> 00:03:46,930
learning far surpasses a single human's performance.

64
00:03:46,930 --> 00:03:49,450
Notice something about these four examples.

65
00:03:49,450 --> 00:03:53,970
All four of these examples are actually learning from structured data,

66
00:03:53,970 --> 00:03:58,250
where you might have a database of what has users clicked on,

67
00:03:58,250 --> 00:04:00,520
database of proper support for,

68
00:04:00,520 --> 00:04:03,022
databases of how long it takes to get from A to B,

69
00:04:03,022 --> 00:04:07,825
database of previous loan applications and their outcomes.

70
00:04:07,825 --> 00:04:11,740
And these are not natural perception problems,

71
00:04:11,740 --> 00:04:14,395
so these are not computer vision,

72
00:04:14,395 --> 00:04:18,980
or speech recognition, or natural language processing task.

73
00:04:18,980 --> 00:04:23,260
Humans tend to be very good in natural perception task.

74
00:04:23,260 --> 00:04:25,090
So it is possible,

75
00:04:25,090 --> 00:04:27,580
but it's just a bit harder for computers to

76
00:04:27,580 --> 00:04:31,275
surpass human-level performance on natural perception task.

77
00:04:31,275 --> 00:04:34,360
And finally, all of these are problems where there are

78
00:04:34,360 --> 00:04:38,555
teams that have access to huge amounts of data.

79
00:04:38,555 --> 00:04:43,480
So for example, the best systems for all four of these applications have probably

80
00:04:43,480 --> 00:04:49,075
looked at far more data of that application than any human could possibly look at.

81
00:04:49,075 --> 00:04:51,910
And so, that's also made it relatively

82
00:04:51,910 --> 00:04:56,450
easy for a computer to surpass human-level performance.

83
00:04:56,450 --> 00:04:59,910
Now, the fact that there's so much data that computer could examine,

84
00:04:59,910 --> 00:05:04,525
so it can petrifies that's called patterns than even the human mind.

85
00:05:04,525 --> 00:05:06,650
Other than these problems,

86
00:05:06,650 --> 00:05:12,340
today there are speech recognition systems that can surpass human-level performance.

87
00:05:12,340 --> 00:05:15,340
And there are also some computer vision,

88
00:05:15,340 --> 00:05:17,740
some image recognition tasks,

89
00:05:17,740 --> 00:05:21,670
where computers have surpassed human-level performance.

90
00:05:21,670 --> 00:05:25,060
But because humans are very good at this natural perception task,

91
00:05:25,060 --> 00:05:28,160
I think it was harder for computers to get there.

92
00:05:28,160 --> 00:05:30,595
And then there are some medical tasks,

93
00:05:30,595 --> 00:05:34,585
for example, reading ECGs or diagnosing skin cancer,

94
00:05:34,585 --> 00:05:37,570
or certain narrow radiology task,

95
00:05:37,570 --> 00:05:40,300
where computers are getting really good and

96
00:05:40,300 --> 00:05:44,310
maybe surpassing a single human-level performance.

97
00:05:44,310 --> 00:05:46,510
And I guess one of the exciting things about

98
00:05:46,510 --> 00:05:48,970
recent advances in deep learning is that even for

99
00:05:48,970 --> 00:05:53,935
these tasks we can now surpass human-level performance in some cases,

100
00:05:53,935 --> 00:05:56,650
but it has been a bit harder because humans

101
00:05:56,650 --> 00:06:00,070
tend to be very good at this natural perception task.

102
00:06:00,070 --> 00:06:04,285
So surpassing human-level performance is often not easy,

103
00:06:04,285 --> 00:06:08,290
but given enough data there've been lots of deep learning systems

104
00:06:08,290 --> 00:06:12,455
have surpassed human-level performance on a single supervisory problem.

105
00:06:12,455 --> 00:06:15,120
So that makes sense for an application you're working on.

106
00:06:15,120 --> 00:06:17,660
I hope that maybe someday you manage to get

107
00:06:17,660 --> 00:06:21,300
your deep learning system to also surpass human-level performance.