1 00:00:00,000 --> 00:00:03,075 [inaudible] teams often find it exciting to surpass 2 00:00:03,075 --> 00:00:07,305 human-level performance on a specific recognition or classification task. 3 00:00:07,305 --> 00:00:12,355 Let's talk over some of the things you see if you try to accomplish this yourself. 4 00:00:12,355 --> 00:00:15,570 We've discussed before how machine learning progress gets 5 00:00:15,570 --> 00:00:19,895 harder as you approach or even surpass human-level performance. 6 00:00:19,895 --> 00:00:23,480 Let's talk over one more example of why that's the case. 7 00:00:23,480 --> 00:00:26,820 Let's say you have a problem where a team of humans discussing and 8 00:00:26,820 --> 00:00:30,105 debating achieves 0.5% error, 9 00:00:30,105 --> 00:00:32,465 a single human 1% error, 10 00:00:32,465 --> 00:00:38,785 and you have an algorithm with 0.6% training error and 0.8% dev error. 11 00:00:38,785 --> 00:00:40,440 So in this case, 12 00:00:40,440 --> 00:00:46,093 what is the avoidable bias? 13 00:00:46,093 --> 00:00:50,460 So this one is relatively easy to answer: 14 00:00:50,460 --> 00:00:53,420 0.5% is your estimate of Bayes error, 15 00:00:53,420 --> 00:00:54,990 so your avoidable bias is, 16 00:00:54,990 --> 00:00:57,360 you're not going to use this 1% number as reference, 17 00:00:57,360 --> 00:00:58,625 you can use this difference, 18 00:00:58,625 --> 00:01:06,360 so maybe you estimate your avoidable bias is at least 0.1% and your variance is 0.2%. 19 00:01:06,360 --> 00:01:13,805 So there's maybe more to do to reduce your variance than your avoidable bias. 20 00:01:13,805 --> 00:01:16,795 But now let's take a harder example. Let's say 21 00:01:16,795 --> 00:01:20,730 team-of-humans and single-human performance are the same as before, 22 00:01:20,730 --> 00:01:24,300 but your algorithm gets 0.3% training error 23 00:01:24,300 --> 00:01:28,635 and 0.4% dev error. 24 00:01:28,635 --> 00:01:31,405 Now, what is the avoidable bias? 
25 00:01:31,405 --> 00:01:34,425 It's now actually much harder to answer that. 26 00:01:34,425 --> 00:01:36,940 Does the fact that your training error is 27 00:01:36,940 --> 00:01:41,415 0.3% mean you've overfitted by 0.2%, 28 00:01:41,415 --> 00:01:44,505 or is Bayes error actually 0.1%, 29 00:01:44,505 --> 00:01:46,740 or maybe is Bayes error 0.2%, 30 00:01:46,740 --> 00:01:49,455 or maybe Bayes error is 0.3%? 31 00:01:49,455 --> 00:01:51,564 You don't really know, 32 00:01:51,564 --> 00:01:56,935 and based on the information given in this example, 33 00:01:56,935 --> 00:02:01,150 you actually don't have enough information 34 00:02:01,150 --> 00:02:05,845 to tell if you should focus on reducing bias or reducing variance in your algorithm. 35 00:02:05,845 --> 00:02:10,810 So that slows down how efficiently you can make progress. 36 00:02:10,810 --> 00:02:15,005 Moreover, if your error is already better than 37 00:02:15,005 --> 00:02:20,020 even a team of humans looking at and discussing and debating the right label 38 00:02:20,020 --> 00:02:25,150 for an example, then it's also harder to rely on human intuition to 39 00:02:25,150 --> 00:02:27,939 tell you in what ways your algorithm could 40 00:02:27,939 --> 00:02:30,970 still improve its performance. 41 00:02:30,970 --> 00:02:32,275 So in this example, 42 00:02:32,275 --> 00:02:35,950 once you've surpassed this 0.5% threshold, 43 00:02:35,950 --> 00:02:38,920 your options, your ways of making progress on 44 00:02:38,920 --> 00:02:43,540 the machine learning problem are just less clear. 45 00:02:43,540 --> 00:02:45,207 It doesn't mean you can't make progress, 46 00:02:45,207 --> 00:02:48,655 you might still be able to make significant progress, 47 00:02:48,655 --> 00:02:51,130 but some of the tools you have for 48 00:02:51,130 --> 00:02:55,720 pointing you in a clear direction just don't work as well. 
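The arithmetic in the two examples above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `diagnose` and its `None` convention are hypothetical and not from the lecture, and the team-of-humans error is used as the proxy for Bayes error, as in the examples. Because human-level error can only overestimate Bayes error, the bias figure it returns is a lower bound ("at least 0.1%").

```python
from typing import Optional, Tuple


def diagnose(bayes_estimate: float, train_err: float,
             dev_err: float) -> Tuple[Optional[float], float]:
    """Return (avoidable_bias, variance) in percentage points.

    Hypothetical helper for illustration. avoidable_bias is None when
    training error is already below the Bayes error estimate: the
    estimate is then known to be too high, so the gap cannot be read
    as avoidable bias.
    """
    # Variance is the gap between dev error and training error.
    variance = round(dev_err - train_err, 6)
    if train_err < bayes_estimate:
        return None, variance
    # Avoidable bias is the gap between training error and the Bayes estimate.
    return round(train_err - bayes_estimate, 6), variance


# Best human-level error (team of humans), used as a proxy for Bayes error.
team_of_humans = 0.5

# First example: 0.6% training error, 0.8% dev error.
print(diagnose(team_of_humans, 0.6, 0.8))  # (0.1, 0.2)

# Harder example: 0.3% training error, 0.4% dev error.
print(diagnose(team_of_humans, 0.3, 0.4))  # (None, 0.1)
```

In the first case variance (0.2%) is larger than the avoidable bias estimate (0.1%), so variance reduction looks like the better bet. In the second case the bias estimate is undefined, which is exactly why the choice between bias and variance tactics becomes unclear once you pass the 0.5% threshold.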
49 00:02:55,720 --> 00:02:58,690 Now, there are many problems where machine learning 50 00:02:58,690 --> 00:03:02,365 significantly surpasses human-level performance. 51 00:03:02,365 --> 00:03:03,970 For example, I think, 52 00:03:03,970 --> 00:03:08,515 online advertising, estimating how likely someone is to click on an ad. 53 00:03:08,515 --> 00:03:12,685 Probably, learning algorithms do that much better today than any human could, 54 00:03:12,685 --> 00:03:14,505 or making product recommendations, 55 00:03:14,505 --> 00:03:17,690 recommending movies or books to you. 56 00:03:17,690 --> 00:03:20,290 I think that websites today can do that much 57 00:03:20,290 --> 00:03:23,510 better than maybe even your closest friends can. 58 00:03:23,510 --> 00:03:26,800 Or logistics, predicting how long it will take you to drive from A to 59 00:03:26,800 --> 00:03:30,580 B or predicting how long it takes a delivery vehicle to drive from A to B, 60 00:03:30,580 --> 00:03:34,795 or trying to predict whether someone will repay a loan, 61 00:03:34,795 --> 00:03:39,155 and therefore, whether or not you should approve a loan offer. 62 00:03:39,155 --> 00:03:42,430 All of these are problems where I think today machine 63 00:03:42,430 --> 00:03:46,930 learning far surpasses a single human's performance. 64 00:03:46,930 --> 00:03:49,450 Notice something about these four examples. 65 00:03:49,450 --> 00:03:53,970 All four of these examples are actually learning from structured data, 66 00:03:53,970 --> 00:03:58,250 where you might have a database of what users have clicked on, 67 00:03:58,250 --> 00:04:00,520 a database of products you've bought before, 68 00:04:00,520 --> 00:04:03,022 databases of how long it takes to get from A to B, 69 00:04:03,022 --> 00:04:07,825 a database of previous loan applications and their outcomes. 
70 00:04:07,825 --> 00:04:11,740 And these are not natural perception problems, 71 00:04:11,740 --> 00:04:14,395 so these are not computer vision, 72 00:04:14,395 --> 00:04:18,980 or speech recognition, or natural language processing tasks. 73 00:04:18,980 --> 00:04:23,260 Humans tend to be very good at natural perception tasks. 74 00:04:23,260 --> 00:04:25,090 So it is possible, 75 00:04:25,090 --> 00:04:27,580 but it's just a bit harder for computers to 76 00:04:27,580 --> 00:04:31,275 surpass human-level performance on natural perception tasks. 77 00:04:31,275 --> 00:04:34,360 And finally, all of these are problems where there are 78 00:04:34,360 --> 00:04:38,555 teams that have access to huge amounts of data. 79 00:04:38,555 --> 00:04:43,480 So for example, the best systems for all four of these applications have probably 80 00:04:43,480 --> 00:04:49,075 looked at far more data of that application than any human could possibly look at. 81 00:04:49,075 --> 00:04:51,910 And so, that's also made it relatively 82 00:04:51,910 --> 00:04:56,450 easy for a computer to surpass human-level performance. 83 00:04:56,450 --> 00:04:59,910 Now, because there's so much data that a computer can examine, 84 00:04:59,910 --> 00:05:04,525 it can be better at ferreting out statistical patterns than even the human mind. 85 00:05:04,525 --> 00:05:06,650 Other than these problems, 86 00:05:06,650 --> 00:05:12,340 today there are speech recognition systems that can surpass human-level performance. 87 00:05:12,340 --> 00:05:15,340 And there are also some computer vision, 88 00:05:15,340 --> 00:05:17,740 some image recognition tasks, 89 00:05:17,740 --> 00:05:21,670 where computers have surpassed human-level performance. 90 00:05:21,670 --> 00:05:25,060 But because humans are very good at these natural perception tasks, 91 00:05:25,060 --> 00:05:28,160 I think it was harder for computers to get there. 
92 00:05:28,160 --> 00:05:30,595 And then there are some medical tasks, 93 00:05:30,595 --> 00:05:34,585 for example, reading ECGs or diagnosing skin cancer, 94 00:05:34,585 --> 00:05:37,570 or certain narrow radiology tasks, 95 00:05:37,570 --> 00:05:40,300 where computers are getting really good and 96 00:05:40,300 --> 00:05:44,310 maybe surpassing a single human's performance. 97 00:05:44,310 --> 00:05:46,510 And I guess one of the exciting things about 98 00:05:46,510 --> 00:05:48,970 recent advances in deep learning is that even for 99 00:05:48,970 --> 00:05:53,935 these tasks we can now surpass human-level performance in some cases, 100 00:05:53,935 --> 00:05:56,650 but it has been a bit harder because humans 101 00:05:56,650 --> 00:06:00,070 tend to be very good at these natural perception tasks. 102 00:06:00,070 --> 00:06:04,285 So surpassing human-level performance is often not easy, 103 00:06:04,285 --> 00:06:08,290 but given enough data, lots of deep learning systems 104 00:06:08,290 --> 00:06:12,455 have surpassed human-level performance on a single supervised learning problem. 105 00:06:12,455 --> 00:06:15,120 So if that makes sense for an application you're working on, 106 00:06:15,120 --> 00:06:17,660 I hope that maybe someday you manage to get 107 00:06:17,660 --> 00:06:21,300 your deep learning system to also surpass human-level performance.