You've heard about orthogonalization, how to set up your dev and test sets, human-level performance as a proxy for Bayes error, and how to estimate your avoidable bias and variance. Let's pull it all together into a set of guidelines for how to improve the performance of your learning algorithm.

Getting a supervised learning algorithm to work well means fundamentally assuming that you can do two things. First, that you can fit the training set pretty well; you can think of this as roughly saying that you can achieve low avoidable bias. And second, that doing well on the training set generalizes pretty well to the dev set or the test set; this is sort of saying that the variance is not too bad. In the spirit of orthogonalization, there's one set of knobs you can use to fix avoidable bias issues, such as training a bigger network or training longer, and a separate set of knobs you can use to address variance problems, such as regularization or getting more training data.

So, to summarize the process we've seen in the last several videos: if you want to improve the performance of your machine learning system, I would recommend looking at the difference between your training error and your proxy for Bayes error; this gives you a sense of the avoidable bias. In other words, it tells you how much better you should be trying to do on your training set. Then look at the difference between your dev error and your training error as an estimate of how much of a variance problem you have; in other words, how much harder you should be working to make your performance generalize from the training set to the dev set, which your algorithm wasn't trained on explicitly.
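To make these two diagnostics concrete, here is a minimal sketch in Python. The error values are made up for illustration, and the decision rule at the end is just an example heuristic, not something prescribed in the course:

```python
# A minimal sketch of the bias/variance diagnostic described above.
# All error values are hypothetical; substitute your own measurements.

human_error = 0.01   # proxy for Bayes error (e.g., human-level error)
train_error = 0.08   # error on the training set
dev_error   = 0.10   # error on the dev set

avoidable_bias = train_error - human_error  # room to improve on the training set
variance       = dev_error - train_error    # degradation on data not trained on

print(f"Avoidable bias: {avoidable_bias:.2%}")
print(f"Variance:       {variance:.2%}")

# Orthogonalization: turn the knob that matches the larger problem.
if avoidable_bias > variance:
    print("Focus on bias: bigger network, train longer, better optimizer or architecture.")
else:
    print("Focus on variance: more data, regularization, architecture search.")
```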
So, to whatever extent you want to reduce avoidable bias, I would try to apply tactics like training a bigger model, so that you can just do better on your training set, or training longer, or using a better optimization algorithm, such as adding momentum or RMSprop, or using a better algorithm like Adam. Another thing you could try is to find a better neural network architecture or a better set of hyperparameters. This could include everything from changing the activation function to changing the number of layers or hidden units (although if you do that, it would be in the direction of increasing the model size), to trying out other model architectures, such as recurrent neural networks and convolutional neural networks, which we'll see in later courses. Whether or not a new neural network architecture will fit your training set better is sometimes hard to tell in advance, but sometimes you can get much better results with a better architecture.

Next, to the extent that you find that variance is a problem, some of the many techniques you could try include the following: you can try to get more data, because getting more data to train on could help you generalize better to dev set data that your algorithm didn't see; you can try regularization, which includes things like L2 regularization, dropout, or data augmentation, which we talked about in the previous course; or, once again, you can try searching over neural network architectures and hyperparameters to see if that helps you find an architecture that is better suited for your problem.

I think this notion of bias, or avoidable bias, and variance is one of those things that is easy to learn but tough to master. If you're able to systematically apply the concepts from this week's videos, you will actually be much more efficient, much more systematic, and much more strategic than a lot of machine learning teams in how you go about improving the performance of your machine learning system. This week's homework will let you practice and exercise your understanding of these concepts further. Best of luck with this week's homework, and I look forward to seeing you in next week's videos.
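As an illustration of how these knobs look in code, here is a hedged sketch using tf.keras; the course transcript doesn't name a specific framework, and all layer sizes and coefficients here are arbitrary assumptions chosen for the example. It shows a bias knob (a wider layer plus the Adam optimizer mentioned above) alongside two variance knobs (L2 regularization and dropout):

```python
# A sketch of the bias/variance knobs discussed above, using tf.keras.
# All sizes and coefficients are illustrative assumptions, not course values.
import tensorflow as tf

model = tf.keras.Sequential([
    # Bias knob: wider or deeper layers make a bigger model
    # that can fit the training set better.
    tf.keras.layers.Dense(
        128, activation="relu",
        # Variance knob: L2 regularization penalizes large weights.
        kernel_regularizer=tf.keras.regularizers.l2(0.01),
    ),
    # Variance knob: dropout randomly zeroes units during training.
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(10, activation="softmax"),
])

model.compile(
    # Bias knob: Adam, one of the better optimization algorithms mentioned above.
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
```

In the spirit of orthogonalization, the remaining tactics stay separate from the model definition: training longer means more epochs in `model.fit`, and getting more data or applying data augmentation changes only the training set you feed in.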