You've heard about orthogonalization, how to set up your dev and test sets, human-level performance as a proxy for Bayes error, and how to estimate your avoidable bias and variance. Let's pull it all together into a set of guidelines for how to improve the performance of your learning algorithm.

Getting a supervised learning algorithm to work well means fundamentally assuming that you can do two things. First, that you can fit the training set pretty well; you can think of this as roughly saying that you can achieve low avoidable bias. And second, that doing well on the training set generalizes pretty well to the dev set or the test set; this is sort of saying that the variance is not too bad. In the spirit of orthogonalization, there's one set of knobs you can use to fix avoidable bias issues, such as training a bigger network or training longer, and a separate set of knobs you can use to address variance problems, such as regularization or getting more training data.

So, to summarize the process we've seen in the last several videos: if you want to improve the performance of your machine learning system, I would recommend looking at the difference between your training error and your proxy for Bayes error; this gives you a sense of the avoidable bias. In other words, it tells you how much better you should be trying to do on your training set. Then look at the difference between your dev error and your training error as an estimate of how much of a variance problem you have; in other words, how much harder you should be working to make your performance generalize from the training set to the dev set, which your algorithm wasn't trained on explicitly.
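To make these two diagnostics concrete, here is a minimal sketch in Python. The error values are made up for illustration, and the decision rule at the end is just an example heuristic, not something prescribed in the course:

```python
# A minimal sketch of the bias/variance diagnostic described above.
# All error values are hypothetical; substitute your own measurements.

human_error = 0.01   # proxy for Bayes error (e.g., human-level error)
train_error = 0.08   # error on the training set
dev_error   = 0.10   # error on the dev set

avoidable_bias = train_error - human_error  # room to improve on the training set
variance       = dev_error - train_error    # degradation on data not trained on

print(f"Avoidable bias: {avoidable_bias:.2%}")
print(f"Variance:       {variance:.2%}")

# Orthogonalization: turn the knob that matches the larger problem.
if avoidable_bias > variance:
    print("Focus on bias: bigger network, train longer, better optimizer or architecture.")
else:
    print("Focus on variance: more data, regularization, architecture search.")
```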
So, to whatever extent you want to reduce avoidable bias, I would try to apply tactics like training a bigger model, so that you can just do better on your training set, or training longer, or using a better optimization algorithm, such as adding momentum or RMSprop, or using a better algorithm like Adam. Another thing you could try is to find a better neural network architecture or a better set of hyperparameters. This could include everything from changing the activation function to changing the number of layers or hidden units (although if you do that, it would be in the direction of increasing the model size), to trying out other model architectures, such as recurrent neural networks and convolutional neural networks, which we'll see in later courses. Whether or not a new neural network architecture will fit your training set better is sometimes hard to tell in advance, but sometimes you can get much better results with a better architecture.

Next, to the extent that you find that variance is a problem, some of the many techniques you could try include the following: you can try to get more data, because getting more data to train on could help you generalize better to dev set data that your algorithm didn't see; you can try regularization, which includes things like L2 regularization, dropout, or data augmentation, which we talked about in the previous course; or, once again, you can try searching over neural network architectures and hyperparameters to see if that helps you find an architecture that is better suited for your problem.

I think this notion of bias, or avoidable bias, and variance is one of those things that is easy to learn but tough to master. If you're able to systematically apply the concepts from this week's videos, you will actually be much more efficient, much more systematic, and much more strategic than a lot of machine learning teams in how you go about improving the performance of your machine learning system. This week's homework will let you practice and exercise your understanding of these concepts further. Best of luck with this week's homework, and I look forward to seeing you in next week's videos.
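As an illustration of how these knobs look in code, here is a hedged sketch using tf.keras; the course transcript doesn't name a specific framework, and all layer sizes and coefficients here are arbitrary assumptions chosen for the example. It shows a bias knob (a wider layer plus the Adam optimizer mentioned above) alongside two variance knobs (L2 regularization and dropout):

```python
# A sketch of the bias/variance knobs discussed above, using tf.keras.
# All sizes and coefficients are illustrative assumptions, not course values.
import tensorflow as tf

model = tf.keras.Sequential([
    # Bias knob: wider or deeper layers make a bigger model
    # that can fit the training set better.
    tf.keras.layers.Dense(
        128, activation="relu",
        # Variance knob: L2 regularization penalizes large weights.
        kernel_regularizer=tf.keras.regularizers.l2(0.01),
    ),
    # Variance knob: dropout randomly zeroes units during training.
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(10, activation="softmax"),
])

model.compile(
    # Bias knob: Adam, one of the better optimization algorithms mentioned above.
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
```

In the spirit of orthogonalization, the remaining tactics stay separate from the model definition: training longer means more epochs in `model.fit`, and getting more data or applying data augmentation changes only the training set you feed in.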