If you run a learning algorithm and it doesn't do as well as you were hoping, almost all the time it will be because you have either a high bias problem or a high variance problem; in other words, either an underfitting problem or an overfitting problem. In this case it's very important to figure out which of these two problems you have, bias or variance or a bit of both, because knowing which of them is happening gives a very strong indicator of which ways of trying to improve your algorithm are likely to be useful and promising. In this video, I'd like to delve more deeply into this bias and variance issue, understand it better, and also figure out how to evaluate whether we might have a bias problem or a variance problem, since this is critical to figuring out how to improve the performance of the learning algorithms that you implement. You've already seen this figure a few times: if you fit too simple a hypothesis, like a straight line, it underfits the data; if you fit too complex a hypothesis, it might fit the training set perfectly but overfit the data; and a hypothesis of some intermediate level of complexity, maybe a degree-two polynomial, not too low and not too high a degree, is just right and gives you the best generalization error out of these options. Now that we're armed with the notion of training, validation, and test sets, we can understand the concepts of bias and variance a little bit better. Concretely, let our training error and cross-validation error be defined as in the previous videos, that is, the average squared error as measured on the training set or as measured on the cross-validation set. Now let's plot the following figure. On the horizontal axis I'm going to plot the degree of the polynomial d, so as I go to the right I'm going to be fitting higher and higher order polynomials. Toward the left of this figure, where maybe d equals 1, we're going to be fitting very simple functions, whereas toward the right, where d equals 4 or maybe an even larger number, we're going to be fitting very high order polynomials, so larger values of d on the horizontal axis correspond to fitting much more complex functions to your training set. Let's look at the training error and the cross-validation error and plot them on this figure. Let's start with the training error. As we increase the degree of the polynomial, we're going to fit the training set better and better, so if d equals 1 that corresponds to a high training error, whereas if we have a very high degree polynomial, our training error is going to be really low, maybe even zero, because it will fit the training set really well. So as we increase the degree of the polynomial, we find typically that the training error decreases, and I'm going to write J subscript train of theta there, because our training error tends to decrease with the degree of the polynomial that we fit to the data. Next, let's look at the cross-validation error. For that matter, if we look at the test set error, we'll get a pretty similar result as if we were to plot the cross-validation error.
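To make this plot concrete, here is a minimal sketch in Python of the computation being described: fit polynomials of increasing degree d on the training set only, then measure the average squared error on both the training set and the cross-validation set. The toy data, the range of degrees, and the helper name are made up for illustration; they are not from the lecture.

```python
import numpy as np

def average_squared_error(theta, x, y):
    """Average squared error, (1/(2m)) * sum((h(x) - y)^2), for a polynomial hypothesis."""
    predictions = np.polyval(theta, x)
    return np.mean((predictions - y) ** 2) / 2.0

# Hypothetical toy data split into a training set and a cross-validation set.
rng = np.random.default_rng(0)
x_train = np.sort(rng.uniform(0.0, 3.0, 30))
y_train = np.sin(x_train) + 0.1 * rng.normal(size=30)
x_cv = np.sort(rng.uniform(0.0, 3.0, 20))
y_cv = np.sin(x_cv) + 0.1 * rng.normal(size=20)

for d in range(1, 7):                        # degree of polynomial, the horizontal axis of the plot
    theta = np.polyfit(x_train, y_train, d)  # fit the degree-d polynomial to the training set only
    j_train = average_squared_error(theta, x_train, y_train)
    j_cv = average_squared_error(theta, x_cv, y_cv)
    print(f"d={d}  J_train={j_train:.4f}  J_cv={j_cv:.4f}")
```

If you tabulate or plot these numbers against d, you should see the behavior described in the lecture: J train keeps decreasing as d grows, while J cv first falls and then rises again once the polynomial starts to overfit.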
So we know that if d equals 1, we're fitting a very simple function, and so we may be underfitting the training set, and we're going to get a very high cross-validation error. If we fit an intermediate degree polynomial, like d equals 2 in our example from the previous slide, we're going to have a much lower cross-validation error, because we're finding a much better fit to the data. Conversely, if d were too high, say a value of four, then we're again overfitting, and so we end up with a high value for the cross-validation error. So if you were to vary d smoothly and plot a curve, you might end up with a curve like that for J subscript cv of theta, and again if you plot J subscript test of theta you get something very similar. This sort of plot also helps us to better understand the notions of bias and variance. Concretely, suppose you have applied a learning algorithm and it's not performing as well as you were hoping, so your cross-validation set error or your test set error is high. How can we figure out if the learning algorithm is suffering from high bias or if it's suffering from high variance? The setting of the cross-validation error being high corresponds to either the regime on the left of this plot or the regime on the right. The regime on the left corresponds to a high bias problem, that is, fitting an overly low order polynomial, such as d equals 1, when we really needed a higher order polynomial to fit the data. In contrast, the regime on the right corresponds to a high variance problem, that is, when d, the degree of the polynomial, was too large for the data set that we have. And this figure gives us a clue for how to distinguish between these two cases. Concretely, in the high bias case, that is, the case of underfitting, what we find is that both the cross-validation error and the training error are going to be high. So if your algorithm is suffering from a bias problem, the training set error will be high, and you may find that the cross-validation error is also high: it might be close to, maybe just slightly higher than, the training error. If you see this combination, that's a sign that your algorithm may be suffering from high bias. In contrast, if your algorithm is suffering from high variance, then J train, that is the training error, is going to be low; that is, you're fitting the training set very well. Whereas your error on the cross-validation set, the cost measured on the cross-validation set, will be much bigger than your training set error. The double greater-than sign here is the math symbol for "much greater than," denoted by two greater-than signs. So if you see this combination of values, that is a clue that your learning algorithm may be suffering from high variance and might be overfitting. And the key that distinguishes these two cases is that if you have a high bias problem, your training set error will also be high, as your hypothesis is just not fitting the training set well.
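As a rough illustration of this decision rule, and only as an illustration, here is one way you might encode the two regimes in code. The threshold values `high_error` and `gap_ratio` are arbitrary assumptions for this sketch; in practice "high" has to be judged relative to the error you would consider acceptable for your problem.

```python
def diagnose(j_train, j_cv, high_error=0.5, gap_ratio=2.0):
    """Heuristic reading of the two regimes: both errors high -> high bias;
    training error low but cross-validation error much larger -> high variance.
    Thresholds are made up for this example."""
    if j_train >= high_error and j_cv >= high_error:
        return "high bias (underfitting): J_train and J_cv are both high and close together"
    if j_train < high_error and j_cv >= gap_ratio * j_train:
        return "high variance (overfitting): J_train is low but J_cv >> J_train"
    return "neither regime clearly; the fit may be about right"

print(diagnose(j_train=0.90, j_cv=1.00))   # both high -> high bias
print(diagnose(j_train=0.05, j_cv=0.80))   # low train, much higher cv -> high variance
```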
And if you have a high variance problem, your training set error will usually be low, that is, much lower than your cross-validation error. So hopefully that gives you a somewhat better understanding of the two problems of bias and variance. I still have a lot more to say about bias and variance in the next few videos, and I'll show you even more details on how to diagnose them there. But what we'll see later is that by figuring out whether a learning algorithm may be suffering from high bias, high variance, or a combination of both, we get much better guidance for what might be promising things to try in order to improve the performance of the learning algorithm.