>> Okay, let's talk about this in the context of the bias-variance trade-off. What we saw is that when we had a very large lambda, we had a solution with very high bias but low variance. One way to see this is to think about cranking lambda all the way up to infinity: in that limit, the coefficients are shrunk to zero, and that is clearly a model with high bias but low variance. It's completely low variance; it doesn't change no matter what data you give me.

On the other hand, when we had a very small lambda, we have a model that is low bias but high variance. To see this, think about setting lambda to zero, in which case we get back just our old solution, our old least squares fit that minimizes the residual sum of squares. And there we see that for higher-complexity models you're clearly going to have low bias but high variance.

So what we see is that this lambda tuning parameter controls our model complexity, and so controls this bias-variance trade-off.

Okay, so let's return to our polynomial regression demo, but now using ridge regression, and see if we can ameliorate the issues of overfitting as we vary the choice of lambda. And so we're going to explore this ridge regression solution for a couple of different choices of this lambda tuning parameter.
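The lecture refers to a polynomial regression demo with ridge regression at several values of lambda. As a rough illustration of the same idea, here is a minimal sketch, not the course's own demo: it assumes a noisy synthetic dataset, a degree-15 polynomial, a few illustrative lambda values, and scikit-learn (where the penalty parameter is called alpha). Larger lambda shrinks the learned coefficients toward zero (high bias, low variance), while a lambda near zero approximates the unregularized least squares fit (low bias, high variance).

```python
# Minimal sketch (illustrative assumptions, not the course's demo):
# polynomial ridge regression for a few choices of the lambda penalty.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 1, 30))
y = np.sin(4 * x) + rng.normal(scale=0.3, size=x.shape)  # noisy sinusoid

# A lambda near zero approximates the plain least squares fit;
# a very large lambda shrinks the coefficients toward zero.
for lam in [1e-9, 1e-3, 1e2]:
    # scikit-learn's Ridge calls the penalty "alpha"; it plays the role of lambda.
    model = make_pipeline(
        PolynomialFeatures(degree=15, include_bias=False),
        Ridge(alpha=lam),
    )
    model.fit(x.reshape(-1, 1), y)
    coefs = model.named_steps["ridge"].coef_
    print(f"lambda={lam:g}  max |coefficient| = {np.abs(coefs).max():.3f}")
```

Running this prints the largest coefficient magnitude for each lambda, which should fall as lambda grows, mirroring the bias-variance discussion above.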