You now know about linear regression and gradient descent. The plan from here on is to tell you about a couple of important extensions of these ideas. Concretely, here they are. First, it turns out that in order to solve this minimization problem, there is an algorithm for solving for theta zero and theta one exactly, without needing an iterative algorithm like gradient descent that we had to run over multiple iterations. There are advantages and disadvantages to this algorithm that lets you solve for theta zero and theta one basically in one shot. One advantage is that there is no longer a learning rate alpha that you need to worry about and set, and so it can be much faster for some problems. We'll talk about its advantages and disadvantages later.

Second, we'll also talk about algorithms for learning with a larger number of features. So far we've been learning with just one feature, the size of the house, and using that to predict the price; we're trying to take x and use it to predict y. But for other learning problems we may have a larger number of features. For example, suppose you know not only the size, but also the number of bedrooms, the number of floors, and the age of these houses, and you want to use all of that to predict the price. In that case we'll call these features x1, x2, x3, and x4. So now we have four features, and we want to use these four features to predict y, the price of the house.

It turns out that with multiple features, four of them in this case, it becomes harder to plot or visualize the data. For example, if we try to plot this type of data set, maybe we would have the vertical axis be the price, with one horizontal axis being the size of the house and the other the number of bedrooms. But that is only plotting my first two features, size and number of bedrooms. With the additional features I just don't know how to plot all of this data, because I would need a 4-dimensional or 5-dimensional figure, and I don't really know how to plot anything more than a 3-dimensional figure like what I have over here. Also, as you can tell, the notation starts to get a little more complicated: rather than just having x as our feature, we now have x1 through x4, and we're using these subscripts to denote my four different features.

It turns out that the best notation to keep all of this straight, and to understand what's going on with the data even when we don't quite know how to plot it, is the notation of linear algebra. Linear algebra gives us a notation and a set of operations that we can do with matrices and vectors. For example, here's a matrix where the first column is the sizes of the four houses, the second column is the number of bedrooms, the third is the number of floors, and the fourth is the age of the homes. A matrix is a block of numbers that lets me take all of my data, all of my x's, all of my features, and organize them efficiently into one big block of numbers like that. And here is what we call a vector in linear algebra, where the four numbers are the prices of the four houses that we saw on the previous slide.
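To make this concrete, here is a minimal NumPy sketch (not from the lecture itself) of organizing four hypothetical houses' features into a matrix X and their prices into a vector y, and then solving for the parameters in one shot with a direct least-squares solve, the kind of non-iterative approach mentioned above. The feature and price values, and the use of np.linalg.lstsq, are illustrative assumptions rather than the lecture's own example.

```python
import numpy as np

# Hypothetical data for four houses (the actual slide values aren't shown here):
# columns are size, number of bedrooms, number of floors, age of home.
X = np.array([
    [2104, 5, 1, 45],
    [1416, 3, 2, 40],
    [1534, 3, 2, 30],
    [852,  2, 1, 36],
], dtype=float)

# Vector of prices for the same four houses (illustrative numbers).
y = np.array([460.0, 232.0, 315.0, 178.0])

# Prepend a column of ones so theta_0 acts as the intercept term.
X_b = np.hstack([np.ones((X.shape[0], 1)), X])

# One-shot solve for theta: no learning rate alpha, no iterations.
# np.linalg.lstsq solves the least-squares problem directly.
theta, *_ = np.linalg.lstsq(X_b, y, rcond=None)
print(theta)  # [theta_0, theta_1, theta_2, theta_3, theta_4]
```

The point of the sketch is only the data layout: each row of X is one house, each column is one feature, and y lines up row for row with the prices, which is exactly the matrix-and-vector organization described above.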
So, in the next set of videos what I'm going to do is a quick review of linear algebra. If you haven't seen matrices and vectors before, so that everything on this slide is brand new to you, or if you've seen linear algebra before but it's been a while and you aren't completely familiar with it anymore, then please watch the next set of videos, and I'll quickly review the linear algebra you need in order to implement and use the more powerful versions of linear regression. It turns out linear algebra isn't just useful for linear regression models; these ideas of matrices and vectors will also be useful for helping us implement, and actually get computationally efficient implementations of, many later machine learning models. And as you can tell, these sorts of matrices and vectors give us an efficient way to organize large amounts of data when we work with larger training sets. So, in case you're not familiar with linear algebra, or in case linear algebra seems like a complicated, scary concept for those of you who've never seen it before, don't worry about it. It turns out that in order to implement machine learning algorithms we need only the very, very basics of linear algebra, and you'll be able to very quickly pick up everything you need to know in the next few videos. Concretely, to decide whether you should watch the next set of videos, here are the topics I'm going to cover: what matrices and vectors are; how to add, subtract, and multiply matrices and vectors; and the ideas of matrix inverses and transposes. And if you are still not sure whether you should watch the next set of videos, take a look at these two things: whether you know how to compute this quantity, a matrix transpose times another matrix, and whether you know how to compute the inverse of a matrix times a vector, minus a number times another vector. If these two things look completely familiar to you, then you can safely skip the optional set of videos on linear algebra. But if you're slightly uncertain what these blocks of numbers, these matrices of numbers, mean, then please take a look at the next set of videos, and they'll very quickly teach you what you need to know about linear algebra in order to program machine learning algorithms and deal with large amounts of data.
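If it helps to see those self-check quantities written out, here is a small NumPy sketch of the two expressions just mentioned, along with the basic add, subtract, and multiply operations the review covers. The matrices and numbers below are made-up examples, not values from the videos.

```python
import numpy as np

# Small illustrative matrices and vectors (arbitrary example values).
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[0.0, 1.0],
              [1.0, 0.0]])
u = np.array([1.0, 2.0])
v = np.array([3.0, 1.0])

# First self-check quantity: a matrix transpose times another matrix.
print(A.T @ B)

# Second self-check quantity: the inverse of a matrix times a vector,
# minus a number times another vector.
print(np.linalg.inv(A) @ u - 2.0 * v)

# Basic operations the review videos cover: add, subtract, multiply.
print(A + B)
print(A - B)
print(A @ B)
```

If expressions like A.T @ B and np.linalg.inv(A) @ u read as routine to you, that roughly corresponds to being able to skip the optional review.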