In this video, I want to talk about the normal equation and non-invertibility. This is a somewhat more advanced concept, but it is something that I've often been asked about. And so I wanted to talk about it here. But this is a somewhat more advanced concept, so feel free to consider this optional material There's a phenomenon that you may run into that's maybe for some of you useful to understand. But even if you don't understand it, the normal equation and linear regression, you should really get that to work okay. Here's the issue: For those of you that are maybe somewhat more familar with linear algebra, what some students have asked me is, when computing this theta equals ( Xtranspose X )inverse Xtranspose y what if the matrix Xtranspose X is non-invertible? So, for those of you that know a bit more linear algebra you may know that only some matrices are invertible and some matrices do not have an inverse we call those non-invertible matrices, singular or degenerate matrices. The issue or the problem of Xtranpose X being non-invertible should happen pretty rarely. And in Octave, if you implement this to compute theta, it turns out that this will actually do the right thing. I'm getting a little bit technical now and I don't want to go into details, but Octave has two functions for inverting matrices: One is called pinv(), and the other is called inv(). The differences between these two are somewhat technical. One's called the pseudo-inverse, one's called the inverse. You can show mathemically so as long as you use the pinv() function, then this will actually compute the value of theta that you want, even if Xtranspose X is non-invertible. The specific details between what is the difference between pinv() and what is inv() that is somewhat advanced numerical computing concepts, that I don't really want to get into. But I thought in this optional video I try to give you a little bit of intuition about what it means that Xtranspose X to be non-invertible. For those of you that know a bit more linear algebra and might be interested. I'm not going to proove this mathematically, but if Xtranspose X is non-invertible, there are usually two most common causes: The first cause is if somehow, in your learning problem, you have redundant features, concretely, if you try to predict housing prices and if x1 is the size of a house in square-feet, and x2 is the size of the house in square-meters, then, you know, 1 meter is equal to 3.28 feet, rounded to two decimals, and so your two features will always satisfy the constraint that x1 equals 3(.28)^2 times x2. And you can show, for those of you - this is somehwat advanced linear algebra now, but if you're an expert in linear algebra, you can actually show that if your two features are related via a linear equation like this, then matrix Xtranspose X will be non-invertible. The second thing that can cause Xtranspose X to be non-invertible is if you're trying to run a learning algorithm with a lot of a features. Concretely, if m is less than or equal to n. For example, if you imagine that you have m equals 10 training examples and that you have n equals 100 features, then you're trying to fit a parameter vector theta, which is (n+1)-dimensional, so it's a 101-dimensional you're trying to fit a 101 parameters from just 10 training examples. And this turns out to sometimes work, but to not always be a good idea. Because, as we see later, you might not have enough data if you only have 10 examples to fit 100 or 101 parameters. We'll see later in this course, why this might be too little data to fit this many parameters. But commonly, what we do then if m is less than n, is to see if we can either delete some features or to use a technique called regularization, which is something that we will talk about a bit later in this course as well, that will kind of let you fit a lot of parameters using a lot of features even if you have a relatively small training set. But this regularization will be a later topic in this course. But to summarize, if ever you find that Xtranspose X is singular or alternatively find is non-invertible, what I would recommend you do is first: look at your features and see if you have redundant features like these x1 and x2 being linearly dependent, or being a linear function of each other, like so and if you do have redundant features and if you just delete one of these features - you really don't need both of these features, so if you just delete one of these features that will solve your non-invertibility problem and, so first think through my features and check if any are redundant and if so, then, you know, keep deleting the redundant features until they are no longer redundant. And if your features are non redundant, I would check if I might have too many features, and if that's the case I would either delete some features if I can bare to use fewer features, or else I would consider using regularization, which is this topic that we will talk about later. So, that's it for the normal equation and what it means if the matrix Xtranspose X is non-invertible. But this is a problem that hopefully you run into pretty rarely. And if you just implement it in Octave using the pinv() function which is called the pseudo-inverse function so you use a different linear algebra library, that is called pseudo-inverse but that implementation should just do the right thing even if Xtranspose X is non-invertible which should happen pretty rarily anyway so this should not be a problem for most implementations of linear regression.