[MUSIC] Well, this all subsets algorithm might seem great, or at least pretty straightforward to implement, but the question is: what's the complexity of running all subsets? How many models did we have to evaluate? Well, clearly we evaluated all possible models, but let's quantify what that is. We looked at the model that was just noise. We looked at the model with just the first feature, the model with just the second feature, all the way up through every model with two features, and every possible combination up to the full model with all D features.

What we can do is index each one of the models we searched over by a feature vector. For feature one, feature two, all the way up to feature D, we enter a zero if that feature is not in the model and a one if it is. So it's just a binary vector indicating which features are present. In the case of just noise, with no features, we have zeros along the whole vector. In the case of the model where only the first feature is included, we have a one in that first location and zeros everywhere else. For consistency, let me index this as feature zero, feature one, all the way up to feature D. Now we go through this entire set of possible feature vectors. How many choices are there for the first entry? Two. How many for the second? Two. Two choices for every entry. And how many entries are there? With my new indexing, instead of D there are really D plus one; that's just a notational choice. So the number of models grows exponentially: two raised to the number of entries in the vector.

I did a little back-of-the-envelope calculation for a couple of choices of D. For example, if we had a total of eight different features we were looking over, we would have to search over 256 models. That actually might be okay. But if we had 30 features, all of a sudden we have to search over just over a billion different models. And if we have 1,000 features, which really is not that many in the applications we look at these days, all of a sudden we have to search over 1.07 times 10 to the 301 models. And for the example I gave with 100 billion features, I don't even know what that number is. I'm sure I could go and compute it, but I didn't bother, and it's clearly just huge. So what we can see is that, in most situations we're faced with these days, it's just computationally prohibitive to do this all subsets search. [MUSIC]
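To make that counting concrete, here's a minimal Python sketch (not from the lecture; the function name all_subsets_indicators and the use of itertools are illustrative assumptions) that enumerates the binary feature indicator vectors described above and reproduces the back-of-the-envelope model counts:

```python
from itertools import product

def all_subsets_indicators(num_features):
    """Yield every binary indicator vector over num_features features.

    Entry j is 1 if feature j is included in the model and 0 otherwise,
    so each vector picks out one candidate model; there are
    2 ** num_features of them in total.
    """
    return product((0, 1), repeat=num_features)

# Exhaustive enumeration is only feasible for tiny feature sets:
for indicator in all_subsets_indicators(3):
    print(indicator)  # (0, 0, 0), (0, 0, 1), ..., (1, 1, 1)

# Back-of-the-envelope model counts from the lecture:
for num_features in (8, 30, 1000):
    print(f"{num_features:>4} features -> {2 ** num_features:.3g} models")
    #    8 features -> 256 models
    #   30 features -> 1.07e+09 models (just over a billion)
    # 1000 features -> 1.07e+301 models
```

The doubling is the whole story here: every additional feature doubles the number of candidate models that would have to be fit and evaluated, which is why all subsets search becomes computationally prohibitive so quickly.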