We've seen a number of techniques, some in previous lectures and some in this lecture. We've seen the Naive Bayes classifier in fair detail. We've seen probabilistic graphical models such as Bayesian networks. We've seen linear regression in detail, and this time we also heard about logistic regression, neural networks, and support vector machines, at least in terms of what they are. Now let's look at the problem and see which techniques one would need to consider depending on its nature. We classify the problem in terms of the kind of features it has, whether they are numerical or categorical, that is, numbers or classes, and in terms of the target variable, which is what we are trying to predict. If we are predicting a numerical value, it is a prediction problem; if we are predicting a class, it is a classification problem, which is part of learning theory. Techniques can be used somewhat interchangeably across these two types of prediction, although, depending on the kinds of features, some techniques are more applicable than others.

In the most straightforward case, where we have numerical features, a numerical target, and a correlation that is stable and fairly linear, we would use linear regression. Even in situations like this, one would still prefer linear regression to some complicated non-linear function, because high-order functions, say squares, cubes, sines, and cosines, tend to overfit the data and will not generalize to situations that may arise in the future. You might get a great fit to the training data, but it doesn't work in practice. So linear regression is preferred unless you have a real reason not to use linear techniques.

Similarly, even if your features are categorical and your target is numerical, you can still use linear regression, but you have to code the features. If a feature takes, say, eight different values, you replace it with eight binary variables, each taking zero or one depending on which category value the feature took. Binary coding is better than numerical coding: if red is coded as five, blue as six, and green as seven, there is no reason to believe that red and blue are closer to each other than red and green, so using five, six, and seven is misleading and can make the regression technique go haywire. Using three different zero/one features to indicate whether something is red, blue, or green is better than using numbers; a small sketch of this coding follows below.

When we have categorical features and a numerical target, neural networks can also be used, just as they can be used for ordinary linear regression. But they have somewhat waned in popularity, except for certain situations which we will talk about in the next section.

Now let's come to the case where we have unstable or severely non-linear situations, which might look something like the parabolic curve we have seen before. There is no way one can fit a straight line to such a curve, so it is better to use a neural network, which has non-linear elements, multiple levels, and hidden layers, allowing a more complicated function to be learned. At the same time, one is not pre-supposing what the function is going to be; that would be counter-productive, because one would be pre-supposing the nature of f rather than letting a neural network, with its many different possibilities, discover it.
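To make the coding trick concrete, here is a minimal sketch in Python. The lecture does not name a library; scikit-learn, the color data, and every variable name below are illustrative assumptions of mine.

```python
# Minimal sketch: binary (one-hot) coding of a categorical feature
# before linear regression. All data and names are illustrative.
import numpy as np
from sklearn.linear_model import LinearRegression

colors = ["red", "blue", "green", "red", "green"]  # categorical feature
y = np.array([1.0, 2.5, 3.2, 1.1, 3.0])            # numerical target

# One 0/1 column per category, so "red" vs "blue" vs "green" carries
# no false ordering the way codes 5, 6, 7 would.
categories = sorted(set(colors))                    # ['blue', 'green', 'red']
X = np.array([[1.0 if c == cat else 0.0 for cat in categories]
              for c in colors])

model = LinearRegression().fit(X, y)
print(dict(zip(categories, model.coef_)))           # one weight per category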
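The lecture's parabolic curve itself is not reproduced here, but the point can be sketched as follows; using scikit-learn's MLPRegressor as a stand-in for "a neural network with hidden layers" is my assumption, not something the lecture prescribes.

```python
# Sketch: a straight line cannot fit y = x^2, but a small network with
# one hidden layer can. The library and all settings are illustrative.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = X.ravel() ** 2 + rng.normal(0, 0.1, size=200)   # parabolic target

linear = LinearRegression().fit(X, y)
net = MLPRegressor(hidden_layer_sizes=(20,), max_iter=5000,
                   random_state=0).fit(X, y)

print("linear R^2:", linear.score(X, y))   # poor: the data are not linear
print("net R^2:   ", net.score(X, y))      # much better fit
```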
Next, we come to the classification situations, where the target variable is categorical. When we have categorical features and categorical targets, we have seen how to use Naive Bayes and other probabilistic graphical models. These days, SVMs, or support vector machines, are also very popular for classification. For categorical variables one does have to do feature coding to a certain extent: we need the same trick that we used for categorical features in linear regression, because an SVM essentially requires numerical inputs. Of course, if you have numerical inputs and a categorical target, SVMs are perfect; they are designed especially for situations where you have unstable and severely non-linear correlations, and that is what they do very well. On the other hand, if you have fairly stable, linear correlations and a classification problem, then rather than using linear regression, as we have seen, one should use logistic regression, where one is bumping up or bumping down the distance from the separating line using the logistic function. A small sketch of both classifiers follows at the end of this section.

So, take a look at this table. It will definitely guide you in the problem set, that is, the programming assignment on prediction, but in general it is also something you should learn from. We have not covered many techniques yet; we have taken up only a very few. Further, we have only talked about classification and prediction; optimization and control we have not talked about, and we won't have time to get into those in this course.
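As a final illustration of the classification side, here is a minimal sketch: logistic regression for a fairly linear boundary, and an SVM for a severely non-linear one. The data, the RBF kernel choice, and all names are illustrative assumptions, not from the lecture.

```python
# Sketch: logistic regression when the class boundary is roughly linear;
# an SVM (kernelized) when it is severely non-linear. Data are synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(0, 1, size=(300, 2))

# Roughly linear case: the class depends on a linear score of the features.
y_linear = (X[:, 0] + X[:, 1] > 0).astype(int)
logit = LogisticRegression().fit(X, y_linear)
print("logistic accuracy:", logit.score(X, y_linear))

# Severely non-linear case: the class depends on distance from the origin,
# which no single separating line can capture.
y_ring = (X[:, 0] ** 2 + X[:, 1] ** 2 > 1).astype(int)
svm = SVC(kernel="rbf").fit(X, y_ring)
print("SVM accuracy:     ", svm.score(X, y_ring))
```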