Hi, everyone. Today we will discuss a new method for visualizing data and generating features. By the end of this video, you will be able to use tSNE in your projects.

In the previous video, we learned about matrix factorization techniques, which are pretty close to linear models. In this video, we will touch on non-linear methods of dimensionality reduction, which in general are called manifold learning. For example, look at the data in the form of the letter S on the left side of the slide. On the right, we can see the results of running different manifold learning algorithms on this data; the tSNE result is placed in the bottom right corner. tSNE is the main topic of this lecture, but the details of how it really works won't be explained here; you are welcome to look at the additional materials for those. Let's just say that this is a method that tries to project points from a high-dimensional space into a low-dimensional space so that the distances between points are approximately preserved.

Let's look at an example of tSNE on the MNIST dataset. Here, points from a 784-dimensional space are projected into a two-dimensional space. You can see that such a projection forms distinct clusters. The colors show that these clusters are meaningful and correspond well to the target digits. Moreover, neighboring clusters correspond to visually similar digits: for example, the cluster of threes is located next to the cluster of fives, which in turn is adjacent to the clusters of sixes and eights. If the data has an explicit structure, as in the case of the MNIST dataset, it is likely to be reflected on a tSNE plot. For this reason, tSNE is widely used in exploratory data analysis.

However, do not assume that tSNE is a magic wand that always helps. For example, an unfortunate choice of hyperparameters may lead to poor results. Consider an example: in the center of the slide is a tSNE projection of exactly the same MNIST data as in the previous example; only the perplexity hyperparameter has been changed. On the left, for comparison, we have the plot from the previous slide, and on the right is a tSNE projection of random data. We can see that the choice of hyperparameters changed the projection of the MNIST data significantly, so that we can no longer see the clusters. Moreover, the new projection has become more similar to the random data than to the original one.

Let's see how the result depends on the value of the perplexity hyperparameter. On the left, we have perplexity=3; in the center, perplexity=10; and on the right, perplexity=150. I want to emphasize that these projections are all made for the same data. The illustration shows that tSNE results strongly depend on its hyperparameters, and the interpretation of the results is not a simple task. In particular, one cannot infer the sizes of the original clusters from the sizes of the projected clusters. A similar statement holds for the distances between clusters. The blog distill.pub contains a post about how to understand and interpret the results of tSNE. It also contains a great interactive demo that will help you get into the issues of how tSNE works. I strongly advise you to take a look at it.

In addition to exploratory data analysis, tSNE can be considered as a method to obtain new features from data. You just concatenate the transformed coordinates to the original feature matrix. Now, a few words about practical details. As has been shown earlier, the result of the tSNE algorithm strongly depends on its hyperparameters, so it is good practice to compute several projections with different perplexities.
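To make this concrete, here is a minimal sketch of such a perplexity sweep using sklearn's TSNE; the data and the perplexity values are just illustrative placeholders, not a prescription.

    import numpy as np
    from sklearn.manifold import TSNE

    # Placeholder data standing in for your feature matrix.
    rng = np.random.RandomState(0)
    X = rng.normal(size=(500, 50))

    # Compute a 2-D projection for several perplexities; there is no single
    # "correct" value, so inspect and compare all of the resulting plots.
    projections = {}
    for perplexity in [3, 10, 30, 150]:
        tsne = TSNE(n_components=2, perplexity=perplexity, random_state=0)
        projections[perplexity] = tsne.fit_transform(X)

Each entry of projections can then be drawn as a scatter plot, colored by the target, to judge which perplexity reveals the structure best.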
In addition, because of the stochastic nature of this method, you will get different projections even with the same data and hyperparameters. This means that the train and test sets should be projected together rather than separately. Also, tSNE will run for a long time if you have a lot of features: if the number of features is greater than 500, you should use one of the dimensionality reduction approaches to reduce the number of features, for example, to 100. An implementation of tSNE can be found in the sklearn library, but personally, I prefer to use another implementation from a separate Python package called tsne, since it provides a way more efficient implementation.

In conclusion, I want to remind you of the basic points of this lecture. tSNE is an excellent tool for visualizing data. If the data has an explicit structure, it will likely be reflected on the tSNE projection. However, you have to be cautious with the interpretation of tSNE results: sometimes you can see structure where it does not exist or, vice versa, see none where structure is actually present. It is good practice to make several tSNE projections with different perplexities. And in addition to EDA, tSNE works very well as a source of features for feeding into models. Thank you for your attention.
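As a supplement to the practical notes above, here is a minimal sketch of the full pipeline: project train and test together, reduce dimensionality first when there are many features (PCA is used here as one possible option), and concatenate the tSNE coordinates to the original feature matrix. The matrices X_train and X_test are placeholders for your data.

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.manifold import TSNE

    # Placeholder data; in practice these are your train and test feature matrices.
    rng = np.random.RandomState(0)
    X_train = rng.normal(size=(300, 600))
    X_test = rng.normal(size=(100, 600))

    # Project train and test together: separate tSNE runs would give
    # inconsistent embeddings because of the method's stochasticity.
    X_all = np.vstack([X_train, X_test])

    # With more than ~500 features, reduce dimensionality first.
    if X_all.shape[1] > 500:
        X_all = PCA(n_components=100, random_state=0).fit_transform(X_all)

    # Compute 2-D tSNE coordinates (in practice, try several perplexities).
    coords = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X_all)

    # Split the coordinates back and append them to the original features.
    train_coords, test_coords = coords[:len(X_train)], coords[len(X_train):]
    X_train_new = np.hstack([X_train, train_coords])
    X_test_new = np.hstack([X_test, test_coords])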