1 00:00:02,930 --> 00:00:06,605 Hi everyone. I am excited to see you on board, 2 00:00:06,605 --> 00:00:08,510 and welcome to our course. 3 00:00:08,510 --> 00:00:13,070 I want to start our lesson with an informal discussion of who we are, 4 00:00:13,070 --> 00:00:15,225 and who this course is for. 5 00:00:15,225 --> 00:00:20,025 Then, we will have a brief introduction to the area of Natural Language Processing. 6 00:00:20,025 --> 00:00:25,100 You know, it might feel a little hand-wavy, as any introduction does, actually, 7 00:00:25,100 --> 00:00:27,515 but I hope that after our course, 8 00:00:27,515 --> 00:00:32,585 you will know exactly everything that is mentioned in our lesson now. 9 00:00:32,585 --> 00:00:35,555 So, ready? Let us get started. 10 00:00:35,555 --> 00:00:40,555 My name is Anna, and we have a big, nice team creating the course for you. 11 00:00:40,555 --> 00:00:43,120 So, we have Sergey, Alexey, Andrey, 12 00:00:43,120 --> 00:00:46,535 and one more Anna preparing the materials. 13 00:00:46,535 --> 00:00:50,795 I have a background in computer science and machine learning, 14 00:00:50,795 --> 00:00:56,410 and I'm now applying this background in natural language processing in different ways. 15 00:00:56,410 --> 00:01:00,160 And you know, these different activities, like research, teaching, 16 00:01:00,160 --> 00:01:04,700 and industry, give you different perspectives on the same area. 17 00:01:04,700 --> 00:01:09,230 So, for example, when you come to industry, very soon 18 00:01:09,230 --> 00:01:15,460 you realize that not every new paper from academia is useful in your particular settings, 19 00:01:15,460 --> 00:01:22,430 like large-scale implementation, or some noisy data, or specific needs of your business. 20 00:01:22,430 --> 00:01:24,625 So, probably, you need to build 21 00:01:24,625 --> 00:01:30,410 some simpler solution, but one that would work nicely in your specific settings. 22 00:01:30,410 --> 00:01:34,525 Okay. Now, who is this course for?
23 00:01:34,525 --> 00:01:40,340 When I was thinking about what would be one word to characterize our audience, 24 00:01:40,340 --> 00:01:43,435 I thought that it would be the word "curious". 25 00:01:43,435 --> 00:01:50,155 So, this course is for curious people who want to know what is inside some applications. 26 00:01:50,155 --> 00:01:53,950 For example, you have definitely used machine translation. 27 00:01:53,950 --> 00:01:56,725 Do you know how it works? Or 28 00:01:56,725 --> 00:02:01,295 dialogue agents that are so popular nowadays? What is inside there? 29 00:02:01,295 --> 00:02:06,860 And you know, this popularity of certain applications is good and bad. 30 00:02:06,860 --> 00:02:09,710 So, for example, for dialogue agents, 31 00:02:09,710 --> 00:02:13,495 we have so much hype around them that it is not 32 00:02:13,495 --> 00:02:17,965 that easy to distinguish what is just some beautiful words, 33 00:02:17,965 --> 00:02:21,930 and what is something that will really work in practice. 34 00:02:21,930 --> 00:02:27,215 So, hopefully, one outcome of our course for you would be 35 00:02:27,215 --> 00:02:32,920 the ability to distinguish between the hype and something that really works. 36 00:02:32,920 --> 00:02:36,775 Now, our course is rather in-depth. 37 00:02:36,775 --> 00:02:41,745 So, I want to go through several methods in NLP in some detail, 38 00:02:41,745 --> 00:02:47,630 because this will give you the ability to distinguish the hype from the methods. 39 00:02:47,630 --> 00:02:50,210 Okay? Also, we will cover 40 00:02:50,210 --> 00:02:54,550 real state-of-the-art approaches both in research and production. 41 00:02:54,550 --> 00:02:56,505 And as I have already said, 42 00:02:56,505 --> 00:02:59,365 these could be rather different approaches. 43 00:02:59,365 --> 00:03:03,410 Now, another goal, which somewhat contradicts 44 00:03:03,410 --> 00:03:08,250 going in depth, would be to have a big picture of the area.
45 00:03:08,250 --> 00:03:13,125 So, I feel like it is really important to have some expertise, like: 46 00:03:13,125 --> 00:03:16,745 I am given a task, what should I do with it? 47 00:03:16,745 --> 00:03:20,290 What approaches would work in this particular case? 48 00:03:20,290 --> 00:03:22,285 To have this intuition, 49 00:03:22,285 --> 00:03:27,310 we will try to discuss as many different settings and tasks as possible, 50 00:03:27,310 --> 00:03:30,005 and cover some approaches for them. 51 00:03:30,005 --> 00:03:36,365 And obviously, we should not only talk, and you should not only listen and read about it, 52 00:03:36,365 --> 00:03:40,710 but you have to do some practice to get hands-on experience. 53 00:03:40,710 --> 00:03:45,000 So, we are preparing materials for you for home assignments in 54 00:03:45,000 --> 00:03:50,795 Python for some popular NLP tasks like text classification, 55 00:03:50,795 --> 00:03:54,445 duplicate detection, named entity recognition, 56 00:03:54,445 --> 00:03:59,355 and some others, so that you have some experience with your own hands. 57 00:03:59,355 --> 00:04:02,985 Also, these home tasks will help you to build 58 00:04:02,985 --> 00:04:08,030 the project of our course, which will be a conversational chatbot. 59 00:04:08,030 --> 00:04:15,525 So now, I feel like it is really important also to see what our course is not about, 60 00:04:15,525 --> 00:04:21,450 because NLP is so big that obviously our course cannot fit everyone's needs. 61 00:04:21,450 --> 00:04:25,550 So, I feel like if you only want to know 62 00:04:25,550 --> 00:04:31,085 some black-box implementations and stack them together to build some solution, 63 00:04:31,085 --> 00:04:34,210 then probably, this course is not for you. 64 00:04:34,210 --> 00:04:38,550 Also, I think that it is a good idea to take machine learning and 65 00:04:38,550 --> 00:04:44,075 deep learning courses first to feel at ease with some names and formulas.
66 00:04:44,075 --> 00:04:47,420 For example here, I have a quick test for you. 67 00:04:47,420 --> 00:04:50,450 Do you know what Recurrent Neural Networks are? 68 00:04:50,450 --> 00:04:53,910 Or have you heard about likelihood maximization? 69 00:04:53,910 --> 00:04:57,500 Just take a moment to see how comfortable you are with 70 00:04:57,500 --> 00:05:01,190 these words and see whether you need to take, 71 00:05:01,190 --> 00:05:03,070 for example, the deep learning course in 72 00:05:03,070 --> 00:05:07,470 our specialization first before going to this course. 73 00:05:07,470 --> 00:05:11,420 Also, we expect that you have some experience with Python. 74 00:05:11,420 --> 00:05:14,910 Probably, you don't have any experience with TensorFlow, 75 00:05:14,910 --> 00:05:16,600 and maybe this is okay, 76 00:05:16,600 --> 00:05:22,495 and then this is a good moment for you to try to go through some tutorials, 77 00:05:22,495 --> 00:05:25,965 and this course could be a good reason to go through them. 78 00:05:25,965 --> 00:05:28,975 Actually, TensorFlow has really nice tutorials, 79 00:05:28,975 --> 00:05:31,940 so I think that it shouldn't be a problem for you. 80 00:05:31,940 --> 00:05:34,385 I hope you are still not frightened. 81 00:05:34,385 --> 00:05:38,700 And I hope you are ready for our journey into NLP. 82 00:05:38,700 --> 00:05:44,690 And I want to start this journey with a survey of the main approaches.