1 00:00:00,000 --> 00:00:04,544 [MUSIC] 2 00:00:04,544 --> 00:00:07,212 Now that we've discussed what this course is going to be about, 3 00:00:07,212 --> 00:00:10,498 let's spend some time discussing what we're going to assume your background 4 00:00:10,498 --> 00:00:13,470 knowledge is to successfully complete this course. 5 00:00:13,470 --> 00:00:16,537 Well, there are a number of concepts we're going to assume that you've seen in 6 00:00:16,537 --> 00:00:18,435 the previous courses in this specialization. 7 00:00:18,435 --> 00:00:22,575 So, for example, in the foundations course, we went through a lot of the high 8 00:00:22,575 --> 00:00:26,784 level concepts that we're going to present in this specific course about clustering and 9 00:00:26,784 --> 00:00:27,500 retrieval. 10 00:00:27,500 --> 00:00:31,095 But also, about how we think about machine learning in terms of what are inputs and 11 00:00:31,095 --> 00:00:33,192 what are outputs in machine learning methods? 12 00:00:33,192 --> 00:00:37,394 And how do we think about analyzing these outputs? 13 00:00:37,394 --> 00:00:40,043 And we also built up a lot of programming and 14 00:00:40,043 --> 00:00:44,340 data manipulation skills in this foundations course. 15 00:00:44,340 --> 00:00:48,158 And then in the regression course we went through a lot of detail on how we think 16 00:00:48,158 --> 00:00:51,181 about features representing the inputs to our algorithms. 17 00:00:51,181 --> 00:00:54,785 And what the outputs are and how to examine those outputs and 18 00:00:54,785 --> 00:00:57,607 assess different things about performance. 19 00:00:57,607 --> 00:01:01,111 And we talked about basic statistical concepts like mean and 20 00:01:01,111 --> 00:01:04,363 variance that we're going to see many times in this course. 21 00:01:04,363 --> 00:01:07,296 And also, we covered basic machine learning concepts, 22 00:01:07,296 --> 00:01:09,689 like, what is a machine learning algorithm?
23 00:01:09,689 --> 00:01:13,380 And then we went through specific optimization algorithms that we're 24 00:01:13,380 --> 00:01:16,341 going to allude to in this course, like coordinate ascent. 25 00:01:16,341 --> 00:01:21,272 And we talked about issues in terms of model complexity, like overfitting and 26 00:01:21,272 --> 00:01:24,122 how to cope with that using regularization. 27 00:01:24,122 --> 00:01:27,600 And these are all ideas that we're going to refer to throughout this course. 28 00:01:27,600 --> 00:01:30,900 And we're going to assume that you already know to some degree. 29 00:01:30,900 --> 00:01:34,772 Then, coming out of the classification course, there are a number of things that 30 00:01:34,772 --> 00:01:37,802 are really crucial to the content that we're going to cover here, 31 00:01:37,802 --> 00:01:40,554 like distributions and conditional distributions, and 32 00:01:40,554 --> 00:01:42,203 maximum likelihood estimation. 33 00:01:42,203 --> 00:01:46,089 And then there are a couple things that I'm going to refer to just to draw 34 00:01:46,089 --> 00:01:48,436 analogies, but aren't that critical, 35 00:01:48,436 --> 00:01:53,080 like ideas of linear classifiers, multiclass classification, and boosting. 36 00:01:54,140 --> 00:01:57,593 So if you haven't taken the past courses and some of these concepts, 37 00:01:57,593 --> 00:01:59,557 maybe you only have a fuzzy idea about, 38 00:01:59,557 --> 00:02:02,731 I'd strongly encourage you to go back to these other courses. 39 00:02:02,731 --> 00:02:05,260 And at least watch the videos related to these topics. 40 00:02:06,530 --> 00:02:11,019 Then, in terms of math background, there's not going to be much of an emphasis on 41 00:02:11,019 --> 00:02:13,769 calculus like there was in the past two courses.
42 00:02:13,769 --> 00:02:18,098 But we're still going to assume that you know some basic linear algebra, like, 43 00:02:18,098 --> 00:02:23,100 what is a vector, what is a matrix, how do you think about multiplying matrices? 44 00:02:23,100 --> 00:02:26,490 And I think that's basically it for the linear algebra part. 45 00:02:26,490 --> 00:02:30,000 But in this course we're also going to turn to some probabilistic concepts 46 00:02:30,000 --> 00:02:32,930 building off of things that you saw in the classification course. 47 00:02:32,930 --> 00:02:36,410 So we're going to assume that you know fundamental laws of probability, 48 00:02:36,410 --> 00:02:37,866 like probabilities sum to one. 49 00:02:37,866 --> 00:02:40,719 Or, they're bounded between zero and one. 50 00:02:40,719 --> 00:02:44,103 We're going to assume that you know, what is a distribution, and 51 00:02:44,103 --> 00:02:46,872 how do you think about a conditional distribution? 52 00:02:46,872 --> 00:02:49,313 And we're going to walk through these concepts and 53 00:02:49,313 --> 00:02:52,897 teach them to you at the level that you need to know them for this course. 54 00:02:52,897 --> 00:02:55,950 But it's definitely going to be helpful if you have some of that background 55 00:02:55,950 --> 00:02:56,460 coming in. 56 00:02:58,010 --> 00:03:01,020 And in terms of programming experience, we've tried to make this course as 57 00:03:01,020 --> 00:03:03,720 open as possible to people having different preferences for 58 00:03:03,720 --> 00:03:05,580 different programming languages. 59 00:03:05,580 --> 00:03:11,240 But we're going to encourage people to use Python although it's not required. 60 00:03:11,240 --> 00:03:12,050 For example, 61 00:03:12,050 --> 00:03:15,620 for all of the assignments we're going to provide starter code in Python. 62 00:03:15,620 --> 00:03:18,830 So of course, if you're a Python user, it's going to be easier for 63 00:03:18,830 --> 00:03:20,980 you to complete the assignments.
64 00:03:20,980 --> 00:03:23,950 But I want to emphasize that the point of everything we're teaching in 65 00:03:23,950 --> 00:03:28,180 this course is to teach you fundamental machine learning concepts and 66 00:03:28,180 --> 00:03:30,410 not specific implementation details. 67 00:03:30,410 --> 00:03:30,980 Though, of course, 68 00:03:30,980 --> 00:03:35,700 we want you to get hands-on experience with implementing the methods. 69 00:03:35,700 --> 00:03:38,410 But I want to say that if you completed the foundations course, and 70 00:03:38,410 --> 00:03:41,380 hopefully the regression and classification courses as well, you should 71 00:03:41,380 --> 00:03:46,540 be set for the skill level needed to complete the assignments in this course. 72 00:03:46,540 --> 00:03:50,090 And if we think back to the foundations course, we relied very heavily 73 00:03:50,090 --> 00:03:53,445 on pre-implemented algorithms like those available on GraphLab Create. 74 00:03:53,445 --> 00:03:56,962 Because the point of that course was just to understand the input and 75 00:03:56,962 --> 00:03:59,130 the output of machine learning algorithms. 76 00:03:59,130 --> 00:04:02,605 And we didn't get into the details of what's under the hood in each 77 00:04:02,605 --> 00:04:03,980 one of these algorithms. 78 00:04:03,980 --> 00:04:04,966 But in this course, 79 00:04:04,966 --> 00:04:08,843 we're going to take a deep dive into the algorithmic details, so much so 80 00:04:08,843 --> 00:04:13,720 that you should be able to implement them in any language of your choice. 81 00:04:13,720 --> 00:04:17,000 And you're going to get practical experience with doing this in this course. 82 00:04:17,000 --> 00:04:20,220 We're going to ask you to actually implement these methods, for 83 00:04:20,220 --> 00:04:22,870 everything except in the last module, 84 00:04:22,870 --> 00:04:27,690 where we go through the more advanced concept of latent Dirichlet allocation.
85 00:04:27,690 --> 00:04:31,500 Where actually we do teach you everything you need to know to implement the methods, 86 00:04:31,500 --> 00:04:35,030 but we're not actually going to ask you to do so in the assignment. 87 00:04:35,030 --> 00:04:38,040 So, to describe the assignments a little bit more, we're going to follow the same 88 00:04:38,040 --> 00:04:41,430 structure that we did in the regression and classification courses. 89 00:04:41,430 --> 00:04:44,840 Where in general, the structure is going to be that, first we're going to 90 00:04:44,840 --> 00:04:50,370 go through an exploration of the methods using a pre-implemented algorithm so 91 00:04:50,370 --> 00:04:53,220 that we can understand the methods in more detail. 92 00:04:53,220 --> 00:04:56,650 And solidify the concepts without getting bogged 93 00:04:56,650 --> 00:05:02,110 down in specific implementation details or potential issues with bugs in our code. 94 00:05:02,110 --> 00:05:05,660 But once we've solidified these concepts, then we're going to turn to 95 00:05:05,660 --> 00:05:07,830 actual implementations that you're all going to write. 96 00:05:09,120 --> 00:05:12,160 And finally, the computing needs are the same as in 97 00:05:12,160 --> 00:05:16,500 other courses of this specialization, where there are a couple of choices. 98 00:05:16,500 --> 00:05:18,490 One is that you have your own machine. 99 00:05:18,490 --> 00:05:22,620 And if you're going to use SFrames, which we encourage you to do to do 100 00:05:22,620 --> 00:05:27,700 the different data manipulations, then you need to have a 64-bit machine. 101 00:05:27,700 --> 00:05:31,778 Other than that, it can be a fairly basic desktop or laptop that you have.
102 00:05:31,778 --> 00:05:34,810 You're also, of course, going to need access to the Internet so 103 00:05:34,810 --> 00:05:38,900 that you can watch these lovely videos, as well as download the assignments and 104 00:05:38,900 --> 00:05:42,440 upload your implementations to the Coursera interface. 105 00:05:42,440 --> 00:05:45,250 And you're also going to need the ability to install and 106 00:05:45,250 --> 00:05:49,150 run Python and to store a few gigabytes of data. 107 00:05:49,150 --> 00:05:52,660 But, as an alternative, we're also going to provide some preconfigured 108 00:05:52,660 --> 00:05:57,550 machines in the cloud, so that if you don't have a 64-bit machine of your own, 109 00:05:57,550 --> 00:06:00,060 you can still complete this course. 110 00:06:00,060 --> 00:06:03,739 Okay, so now that we've gone through what this course is about and 111 00:06:03,739 --> 00:06:08,091 everything you're going to need to complete this course, let's get started. 112 00:06:08,091 --> 00:06:12,079 [MUSIC]