1 00:00:03,320 --> 00:00:05,850 There's been a lot of hype about neural networks. 2 00:00:05,850 --> 00:00:10,170 And perhaps some of that hype is justified, given how well they're working. 3 00:00:10,170 --> 00:00:11,220 But it turns out that so 4 00:00:11,220 --> 00:00:15,710 far, almost all the economic value created by neural networks has been through 5 00:00:15,710 --> 00:00:18,970 one type of machine learning, called supervised learning. 6 00:00:18,970 --> 00:00:22,120 Let's see what that means, and let's go over some examples. 7 00:00:22,120 --> 00:00:26,030 In supervised learning, you have some input x, and 8 00:00:26,030 --> 00:00:30,210 you want to learn a function mapping to some output y. 9 00:00:30,210 --> 00:00:34,890 So for example, just now we saw the housing price prediction application where 10 00:00:34,890 --> 00:00:40,850 you input some features of a home and try to output or estimate the price y. 11 00:00:40,850 --> 00:00:45,180 Here are some other examples that neural networks have been applied to very 12 00:00:45,180 --> 00:00:46,940 effectively. 13 00:00:46,940 --> 00:00:51,180 Possibly the single most lucrative application of deep learning today is 14 00:00:51,180 --> 00:00:56,150 online advertising, maybe not the most inspiring, but certainly very lucrative, 15 00:00:56,150 --> 00:01:02,770 in which, by inputting information about an ad to the website it's thinking 16 00:01:02,770 --> 00:01:07,020 of showing you, and some information about the user, neural networks have 17 00:01:07,020 --> 00:01:10,700 gotten very good at predicting whether or not you click on an ad. 18 00:01:10,700 --> 00:01:11,770 And by showing you and 19 00:01:11,770 --> 00:01:15,800 showing users the ads that you are most likely to click on, this has been 20 00:01:15,800 --> 00:01:20,830 an incredibly lucrative application of neural networks at multiple companies. 
21 00:01:20,830 --> 00:01:24,040 Because the ability to show you ads that you're more likely to 22 00:01:24,040 --> 00:01:26,690 click on has a direct impact on the bottom 23 00:01:26,690 --> 00:01:29,200 line of some of the very large online advertising companies. 24 00:01:30,630 --> 00:01:35,150 Computer vision has also made huge strides in the last several years, 25 00:01:35,150 --> 00:01:37,050 mostly due to deep learning. 26 00:01:37,050 --> 00:01:41,140 So you might input an image and want to output an index, 27 00:01:41,140 --> 00:01:45,290 say from 1 to 1,000, trying to tell you if this picture, 28 00:01:45,290 --> 00:01:47,300 it might be any one of, say, 1,000 different images. 29 00:01:47,300 --> 00:01:50,500 So, you might use that for photo tagging. 30 00:01:50,500 --> 00:01:54,520 I think the recent progress in speech recognition has also been very exciting, 31 00:01:54,520 --> 00:01:57,910 where you can now input an audio clip to a neural network, and 32 00:01:57,910 --> 00:02:00,930 have it output a text transcript. 33 00:02:00,930 --> 00:02:05,400 Machine translation has also made huge strides thanks to deep learning, where now 34 00:02:05,400 --> 00:02:09,400 you can have a neural network input an English sentence and directly output, say, 35 00:02:09,400 --> 00:02:11,010 a Chinese sentence. 36 00:02:11,010 --> 00:02:15,930 And in autonomous driving, you might input an image, say a picture of what's in 37 00:02:15,930 --> 00:02:20,600 front of your car as well as some information from a radar, and 38 00:02:20,600 --> 00:02:25,080 based on that, maybe a neural network can be trained to tell you the position 39 00:02:25,080 --> 00:02:26,100 of the other cars on the road. 40 00:02:26,100 --> 00:02:30,870 So this becomes a key component in autonomous driving systems. 
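Every example so far has the same shape: an input x, an output y, and a function learned from example pairs. As a rough illustration of that idea only, the sketch below fits a straight line to a made-up housing data set. It uses plain least squares rather than a neural network, and the numbers and the names `fit_line` and `predict` are assumptions for illustration, not something from the lecture.

```python
# Hypothetical training set: home size in square feet (x), price in $1000s (y).
sizes = [1000.0, 1500.0, 2000.0, 2500.0]
prices = [200.0, 300.0, 400.0, 500.0]


def fit_line(xs, ys):
    """Least-squares fit of y = w * x + b for a single input feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is covariance(x, y) divided by variance(x).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    w = cov / var
    b = mean_y - w * mean_x
    return w, b


def predict(x, w, b):
    """Apply the learned mapping from input x to estimated output y."""
    return w * x + b


w, b = fit_line(sizes, prices)
estimate = predict(1800.0, w, b)  # estimated price for an unseen home size
```

A neural network plays the same role as `fit_line` and `predict` here, only with a far more flexible function between x and y.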
41 00:02:30,870 --> 00:02:35,730 So a lot of the value creation through neural networks has been through cleverly 42 00:02:35,730 --> 00:02:39,360 selecting what should be x and what should be y for 43 00:02:39,360 --> 00:02:45,000 your particular problem, and then fitting this supervised learning component into 44 00:02:45,000 --> 00:02:48,660 often a bigger system such as an autonomous vehicle. 45 00:02:48,660 --> 00:02:52,880 It turns out that slightly different types of neural networks are useful for 46 00:02:52,880 --> 00:02:54,960 different applications. 47 00:02:54,960 --> 00:03:00,100 For example, in the real estate application that we saw in the previous 48 00:03:00,100 --> 00:03:04,520 video, we use a universally standard neural network architecture, right? 49 00:03:04,520 --> 00:03:08,510 Maybe for real estate and online advertising, it might be a relatively 50 00:03:08,510 --> 00:03:11,620 standard neural network, like the one that we saw. 51 00:03:13,410 --> 00:03:19,120 For image applications we'll often use convolutional neural networks, 52 00:03:19,120 --> 00:03:20,680 often abbreviated CNN. 53 00:03:21,730 --> 00:03:24,000 And for sequence data. 54 00:03:24,000 --> 00:03:27,840 So for example, audio has a temporal component, right? 55 00:03:27,840 --> 00:03:32,990 Audio is played out over time, so audio is most naturally represented 56 00:03:32,990 --> 00:03:38,110 as a one-dimensional time series or as a one-dimensional temporal sequence. 57 00:03:38,110 --> 00:03:42,420 And so for sequence data, you often use an RNN, 58 00:03:42,420 --> 00:03:45,810 a recurrent neural network. 59 00:03:45,810 --> 00:03:50,270 Language, English and Chinese, the alphabets or the words come one at a time. 60 00:03:50,270 --> 00:03:54,820 So language is also most naturally represented as sequence data. 61 00:03:54,820 --> 00:04:00,700 And so more complex versions of RNNs are often used for these applications. 
62 00:04:00,700 --> 00:04:04,360 And then, for more complex applications, like autonomous driving, where you have 63 00:04:04,360 --> 00:04:09,200 an image, that might suggest more of a CNN, convolutional neural network, structure and 64 00:04:09,200 --> 00:04:12,480 radar info which is something quite different. 65 00:04:12,480 --> 00:04:15,360 You might end up with a more custom, or 66 00:04:15,360 --> 00:04:19,880 some more complex, hybrid neural network architecture. 67 00:04:20,880 --> 00:04:26,100 So, just to be a bit more concrete about what are the standard CNN and 68 00:04:26,100 --> 00:04:27,950 RNN architectures. 69 00:04:27,950 --> 00:04:32,790 So in the literature you might have seen pictures like this. 70 00:04:32,790 --> 00:04:34,740 So that's a standard neural net. 71 00:04:34,740 --> 00:04:36,800 You might have seen pictures like this. 72 00:04:36,800 --> 00:04:41,830 Well this is an example of a Convolutional Neural Network, and we'll see in 73 00:04:41,830 --> 00:04:45,950 a later course exactly what this picture means and how you can implement this. 74 00:04:45,950 --> 00:04:51,560 But convolutional networks are often used for image data. 75 00:04:51,560 --> 00:04:54,100 And you might also have seen pictures like this. 76 00:04:54,100 --> 00:04:57,590 And you'll learn how to implement this in a later course. 77 00:04:57,590 --> 00:05:00,180 Recurrent neural networks are very good for 78 00:05:00,180 --> 00:05:06,220 this type of one-dimensional sequence data that has maybe a temporal component. 79 00:05:06,220 --> 00:05:10,310 You might also have heard about applications of machine learning 80 00:05:10,310 --> 00:05:14,000 to both Structured Data and Unstructured Data. 81 00:05:14,000 --> 00:05:14,960 Here's what the terms mean. 82 00:05:14,960 --> 00:05:18,620 Structured Data means basically databases of data. 
83 00:05:19,910 --> 00:05:25,010 So, for example, in housing price prediction, you might have a database with 84 00:05:25,010 --> 00:05:28,140 columns that tell you the size and the number of bedrooms. 85 00:05:28,140 --> 00:05:33,460 So, this is structured data, or in predicting whether or not a user will 86 00:05:33,460 --> 00:05:37,330 click on an ad, you might have information about the user, such as the age, 87 00:05:37,330 --> 00:05:41,590 some information about the ad, and then labels y that you're trying to predict. 88 00:05:41,590 --> 00:05:46,470 So that's structured data, meaning that each of the features, 89 00:05:46,470 --> 00:05:49,740 such as size of the house, the number of bedrooms, or 90 00:05:49,740 --> 00:05:54,530 the age of a user, has a very well defined meaning. 91 00:05:54,530 --> 00:06:00,520 In contrast, unstructured data refers to things like audio, raw audio, 92 00:06:00,520 --> 00:06:05,790 or images where you might want to recognize what's in the image or text. 93 00:06:05,790 --> 00:06:09,230 Here the features might be the pixel values in an image or 94 00:06:09,230 --> 00:06:12,190 the individual words in a piece of text. 95 00:06:12,190 --> 00:06:14,330 Historically, it has been much harder for 96 00:06:14,330 --> 00:06:19,480 computers to make sense of unstructured data compared to structured data. 97 00:06:19,480 --> 00:06:24,270 And in fact, the human race has evolved to be very good at understanding 98 00:06:24,270 --> 00:06:26,270 audio cues as well as images. 99 00:06:26,270 --> 00:06:28,390 And then text was a more recent invention, but 100 00:06:28,390 --> 00:06:31,760 people are just really good at interpreting unstructured data. 
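To make the structured versus unstructured distinction a little more concrete, here is a small sketch of what the features might look like in each case. All values and variable names are hypothetical, and real systems would normally use arrays or tensors rather than plain Python lists.

```python
# Structured data: each feature is a named column with a well-defined meaning.
house_row = {"size_sqft": 2104, "bedrooms": 3}  # hypothetical database row
structured_features = [house_row["size_sqft"], house_row["bedrooms"]]

# Unstructured data (image): the features are raw pixel values.
tiny_image = [[0, 255], [128, 64]]  # hypothetical 2x2 grayscale image
pixel_features = [pixel for row in tiny_image for pixel in row]

# Unstructured data (text): the features are the individual words.
sentence = "the cat sat on the mat"
word_features = sentence.split()
```

In the structured case, position 0 always means size and position 1 always means bedrooms. In the unstructured cases, a single pixel or word carries little meaning on its own, which is part of why this kind of data has been harder for computers.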
101 00:06:31,760 --> 00:06:36,800 And so one of the most exciting things about the rise of neural networks is that, 102 00:06:36,800 --> 00:06:41,280 thanks to deep learning, thanks to neural networks, computers are now much better 103 00:06:41,280 --> 00:06:46,320 at interpreting unstructured data as well compared to just a few years ago. 104 00:06:46,320 --> 00:06:51,240 And this creates opportunities for many new exciting applications that use 105 00:06:51,240 --> 00:06:55,220 speech recognition, image recognition, natural language processing on text, 106 00:06:56,230 --> 00:07:00,180 much more than was possible even just two or three years ago. 107 00:07:00,180 --> 00:07:03,940 I think because people have a natural empathy to understanding unstructured 108 00:07:03,940 --> 00:07:08,250 data, you might hear about neural network successes on unstructured data 109 00:07:08,250 --> 00:07:13,060 more in the media because it's just cool when the neural network recognizes a cat. 110 00:07:13,060 --> 00:07:15,750 We all like that, and we all know what that means. 111 00:07:15,750 --> 00:07:19,290 But it turns out that a lot of the short-term economic value that neural 112 00:07:19,290 --> 00:07:24,270 networks are creating has also been on structured data, 113 00:07:24,270 --> 00:07:28,690 such as much better advertising systems, much better product recommendations, and 114 00:07:28,690 --> 00:07:33,730 just a much better ability to process the giant databases that 115 00:07:33,730 --> 00:07:37,290 many companies have to make accurate predictions from them. 116 00:07:37,290 --> 00:07:41,230 So in this course, a lot of the techniques we'll go over will apply 117 00:07:41,230 --> 00:07:44,690 to both structured data and to unstructured data. 118 00:07:44,690 --> 00:07:46,970 For the purposes of explaining the algorithms, 119 00:07:46,970 --> 00:07:52,210 we will draw a little bit more on examples that use unstructured data. 
120 00:07:52,210 --> 00:07:56,280 But as you think through applications of neural networks within your own team, I 121 00:07:56,280 --> 00:08:01,360 hope you find uses for them in both structured and unstructured data. 122 00:08:02,590 --> 00:08:06,390 So neural networks have transformed supervised learning and 123 00:08:06,390 --> 00:08:09,500 are creating tremendous economic value. 124 00:08:09,500 --> 00:08:12,910 It turns out though, that the basic technical ideas behind neural networks 125 00:08:12,910 --> 00:08:16,520 have mostly been around, sometimes for many decades. 126 00:08:16,520 --> 00:08:20,980 So why is it, then, that they're only just now taking off and working so well? 127 00:08:20,980 --> 00:08:24,970 In the next video, we'll talk about why it's only quite recently 128 00:08:24,970 --> 00:08:28,940 that neural networks have become this incredibly powerful tool that you can use.