1 00:00:01,040 --> 00:00:05,360 You've now learned about several highly effective neural network and 2 00:00:05,360 --> 00:00:07,750 ConvNet architectures. 3 00:00:07,750 --> 00:00:12,040 What I want to do in the next few videos is share with you some practical advice on 4 00:00:12,040 --> 00:00:17,056 how to use them, first starting with using open source implementations. 5 00:00:17,056 --> 00:00:21,870 It turns out that a lot of these neural networks are difficult or finicky 6 00:00:21,870 --> 00:00:26,660 to replicate because a lot of details about tuning of the hyperparameters such 7 00:00:26,660 --> 00:00:31,430 as learning decay and other things that make some difference to the performance. 8 00:00:31,430 --> 00:00:34,890 And so I've found that it's sometimes difficult even for, 9 00:00:34,890 --> 00:00:40,140 say, a higher deep loving PhD students, even at the top universities 10 00:00:40,140 --> 00:00:45,130 to replicate someone else's polished work just from reading their paper. 11 00:00:45,130 --> 00:00:47,860 Fortunately, a lot of deep learning researchers 12 00:00:47,860 --> 00:00:52,880 routinely open source their work on the Internet, such as on GitHub. 13 00:00:52,880 --> 00:00:55,680 And as you do work yourself, I certainly encourage you 14 00:00:55,680 --> 00:01:00,420 to consider contributing back your code to the open source community. 15 00:01:00,420 --> 00:01:04,930 But if you see a research paper whose results you would like to build on top of, 16 00:01:04,930 --> 00:01:06,500 one thing you should consider doing, 17 00:01:06,500 --> 00:01:11,770 one thing I do quite often it's just look online for an open source implementation. 18 00:01:11,770 --> 00:01:16,284 Because if you can get the author's implementation, you can usually get going 19 00:01:16,284 --> 00:01:20,000 much faster than if you would try to reimplement it from scratch. 20 00:01:20,000 --> 00:01:23,414 Although sometimes reimplementing from scratch could be a good exercise 21 00:01:23,414 --> 00:01:24,350 to do as well. 22 00:01:24,350 --> 00:01:27,800 If you're already familiar with how to use GitHub, 23 00:01:27,800 --> 00:01:32,080 this video might be less necessary or less important for you. 24 00:01:32,080 --> 00:01:35,960 But if you aren't used to downloading open-source code from GitHub, 25 00:01:35,960 --> 00:01:38,300 let me quickly show you how easy it is. 26 00:01:42,589 --> 00:01:46,270 Let's say you're excited about residual networks, and you want to use it. 27 00:01:46,270 --> 00:01:49,700 So let's search for residence on GitHub. 28 00:01:50,880 --> 00:01:55,870 And so you actually see a lot of different implementations of residence on GitHub. 29 00:01:55,870 --> 00:01:58,840 And I'm just going to go to the first URL here. 30 00:01:58,840 --> 00:02:02,760 And this is a GitHub repo that implements residence. 31 00:02:02,760 --> 00:02:06,346 Along with the GitHub webpages if you scroll down we'll have some text 32 00:02:06,346 --> 00:02:09,840 describing the work or the particular implementation. 33 00:02:09,840 --> 00:02:13,980 On this particular repo, this particular GitHub 34 00:02:13,980 --> 00:02:19,090 repository was actually by the original authors of the ResNet paper. 35 00:02:19,090 --> 00:02:22,940 And this code, this license under an MIT license, 36 00:02:22,940 --> 00:02:27,110 you can click through to take a look at the implications of this license. 37 00:02:27,110 --> 00:02:29,454 The MIT License is one of the more permissive or 38 00:02:29,454 --> 00:02:32,420 one of the more open open-source licenses. 39 00:02:32,420 --> 00:02:37,650 So I'm going to go ahead and download the code, and to do that, click on this link. 40 00:02:37,650 --> 00:02:41,327 This gives you the URL that you can use to download the code. 41 00:02:41,327 --> 00:02:45,455 I'm going to click on this button over here to copy the URL to my clipboard and 42 00:02:45,455 --> 00:02:46,527 then go over here. 43 00:02:46,527 --> 00:02:53,100 Then all you have to do is type git clone and then Ctrl+V for the URL and hit Enter. 44 00:02:53,100 --> 00:02:55,450 And so in a couples of seconds it has download, 45 00:02:55,450 --> 00:02:58,726 has cloned this repository to my local hard disk. 46 00:02:58,726 --> 00:03:03,290 So let's go into the directory and let's take a look. 47 00:03:03,290 --> 00:03:09,900 I'm more used in Mac than Windows, but I guess let's see, let's go to prototxt and 48 00:03:09,900 --> 00:03:15,450 I think this is where it has the files specifying the network. 49 00:03:15,450 --> 00:03:21,722 So let's take a look at this file, because this is a very long file that specifies 50 00:03:21,722 --> 00:03:28,030 the detail configurations of the ResNet with a 101 layers, all right? 51 00:03:28,030 --> 00:03:32,640 And it looks like from what I remember seeing from this webpage, 52 00:03:32,640 --> 00:03:36,830 this particular implementation uses the Cafe framework. 53 00:03:39,112 --> 00:03:42,516 But if you wanted implementation of this code using some other 54 00:03:42,516 --> 00:03:45,930 programming framework, you might be able to find it as well. 55 00:03:48,198 --> 00:03:51,752 So if you're developing a computer vision application, 56 00:03:51,752 --> 00:03:56,030 a very common workflow would be to pick an architecture that you like, 57 00:03:56,030 --> 00:03:59,405 maybe one of the ones you learned about in this course. 58 00:03:59,405 --> 00:04:03,415 Or maybe one that you heard about from a friend or from some literature. 59 00:04:03,415 --> 00:04:06,035 And look for an open source implementation and 60 00:04:06,035 --> 00:04:09,655 download it from GitHub to start building from there. 61 00:04:09,655 --> 00:04:14,300 One of the advantages of doing so also is that sometimes these networks take 62 00:04:14,300 --> 00:04:18,380 a long time to train, and someone else might have used multiple GPUs and 63 00:04:18,380 --> 00:04:22,110 a very large dataset to pretrain some of these networks. 64 00:04:22,110 --> 00:04:25,410 And that allows you to do transfer learning using these 65 00:04:25,410 --> 00:04:28,930 networks which we'll discuss in the next video as well. 66 00:04:28,930 --> 00:04:33,679 Of course if you're computer vision researcher implementing these things from 67 00:04:33,679 --> 00:04:36,623 scratch, then your workflow will be different. 68 00:04:36,623 --> 00:04:37,615 And if you do that, 69 00:04:37,615 --> 00:04:40,969 then do contribute your work back to the open source community. 70 00:04:40,969 --> 00:04:46,037 But because so many vision researchers have done so much work implementing these 71 00:04:46,037 --> 00:04:51,183 architectures, I found that often starting with open-source implementations 72 00:04:51,183 --> 00:04:55,820 is a better way, or certainly a faster way to get started on a new project.