In the previous video, you've already seen all the basic building blocks of the Inception network. In this video, let's see how you can put these building blocks together to build your own Inception network.

The Inception module takes as input the activation or the output from some previous layer. Let's say, for the sake of argument, this is 28 by 28 by 192, the same as in our previous video. The example we worked through in depth was the one by one convolution followed by the five by five convolution. So maybe the one by one has 16 channels, and then the five by five outputs 28 by 28 by, let's say, 32 channels; this was the example we worked through on the last slide of the previous video. Then, to save computation on your three by three convolution, you can do the same thing there, and the three by three outputs 28 by 28 by 128. And maybe you want to consider a one by one convolution as well. There's no need to do a one by one conv followed by another one by one conv, so this is just one step here, and let's say it outputs 28 by 28 by 64.

And then finally there is the pooling layer, and here we're going to do something funny. In order to be able to concatenate all of these outputs at the end, we use same padding for max pooling, so that the output height and width are still 28 by 28 and we can concatenate it with the other outputs. But notice that if you do max pooling, even with same padding, a three by three filter, and a stride of one, the output here will be 28 by 28 by 192: the same number of channels and the same depth as the input we had here. That seems like a lot of channels, so what we're going to do is actually add one more one by one conv layer, to do what we saw in the one by one convolution video and shrink the number of channels, getting this down to 28 by 28 by, let's say, 32. The way you do that is to use 32 filters of dimension one by one by 192, so the output has its number of channels shrunk down to 32, and you don't end up with the pooling branch taking up most of the channels in the final output.

Then finally, you take all of these blocks and do channel concatenation. Concatenating across the channels, 64 plus 128 plus 32 plus 32, gives you a 28 by 28 by 256 dimensional output. Channel concat is just concatenating the blocks along the channel dimension, as we saw in the previous video. So this is one Inception module, and what the Inception network does is, more or less, put a lot of these modules together.

Here's a picture of the Inception network taken from the paper by Szegedy et al., and you'll notice a lot of repeated blocks in it. Maybe this picture looks really complicated, but if you look at one of the blocks, that block is basically the Inception module you saw on the previous slide. Subject to a few details I won't discuss, this is another Inception block, and this is another Inception block. There are some extra max pooling layers here to change the height and width, then another Inception block, then another max pool to change the height and width, and then another Inception block. So the Inception network is just a lot of these blocks that you've learned about, repeated at different positions of the network.
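To make the dimension bookkeeping above concrete, here is a minimal sketch of one Inception module written with tf.keras. This is an assumption for illustration, since the lecture itself shows no code, and the size of the one by one bottleneck in front of the three by three convolution (96 filters) is also an illustrative choice; the remaining filter counts match the example, so a 28 by 28 by 192 input comes out as 28 by 28 by 256.

```python
# Minimal sketch of the Inception module described above (tf.keras assumed).
# Filter counts follow the lecture example: 64 + 128 + 32 + 32 = 256 channels.
import tensorflow as tf
from tensorflow.keras import layers

def inception_module(x):
    # Branch 1: plain 1x1 convolution, 64 filters.
    b1 = layers.Conv2D(64, 1, padding="same", activation="relu")(x)

    # Branch 2: 1x1 "bottleneck" (96 filters here is an illustrative choice)
    # followed by a 3x3 convolution with 128 filters.
    b2 = layers.Conv2D(96, 1, padding="same", activation="relu")(x)
    b2 = layers.Conv2D(128, 3, padding="same", activation="relu")(b2)

    # Branch 3: 1x1 bottleneck with 16 filters, then 5x5 with 32 filters --
    # the computation-saving example from the previous video.
    b3 = layers.Conv2D(16, 1, padding="same", activation="relu")(x)
    b3 = layers.Conv2D(32, 5, padding="same", activation="relu")(b3)

    # Branch 4: 3x3 max pooling with same padding and stride 1, so height and
    # width stay 28x28, then a 1x1 conv to shrink 192 channels down to 32.
    b4 = layers.MaxPooling2D(3, strides=1, padding="same")(x)
    b4 = layers.Conv2D(32, 1, padding="same", activation="relu")(b4)

    # Channel concatenation: 64 + 128 + 32 + 32 = 256 channels.
    return layers.Concatenate(axis=-1)([b1, b2, b3, b4])

inputs = tf.keras.Input(shape=(28, 28, 192))
outputs = inception_module(inputs)
print(outputs.shape)  # (None, 28, 28, 256)
```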
And so if you understand the Inception module from the previous slide, then you understand the Inception network. It turns out there's one last detail to the Inception network if you read the original research paper, which is that there are these additional side branches that I just added. So what do they do? Well, the last few layers of the network are a fully connected layer followed by a softmax layer that tries to make a prediction. What these side branches do is take some hidden layer and try to use that to make a prediction. So this is actually a softmax output, and so is that. And this other side branch, again, takes a hidden layer, passes it through a few fully connected layers, and then uses a softmax to try to predict the output label. You should think of this as maybe just another detail of the Inception network, but what it does is help ensure that the features computed even in the hidden units, even at the intermediate layers, are not too bad for predicting the output class of an image. This appears to have a regularizing effect on the Inception network and helps prevent it from overfitting.

Oh, and by the way, this particular Inception network was developed by authors at Google, who called it GoogLeNet, spelled like that, to pay homage to the LeNet network that you learned about in an earlier video. So I think it's actually really nice that the deep learning community is so collaborative and that there's such strong, healthy respect for each other's work.

Finally, here's one fun fact: where does the name Inception network come from? The Inception paper actually cites this "We need to go deeper" meme, and this URL is an actual reference in the Inception paper which links to this image. If you've seen the movie Inception, maybe this meme will make sense to you, but the authors actually cite this meme as motivation for needing to build deeper neural networks, and that's how they came up with the Inception architecture. So I guess it's not often that research papers get to cite internet memes in their citations, but in this case, I guess it worked out quite well.

So to summarize, if you understand the Inception module, then you understand the Inception network, which is largely the Inception module repeated a bunch of times throughout the network. Since the development of the original Inception module, the authors and others have built on it and come up with other versions as well. So there are research papers on newer versions of the Inception algorithm, and you sometimes see people use some of these later versions in their work, like Inception v2, Inception v3, and Inception v4. There's also an Inception version that's combined with the ResNet idea of having skip connections, and that sometimes works even better. But all of these variations are built on the basic idea you learned about in this and the previous video: coming up with the Inception module and then stacking up a bunch of them together. And with these videos, you should be able to read and understand, I think, the Inception paper, as well as maybe some of the papers describing the later variations.

So that's it, you've gone through quite a lot of specialized neural network architectures. In the next video, I want to start showing you some more practical advice on how to actually use these algorithms to build your own computer vision system. Let's go on to the next video.
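As a reference for the side branches mentioned above, here is a rough sketch of what one such auxiliary softmax branch could look like, again in tf.keras. The specific sizes below (5 by 5 average pooling, 128 one by one filters, a 1024-unit fully connected layer, 1000 classes, and the 14 by 14 by 512 example input) are illustrative assumptions, not values quoted in the lecture.

```python
# Rough sketch of a GoogLeNet-style side branch (auxiliary classifier): take
# an intermediate activation, pool it down, then pass it through a couple of
# layers and a softmax to predict the label from that point in the network.
import tensorflow as tf
from tensorflow.keras import layers

def side_branch(hidden, num_classes=1000):
    y = layers.AveragePooling2D(pool_size=5, strides=3)(hidden)
    y = layers.Conv2D(128, 1, padding="same", activation="relu")(y)
    y = layers.Flatten()(y)
    y = layers.Dense(1024, activation="relu")(y)
    return layers.Dense(num_classes, activation="softmax")(y)

hidden = tf.keras.Input(shape=(14, 14, 512))  # some intermediate activation
aux_pred = side_branch(hidden)
print(aux_pred.shape)  # (None, 1000)
```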