The cost function of the neural style transfer algorithm has a content cost component and a style cost component. Let's start by defining the content cost component. Remember that this is the overall cost function of the neural style transfer algorithm, so let's figure out what the content cost function should be.

Let's say that you use hidden layer l to compute the content cost. If l is a very small number, say you use hidden layer one, then it will really force your generated image to have pixel values very similar to your content image. Whereas if you use a very deep layer, then it's just asking, "Well, if there is a dog in your content image, then make sure there is a dog somewhere in your generated image." So in practice, layer l is chosen somewhere in between; it's neither too shallow nor too deep in the neural network. And because you get to play with this yourself in the programming exercise at the end of this week, I'll leave you to gain some intuition from the concrete examples in the programming exercise as well. But usually, l is chosen to be somewhere in the middle of the layers of the neural network, neither too shallow nor too deep.

What you can do is then use a pre-trained ConvNet, maybe a VGG network, or it could be some other neural network as well. Now, you want to measure, given a content image and given a generated image, how similar they are in content. So let a^[l](C) and a^[l](G) be the activations of layer l on these two images, the images C and G. If these two activations are similar, then that would seem to imply that both images have similar content.

So what we'll do is define J_content(C, G) as a measure of how similar or how different these two activations are. We'll take the element-wise difference between the hidden unit activations in layer l, between when you pass in the content image and when you pass in the generated image, and square it. You could have a normalization constant in front or not, such as one-half or something else; it doesn't really matter, since it can be adjusted anyway by the hyperparameter alpha. To be clear on the notation, think of both of these activations as having been unrolled into vectors, so that this becomes the squared L2 norm of the difference between them, after you've unrolled them both into vectors. It's really just the element-wise sum of squared differences between the activations in layer l on the images C and G.

And so, when you later perform gradient descent on J(G) to try to find a value of G so that the overall cost is low, this will incentivize the algorithm to find an image G whose hidden layer activations are similar to what you got for the content image. So that's how you define the content cost function for neural style transfer. Next, let's move on to the style cost function.
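To make this concrete, here is a minimal sketch of what the content cost might look like in TensorFlow. It assumes you have already run the content image C and the generated image G through a pre-trained ConvNet (such as VGG) and extracted the layer-l activations a_C and a_G; the function name and the one-half normalization constant are illustrative choices, not something fixed by the algorithm.

```python
import tensorflow as tf

def compute_content_cost(a_C, a_G):
    """
    Content cost between the content image C and the generated image G,
    measured at a chosen hidden layer l of a pre-trained ConvNet.

    a_C -- tensor of shape (1, n_H, n_W, n_C), layer-l activations on C
    a_G -- tensor of shape (1, n_H, n_W, n_C), layer-l activations on G

    Returns a scalar: the content cost J_content(C, G).
    """
    # Unroll both activation volumes into vectors; the element-wise
    # sum of squared differences is the same either way.
    a_C_unrolled = tf.reshape(a_C, [-1])
    a_G_unrolled = tf.reshape(a_G, [-1])

    # Squared L2 norm of the difference, with a normalization constant
    # in front (here 1/2; the exact constant matters little since it can
    # be absorbed into the hyperparameter alpha).
    J_content = 0.5 * tf.reduce_sum(tf.square(a_C_unrolled - a_G_unrolled))
    return J_content
```

In the full algorithm, this term is weighted by alpha and added to the style cost (weighted by beta) to form the overall cost J(G) that gradient descent minimizes with respect to the pixels of G.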