In this and the next video I want to work through a detailed example showing how a neural network can compute a complex nonlinear function of the input, and hopefully this will give you a good sense of why neural networks can be used to learn complex, nonlinear hypotheses.

Consider the following problem, where we have input features x1 and x2 that are binary values, so either zero or one. So x1 and x2 can each take on only one of two possible values. In this example I've drawn only two positive examples and two negative examples, but you can think of this as a simplified version of a more complex learning problem where we may have a bunch of positive examples in the upper right and the lower left, and a bunch of negative examples denoted by the circles, and what we'd like to do is learn a nonlinear decision boundary that separates the positive and the negative examples.

So how can a neural network do this? Rather than use the example on the right, I'm going to use this maybe easier-to-examine example on the left. Concretely, what this is really computing is the target label y = x1 XNOR x2, where XNOR is notation for NOT (x1 XOR x2). So x1 XOR x2 is true only if exactly one of x1 or x2 is equal to 1, and XNOR is its negation. It turns out that the specific example I'm going to use works out a little bit better if we use the XNOR version instead; the two differ only by a negation. Because it means NOT (x1 XOR x2), we're going to have positive examples, y = 1, when x1 and x2 are either both true or both false, and y = 0 when exactly one of them is true. We want to figure out whether we can get a neural network to fit this sort of training set.

In order to build up to a network that fits the XNOR example, we're going to start with a slightly simpler one and show a network that fits the AND function. Concretely, let's say we have inputs x1 and x2 that are again binary, so either zero or one, and let's say our target label is y = x1 AND x2. This is a logical AND. So can we get a one-unit network to compute this logical AND function?

In order to do so, I'm going to draw in the bias unit as well, the +1 unit. Now, let me assign some values to the weights, or the parameters, of this network. I'm going to write the parameters on this diagram: minus 30 here, plus 20, and plus 20. What this means is that I'm assigning a value of minus 30 to the parameter associated with x0, the +1 bias unit going into this unit, a value of plus 20 for the parameter that multiplies x1, and a value of plus 20 for the parameter that multiplies x2. So, concretely, this is saying that my hypothesis h(x) is equal to g(-30 + 20x1 + 20x2). It's often convenient to draw these weights, these parameters, on the diagram of the neural network like this. Of course, this minus 30 is actually Theta^(1)_10, this is Theta^(1)_11, and that's Theta^(1)_12, but it's just easier to think of these parameters as associated with the edges of the network.

Let's look at what this little single-neuron network computes. Just to remind you, the sigmoid activation function g(z) looks like this: it starts near 0, rises smoothly, crosses 0.5 at z = 0, and then asymptotes at 1. And to give you some landmarks, if the horizontal-axis value z is equal to 4.6, then the sigmoid function is equal to 0.99.
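To make this concrete, here is a minimal sketch, not from the lecture itself, that just mirrors the formulas above in Python (assuming only NumPy): the sigmoid landmarks, and the single AND unit h(x) = g(-30 + 20x1 + 20x2).

```python
import numpy as np

def sigmoid(z):
    """The activation function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

# Landmarks from the lecture: g(4.6) ~ 0.99 and g(-4.6) ~ 0.01.
print(sigmoid(4.6), sigmoid(-4.6))   # ~0.990, ~0.010

# Single-unit AND network: h(x) = g(-30 + 20*x1 + 20*x2),
# where -30, 20, 20 are Theta^(1)_10, Theta^(1)_11, Theta^(1)_12.
def and_unit(x1, x2):
    return sigmoid(-30 + 20 * x1 + 20 * x2)

# Print the truth table; the output is ~1 only when x1 = x2 = 1.
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, round(and_unit(x1, x2), 3))
```

Running this reproduces the truth table we're about to walk through by hand.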
So g(4.6) is very close to 1, and, symmetrically, if z is negative 4.6, then the sigmoid function equals 0.01, which is very close to 0.

Let's look at the four possible input values for x1 and x2, and look at what the hypothesis outputs in each case. If x1 and x2 are both equal to 0, then the hypothesis outputs g(-30). That's very far to the left on this diagram, so it will be very close to 0. If x1 equals 0 and x2 equals 1, then this formula evaluates to g, the sigmoid function, applied to -10, and again that's far to the left of the plot, so that's again very close to 0. This is also g(-10): if x1 is equal to 1 and x2 is 0, we get -30 plus 20, which is -10. And finally, if x1 equals 1 and x2 equals 1, then you have g(-30 + 20 + 20), so that's g(+10), which is therefore very close to 1. And if you look down this column, this is exactly the logical AND function. So this is computing h(x) approximately equal to x1 AND x2; in other words, it outputs 1 if and only if x1 and x2 are both equal to 1. So by writing out our little truth table like this, we managed to figure out what logical function our neural network computes.

The network shown here computes the OR function. Just to show you how I worked that out: if you write out the hypothesis, you find that it's computing g(-10 + 20x1 + 20x2). And if you fill in the four input values, you find g(-10), which is approximately 0, then g(10), which is approximately 1, and so on; the last two are also approximately 1. These numbers are essentially the logical OR function.

So, hopefully with this, you now understand how single neurons in a neural network can be used to compute logical functions like AND and OR. In the next video, we'll continue building on these examples and work through a more complex example. We'll show how a neural network, now with multiple layers of units, can be used to compute more complex functions like the XOR function or the XNOR function.
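If you'd like to verify the OR computation from this video yourself, here is a similar minimal sketch. It again just mirrors the formula g(-10 + 20x1 + 20x2) from the lecture; the helper name logic_unit is my own, not something from the course.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logic_unit(theta, x1, x2):
    """One sigmoid neuron with weights theta = [bias, w1, w2]."""
    return sigmoid(theta[0] + theta[1] * x1 + theta[2] * x2)

# The AND and OR units differ only in the bias weight: -30 vs. -10.
for name, theta in [("AND", [-30, 20, 20]), ("OR", [-10, 20, 20])]:
    print(name)
    for x1 in (0, 1):
        for x2 in (0, 1):
            print(f"  {x1} {x2} -> {logic_unit(theta, x1, x2):.3f}")
```

Note how moving the bias from -30 to -10 shifts the threshold: with a bias of -10, a single +20 input already pushes z to +10 and the output to approximately 1, which is exactly the OR behavior.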