[MUSIC] Now we are ready to describe our logistic regression model. It takes a score as input, which ranges from minus infinity to plus infinity and is just w transpose h(xi), and pushes it through the sigmoid function to estimate the probability that y = +1 given xi and w. More explicitly, this probability is equal to 1 / (1 + e to the minus score of xi), which is the same as saying 1 / (1 + e to the -w transpose h(xi)). And we can, just for fun, write out that w transpose h explicitly, so it's 1 / (1 + e to the power of -(w0 h0(xi) + w1 h1(xi) + dot dot dot + w capital D, h D(xi))). [SOUND] And now we have it, that's what a logistic regression model looks like. It predicts the output, the probability of a positive sentiment, given the input x and the parameters w.

Now let's take a moment to understand the logistic regression model a little bit better. As input we have the score of a sentence x, or of any other input we have, and as output we have the probability that the label is +1 given the input x and the parameters w. And that's 1 / (1 + e to the power of -w transpose h(x)). Now, if the score is zero, and I'm going to draw it like this, we have that this probability is 0.5. So, if the score is zero, the probability is 0.5. Now, what do I observe? Everything to the left of zero has score less than zero, so we should be predicting that the points on the left have y hat equals minus one, and everything to the right of zero has score greater than zero, so we should be predicting that y hat on the right side is equal to plus one.

So let's see that in action. For example, let's say that we had a score of minus two, what would happen to our prediction? The probability of y = +1 is actually 0.12 if you plug that in, so minus two gives you 0.12. If you have plus two and you push it through to the right side, you get 0.88. So if the score is +2, the probability is 0.88. Is it a surprise to you that 0.12 + 0.88 adds up to 1?
It's not a surprise, because the probability of y = +1 plus the probability of y = -1 adds up to 1, and the sigmoid is a symmetric function, so everything is working out exactly the way we hoped for. Now if the score is bigger, let's say the score is four, we should still output y = +1, but we should be more sure. So let's push that through: if the score is four, look, we're getting really big here, and the predicted probability is 0.98. In other words, for the points where the score is less than zero, you see the probability of y = +1 is less than 0.5, which implies that we output y hat equals minus one, while for the ones where the score is positive we output y hat equals plus one. And here we see the logistic regression model in action, and how it has the characteristics that we were hoping for. [MUSIC]
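The numbers worked through above can be checked with a minimal sketch in plain Python; only the sigmoid formula comes from the lecture, and the loop over example scores is just for illustration:

```python
import math

def sigmoid(score):
    # P(y = +1 | x, w) = 1 / (1 + e^(-score)), where score = w^T h(x)
    return 1.0 / (1.0 + math.exp(-score))

for score in [-2.0, 0.0, 2.0, 4.0]:
    p_plus = sigmoid(score)
    p_minus = 1.0 - p_plus            # symmetry: P(y=+1) + P(y=-1) = 1
    y_hat = +1 if score > 0 else -1   # predict +1 exactly when score > 0
    print(f"score {score:+.0f}: P(y=+1) = {p_plus:.2f}, "
          f"P(y=-1) = {p_minus:.2f}, y_hat = {y_hat:+d}")
```

Running this prints 0.12 for a score of -2, 0.50 at zero, 0.88 at +2, and 0.98 at +4, matching the values in the lecture, and the two probabilities in each row always sum to one.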