Now let's imagine we had multiple classifiers: one for rain and one for whether the sprinkler is on. The rain classifier deals with features like wetness of the grass (W) and thunder (T), whereas the sprinkler classifier deals with wetness of the grass, of course, but also whether one sees a hose on the ground (H). The probabilities for each classifier work just like before. The probability of rain given w and t is the probability of w given r, times the probability of t given r, times the probability of r, divided by the probability of w and t, which is essentially the probability of whatever evidence one actually observes, assuming one observes both features. Similarly, we get the equation for the probability of the sprinkler given an observation of h and w.

But W is the same observation in both: if we observe w for one classifier, we should use the same value of w in the other. So we have to combine these networks into one network, and what we get is called a Bayesian network. Now it's not a simple classifier; there are two things we can draw conclusions about, sprinkler and rain, and some of the evidence is shared.

Let's try to work out the probabilities for this network, starting with the joint: the joint probability of H, W, T, S and R. Using Bayes' rule to factor out R, and then S, we get the probability of H, W, T given S and R, times the probability of S given R, times the probability of R. This is simply a recursive application of Bayes' rule.

We also know that S and R are independent: whether the sprinkler is on and whether it had rained last night are assumed to be independent. In practice we might have some dependence; say, one decides to put the sprinkler on only if it doesn't rain; but we ignore that for the time being. We also assume, as before, that the observations H, W and T are independent given S and R. This lets us factor further into the probability of H given S and R, times the probability of W given S and R, times the probability of T given S and R; and because of the independence of S and R, the probability of S given R becomes just the probability of S, so we get the probability of S times the probability of R.

Now, this part is tricky. H and R are also independent (given S), because the probability of a hose lying on the ground doesn't depend on whether it has rained; and likewise T and S are independent (given R), because the probability of thunder occurring certainly doesn't depend on whether the sprinkler has been put on. So those two terms can be replaced by the probability of H given S and the probability of T given R.

As a general rule in a Bayesian network, and this is all you really need to understand or remember, all you need to deal with are the conditional probabilities of nodes given their parents. So we have H given S, T given R, and W given S and R, because W has two parents. S and R don't have any parents, so they sit on their own as prior probabilities. This basic Bayesian network equation gives us the joint probability. From it, one can compute whatever one actually requires simply using SQL, by applying evidence, joining, and aggregating.

Let's do a couple of examples. First, we will ignore the hose and thunder and just deal with wetness, but now there are two possible causes of wetness: the sprinkler being on, and rain. This is exactly what we started with, the confusing situation that classical logic had trouble dealing with; let's see how probabilistic logic, using a Bayesian network, does better, perhaps.
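Before working through the numbers, it helps to collect the formulas from the discussion above in one place. First, the two separate classifier equations, written out in standard notation; the sprinkler equation is my reading of the "similarly" above, assuming it follows the same naive-Bayes pattern as the rain one:

```latex
\[
P(r \mid w, t) = \frac{P(w \mid r)\,P(t \mid r)\,P(r)}{P(w, t)}
\qquad\qquad
P(s \mid h, w) = \frac{P(h \mid s)\,P(w \mid s)\,P(s)}{P(h, w)}
\]
```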
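And here is the factorization of the joint for the combined network, with the justification for each step as described:

```latex
\begin{align*}
P(H,W,T,S,R) &= P(H,W,T \mid S,R)\,P(S \mid R)\,P(R)
  && \text{recursive Bayes' rule}\\
&= P(H \mid S,R)\,P(W \mid S,R)\,P(T \mid S,R)\,P(S)\,P(R)
  && \text{obs.\ independent given } S,R;\ S \perp R\\
&= P(H \mid S)\,P(W \mid S,R)\,P(T \mid R)\,P(S)\,P(R)
  && H \perp R \text{ given } S;\ T \perp S \text{ given } R
\end{align*}
```

Note that the final form is exactly "each node conditioned on its parents": H given S, T given R, W given S and R, and the parentless S and R as priors.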
We have the conditional probability of W given S and R; note that this is not the joint probability. So for any particular combination of s and r, the two rows for the different values of w have to add up to one: .9 and .1, .7 and .3, .8 and .2, and so on. We also have the priors: the probability that it rains in general, and the probability that the sprinkler is on in general, based on history.

Let's write our joint distribution now. The probability of r, s, and w can be written as the probability of s and r given w, times the probability of w; and alternatively as the probability of w given s and r, times the probability of s, times the probability of r, because we know that these two are independent. Equating the two, the probability of r and s given w is the probability of w given r and s, times the probability of s, times the probability of r, divided by the probability of w.

Now we condition on the evidence we see on the ground, that the grass is wet; but we are only interested in the probability of R, so we sum out S. In other words, we apply the summation operator in addition to the selection operator W = yes. Sigma, as before, is the inverse of the probability that W = yes; essentially, we get the conditional simply by dividing by P(W) in this equation. Convince yourself that this is the case, and let's see how one can execute this in SQL.

Just like before, we select R and the sum of the products of the Ps from all three tables, with W = yes applied and R and S equated across all the tables. I haven't written the full SQL here; one just has to make sure that all the common variables are equal and the evidence is applied. Finally, we group by R, because we don't want anything to do with S; the result should have one row for every value of R.

The result turns out to be the following. Look at the first row, rain = yes. We take the .9 from the case where S = yes and W = yes and multiply it by P(S = yes) = .3. There is another case, S = no, where we also have rain = yes and W = yes; that contributes .8, multiplied by P(S = no) = .7. Both are cases where rain = yes, so the sum is multiplied by P(R = yes) = .2 from the prior table, and we get .166. Similarly, in the case where rain = no: S = yes and W = yes gives .7 times .3; then .1 from the row where S = no, multiplied by .7; both are cases where rain = no, so we multiply by P(R = no) = .8 and get .224. Normalizing so that the sum is one, .166 divided by (.166 + .224) is approximately .426. Or in other words, the probability that it had rained last night, given that the grass is wet, is about 43%.

It's very important to note that earlier, when we just had rain and wetness and no sprinkler to talk about, we got a probability of 53% that it had rained if the grass was observed to be wet. Now, without knowing anything about whether the sprinkler was on or not, simply because there is a possible alternative cause for the grass being wet, with an explicit probability table that includes it, we get a lower value for the probability that it had rained. The fact is that, given that the grass is wet, the probability that it had rained depends on whether or not the sprinkler was on. Even if we don't observe S, the mere presence of an alternative cause changes the probability from the situation where that node didn't even exist.
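To make that walkthrough easy to check, here is the same computation written out, using the numbers above:

```latex
\[
P(R \mid W{=}\text{yes}) = \sigma \sum_{S} P(W{=}\text{yes} \mid S, R)\,P(S)\,P(R),
\qquad \sigma = \frac{1}{P(W{=}\text{yes})}
\]
\[
R{=}\text{yes}: \ (0.9 \cdot 0.3 + 0.8 \cdot 0.7) \cdot 0.2 = 0.166
\qquad
R{=}\text{no}: \ (0.7 \cdot 0.3 + 0.1 \cdot 0.7) \cdot 0.8 = 0.224
\]
\[
P(R{=}\text{yes} \mid W{=}\text{yes}) = \frac{0.166}{0.166 + 0.224} \approx 0.426
\]
```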
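And here is one concrete, runnable version of the SQL sketched above. The table and column names are my own, since the lecture doesn't give a schema, and the value P(W = yes | S = no, R = no) = .1 is inferred from the .1 that appears in the walkthrough; treat this as a sketch rather than the lecture's actual code.

```sql
-- CPTs stored as relations; p holds the probability.
-- Schema and names are illustrative (assumed, not from the lecture).
CREATE TABLE pw (w TEXT, s TEXT, r TEXT, p REAL);  -- P(W | S, R)
CREATE TABLE ps (s TEXT, p REAL);                  -- P(S)
CREATE TABLE pr (r TEXT, p REAL);                  -- P(R)

INSERT INTO pw VALUES
  ('yes','yes','yes', 0.9), ('no','yes','yes', 0.1),
  ('yes','yes','no',  0.7), ('no','yes','no',  0.3),
  ('yes','no','yes',  0.8), ('no','no','yes',  0.2),
  ('yes','no','no',   0.1), ('no','no','no',   0.9);  -- last pair inferred

INSERT INTO ps VALUES ('yes', 0.3), ('no', 0.7);
INSERT INTO pr VALUES ('yes', 0.2), ('no', 0.8);

-- Apply the evidence W = 'yes', equate the shared variables across
-- tables, sum out S, and keep one row per value of R.
SELECT pr.r, SUM(pw.p * ps.p * pr.p) AS p
FROM pw, ps, pr
WHERE pw.w = 'yes' AND pw.s = ps.s AND pw.r = pr.r
GROUP BY pr.r;
-- Returns: no -> 0.224, yes -> 0.166.
-- Dividing each by their sum (0.390) gives 0.574 and 0.426, as above.
```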
Let's see this more explicitly now by looking at what happens if we do observe that the sprinkler is on.