Now let's imagine we had multiple classifiers: one for rain and one for whether the sprinkler is on or not. The rain classifier deals with features like wetness of the grass and thunder, whereas the sprinkler classifier deals with wetness of the grass, of course, but also whether or not one sees a hose on the ground. The probabilities for each of the classifiers work just like before. The probability of rain given W and T is the probability of W given R, times the probability of T given R, times the probability of R, divided by the probability of W and T, which is essentially the probability of whatever evidence one actually observes, assuming one observes both these features. Similarly, we get the equation for the probability of the sprinkler, given an observation of H and W.

But W is the same observation: if you observe W for this classifier, we should have the same value of W in that classifier. So we have to actually combine these networks into one network, and what we get is called a Bayesian network. Now it's not a simple classifier; there are two things we could draw conclusions about, sprinkler and rain, and some of the evidence is shared. Let's try to work out the probability for this network, starting with the joint: the joint probability of H, W, T, S and R. Using Bayes' rule to factor R out, we get this first term, and if we factor S out again, we get H, W, T given S and R, times the probability of S given R, times the probability of R. This is simply a recursive application of Bayes' rule. We also know that S and R are independent: whether or not the sprinkler is on and whether or not it had rained last night are assumed to be independent. In practice we might have some dependence, say if one decides to put the sprinkler on only if it doesn't rain, but we ignore that for the time being. We also assume, as before, that the observations H, W and T are independent given S and R. This lets us factor further into the probability of H given S and R, times the probability of W given S and R, times the probability of T given S and R; and because of the independence of S and R, we get the probability of S times the probability of R.
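Written out as a worked equation, the factorization just described is simply a restatement of the steps above, using the same variable names:

    \begin{align*}
    P(H, W, T, S, R) &= P(H, W, T, S \mid R)\, P(R) \\
                     &= P(H, W, T \mid S, R)\, P(S \mid R)\, P(R) \\
                     &= P(H \mid S, R)\, P(W \mid S, R)\, P(T \mid S, R)\, P(S)\, P(R)
    \end{align*}

where the last step uses the independence of S and R and the conditional independence of H, W and T given S and R.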
Now, this part is tricky. H and R are also independent, because the probability of a hose lying on the ground doesn't depend on whether it has rained, and the probability of thunder occurring certainly doesn't depend on whether the sprinkler has been put on. So these two terms can be replaced by the probability of H given S and the probability of T given R.

As a general rule in a Bayesian network, and this is all you really need to understand or remember, all you need to deal with are the conditional probabilities of nodes given their parents. So we have H given S, T given R, and W given S and R, because W has two parents. S and R don't have any parents, so they sit on their own as prior probabilities. This basic Bayesian network equation gives us the joint probability. From it, one can compute whatever one actually requires simply using SQL, by applying evidence, joining, and aggregating.

Let's do a couple of examples. We will ignore the hose and thunder and just deal with wetness, but now there are two possible causes for wetness: the sprinkler being on and rain. This is exactly what we started with, the confusing situation that classical logic had trouble dealing with, so let's see how probabilistic logic using a Bayesian network does better, perhaps. We have the conditional probability of W given S and R; note that this is not the joint probability, so for any particular combination of S and R, the two rows for the different values of W have to add up to one: 0.9 and 0.1, 0.7 and 0.3, 0.8 and 0.2, et cetera. We also have the priors: the probability that it rains in general, and the probability that the sprinkler is on in general, based on history.

Let's write our joint distribution now: the probability of R, S and W, as the probability of S and R given W times the probability of W, and alternatively as the probability of W given S and R, times the probability of S, times the probability of R, because we know that these two are independent.

Given the evidence we see on the ground, namely that the grass is wet, we can now say that the probability of R and S given W is proportional to the probability of W given R and S, times the probability of S, times the probability of R. We then condition on the evidence that the grass is wet; but we are only interested in the probability of R, so we sum out S.
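In symbols, the nodes-given-parents rule for the full network, and the inference just described for the smaller wetness example, look like this (again just a restatement of the derivation above):

    \begin{align*}
    P(H, W, T, S, R) &= P(H \mid S)\, P(W \mid S, R)\, P(T \mid R)\, P(S)\, P(R) \\
    P(R, S \mid W) &= \frac{P(W \mid S, R)\, P(S)\, P(R)}{P(W)} \\
    P(R \mid W = \text{yes}) &= \frac{\sum_{s} P(W = \text{yes} \mid s, R)\, P(s)\, P(R)}{P(W = \text{yes})}
    \end{align*}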
So now we're going to apply the summation operator in addition to the selection operator, sigma, for W equal to yes. The normalizing constant, as before, is the inverse of the probability that W equals yes; essentially, we get it simply by dividing by P of W in this equation. Convince yourself that this is the case, and let's see how one can execute this in SQL. Just like before, we select R and the sum of the products of the Ps from all three tables, with W equal to yes, R equal to R, and S equal to S in all the tables. I haven't written the full SQL over here; one just has to make sure that all the common variables are equal and the evidence is applied. Finally we group by R, because we don't want anything to do with S, so the result should have one row for every value of R. (A runnable sketch of this query is given at the end of this example.)

The result, it turns out, is the following. Let's look at the first row. We take the 0.9 from here and multiply it by the case where S equals yes, so we multiply by 0.3. There's another situation where S equals no but we still have rain equal to yes and W equal to yes; that's 0.8, multiplied by 0.7. Both of these are cases where rain equals yes, so they're multiplied by 0.2 from this table, and we get 0.166. Similarly, in the case where rain equals no: S equals yes and W equals yes gives 0.7 times 0.3; then we have 0.1 from this row, with S equal to no, so it's multiplied by 0.7; and of course both are cases where rain equals no, so we multiply by 0.8. We get 0.224. Normalizing so that the sum is one, we get 0.166 divided by 0.166 plus 0.224, which is roughly 0.42. In other words, the probability that it rained last night, given that the grass is wet, is roughly 42%.

It's very important to note that earlier, when we just had rain and wetness and no sprinkler to talk about, we got a probability of 53% that it had rained if the grass was observed to be wet. Now, without knowing anything about whether the sprinkler was on or not, simply the fact that there is a possible alternate cause for the grass being wet, with an explicit probability table that includes it, gives us a lower value for the probability that it had rained.
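To make the query concrete, here is a minimal, runnable SQL sketch (SQLite-style) of the computation just worked through. The table and column names (cpt_w, prior_s, prior_r and their columns) are assumptions made for illustration, not anything fixed in the lecture; the probability values are the ones from the tables above.

    -- Conditional probability table P(W | S, R) and the two priors P(S), P(R).
    CREATE TABLE cpt_w   (w TEXT, s TEXT, r TEXT, p REAL);
    CREATE TABLE prior_s (s TEXT, p REAL);
    CREATE TABLE prior_r (r TEXT, p REAL);

    -- Only the W = 'yes' rows of the CPT are needed once the evidence is applied.
    INSERT INTO cpt_w VALUES ('yes','yes','yes',0.9), ('yes','no','yes',0.8),
                             ('yes','yes','no' ,0.7), ('yes','no','no' ,0.1);
    INSERT INTO prior_s VALUES ('yes',0.3), ('no',0.7);
    INSERT INTO prior_r VALUES ('yes',0.2), ('no',0.8);

    -- Apply the evidence W = 'yes', join on the shared variables, sum out S,
    -- and group by R.  This returns the unnormalized scores computed above:
    -- 0.166 for r = 'yes' and 0.224 for r = 'no'.
    SELECT cpt_w.r,
           SUM(cpt_w.p * prior_s.p * prior_r.p) AS score
    FROM   cpt_w, prior_s, prior_r
    WHERE  cpt_w.w = 'yes'
      AND  cpt_w.s = prior_s.s
      AND  cpt_w.r = prior_r.r
    GROUP BY cpt_w.r;

Dividing each returned score by their sum gives the normalized posterior P(R | W = yes) discussed above.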
The fact is that, given that the grass is wet, the probability that it had rained depends on whether or not the sprinkler was on. Even if we don't observe S, the mere fact that there is an alternate cause changes the probability from the situation where this node didn't even exist. Let's see that more explicitly now by seeing what happens if we do observe that the sprinkler is on.