1 00:00:00,460 --> 00:00:01,600 This video gives a live demonstration 2 00:00:02,700 --> 00:00:04,250 of the recursive constructs in sequel 3 00:00:04,700 --> 00:00:06,090 that we introduced in the previous video. 4 00:00:07,200 --> 00:00:08,460 As a reminder, recursion has been 5 00:00:08,630 --> 00:00:09,730 introduced intosequal as part of 6 00:00:10,120 --> 00:00:11,180 the with statement where we can 7 00:00:11,490 --> 00:00:12,420 set up relations that are defined 8 00:00:12,870 --> 00:00:14,720 by queries that themselves refer to 9 00:00:14,830 --> 00:00:16,770 the relation being defined, and finally 10 00:00:17,080 --> 00:00:17,870 we have a query that can 11 00:00:18,070 --> 00:00:19,570 involve the recursively defined relations 12 00:00:20,000 --> 00:00:22,210 as well as other relations or other tables in the database. 13 00:00:23,400 --> 00:00:24,820 The typical expression within a 14 00:00:24,960 --> 00:00:26,200 with statement for a recursively 15 00:00:26,600 --> 00:00:27,850 defined relation would be to 16 00:00:28,140 --> 00:00:29,290 have a base query that doesn't 17 00:00:29,610 --> 00:00:30,940 depend on R and 18 00:00:31,110 --> 00:00:32,340 then a recursive query that does 19 00:00:32,540 --> 00:00:34,110 depend on R. We gave 20 00:00:34,290 --> 00:00:35,500 three examples in the introductory 21 00:00:35,800 --> 00:00:36,620 video and those are the 22 00:00:36,730 --> 00:00:38,570 same examples that we'll be demonstrated shortly. 23 00:00:39,670 --> 00:00:40,540 The first one was to compute 24 00:00:41,790 --> 00:00:42,930 ancestors when we have only a 25 00:00:42,980 --> 00:00:43,870 parent relation and the family 26 00:00:44,230 --> 00:00:45,520 tree could be arbitrarily deep. 27 00:00:46,570 --> 00:00:48,230 Our second example was a 28 00:00:48,270 --> 00:00:49,110 case where we have an arbitrarily 29 00:00:49,730 --> 00:00:51,340 deep company hierarchy and we 30 00:00:51,550 --> 00:00:52,660 want to compute the total salary 31 00:00:53,060 --> 00:00:54,160 cost of a project starting 32 00:00:54,520 --> 00:00:56,010 at that project's manager and summing 33 00:00:56,280 --> 00:00:57,910 the salary of the entire sub-tree. 34 00:00:59,510 --> 00:01:01,670 and our third example was about airplane flights. 35 00:01:02,010 --> 00:01:02,960 Where we want to find the cheapest 36 00:01:03,150 --> 00:01:04,200 way to fly from point A 37 00:01:04,410 --> 00:01:05,730 to point B. And we're willing 38 00:01:06,230 --> 00:01:07,310 to change planes as many kinds, 39 00:01:07,800 --> 00:01:09,520 as we might need to in order to bring down the cost. 40 00:01:10,480 --> 00:01:11,420 We saw that all of these 41 00:01:11,520 --> 00:01:13,570 examples, involved, basically a 42 00:01:13,750 --> 00:01:14,910 notion of transit of closure computed 43 00:01:15,610 --> 00:01:16,680 as a recursively defined relation. 44 00:01:18,230 --> 00:01:18,830 The last portion of our demo, after 45 00:01:18,930 --> 00:01:19,846 we see these three queries solved using recursion, will introduce one more twist, 46 00:01:20,219 --> 00:01:22,830 which is what happens when we introduce cycles. 47 00:01:23,210 --> 00:01:27,570 So in the Airline example. 48 00:01:28,030 --> 00:01:28,870 We'll set up a case where 49 00:01:28,950 --> 00:01:31,180 you can fly from one city to another one and back. 50 00:01:31,870 --> 00:01:32,830 Which is of course true in 51 00:01:32,940 --> 00:01:34,100 reality, and we'll see what 52 00:01:34,230 --> 00:01:36,630 happens when we try to answer our query in that setting. 53 00:01:37,910 --> 00:01:40,110 I've started by creating a table called Parent of. 54 00:01:40,450 --> 00:01:41,450 With parent child relationships. 55 00:01:42,650 --> 00:01:45,520 So we have, Alice Carol, Bob Carol, Carol Dave, and so on. 56 00:01:45,690 --> 00:01:46,580 You might actually want to write 57 00:01:46,740 --> 00:01:47,700 this down on a piece of paper 58 00:01:48,350 --> 00:01:50,220 to see what the actual tree 59 00:01:50,500 --> 00:01:51,570 looks like, but the query 60 00:01:51,820 --> 00:01:52,690 we want to run is to 61 00:01:53,010 --> 00:01:54,620 find all of Mary's ancestors 62 00:01:55,710 --> 00:01:56,890 so we're going to, of course, 63 00:01:57,230 --> 00:01:59,020 have Eve as a parent, and Dave as a parent. 64 00:01:59,240 --> 00:02:02,040 And then Dave's parent is Carol, and Carol's parent is Bob, and so on. 65 00:02:02,370 --> 00:02:05,290 We'll get most of the data in our database in our query. 66 00:02:06,180 --> 00:02:07,310 So here is the query, 67 00:02:07,330 --> 00:02:09,270 our first example of our recursive query. 68 00:02:09,880 --> 00:02:10,950 Let me say right off, that 69 00:02:11,270 --> 00:02:12,530 even more than anything else, 70 00:02:12,760 --> 00:02:13,880 we've done in these videos, I am 71 00:02:13,940 --> 00:02:15,380 going to encourage you to download 72 00:02:16,220 --> 00:02:17,270 the script and take a close 73 00:02:17,940 --> 00:02:19,060 look at the query, and preferable actually 74 00:02:19,980 --> 00:02:22,230 run the queries and play with them on the postscript system. 75 00:02:22,650 --> 00:02:23,750 And we are using for 76 00:02:23,910 --> 00:02:25,350 this demo postscript a sequel 77 00:02:25,720 --> 00:02:26,640 lite and my sequel do not 78 00:02:27,000 --> 00:02:29,260 support forth the Withery cursive statement at this point in time. 79 00:02:30,120 --> 00:02:31,350 So, anyway here's our query 80 00:02:32,110 --> 00:02:33,210 and it is the form 81 00:02:33,520 --> 00:02:34,810 that we described in the introduction, 82 00:02:34,950 --> 00:02:36,190 it's a width-statement with 83 00:02:36,410 --> 00:02:37,330 recursive that's going to 84 00:02:37,420 --> 00:02:39,210 set up a recursive relation called 85 00:02:39,580 --> 00:02:42,220 ancestor, so this is what we were calling "r" earlier. 86 00:02:42,660 --> 00:02:43,980 This is our ancestor with a 87 00:02:44,200 --> 00:02:46,080 schema AD for ancestor and descendant. 88 00:02:46,820 --> 00:02:47,930 Our final query, once 89 00:02:48,260 --> 00:02:49,820 ancestor is all set up, 90 00:02:50,060 --> 00:02:51,050 is very simple, it just 91 00:02:51,260 --> 00:02:52,890 says, take the 'A' 92 00:02:53,320 --> 00:02:54,600 attribute from ancestor, where a 93 00:02:54,630 --> 00:02:57,320 descendant is Mary so that will give us Mary's defendant. 94 00:02:58,260 --> 00:02:59,650 Of course what's interesting is what's 95 00:02:59,940 --> 00:03:01,480 right here,inside these parans 96 00:03:01,870 --> 00:03:03,520 because this our recursive query. 97 00:03:04,230 --> 00:03:05,090 And it does take the form 98 00:03:05,380 --> 00:03:06,780 we described of having a base 99 00:03:07,090 --> 00:03:07,780 query, that is the first 100 00:03:08,060 --> 00:03:08,820 line, and then the recursive 101 00:03:09,440 --> 00:03:10,780 query with a union between them. 102 00:03:11,590 --> 00:03:13,000 So we're going to start by 103 00:03:13,200 --> 00:03:14,590 saying that whenever we have 104 00:03:14,730 --> 00:03:17,610 a parent child relationship that's also an ancestor relationship. 105 00:03:18,500 --> 00:03:21,220 So we're going to take from our parent of table. 106 00:03:22,210 --> 00:03:23,490 The parent and child, and 107 00:03:23,720 --> 00:03:24,870 we have to rename them as 108 00:03:25,020 --> 00:03:26,170 A and D, and that says 109 00:03:26,520 --> 00:03:28,550 that parent children are an ancestor. 110 00:03:29,400 --> 00:03:30,120 What else in an ancestor? 111 00:03:31,200 --> 00:03:32,350 Well if we have a 112 00:03:32,600 --> 00:03:34,050 tuple an ancestor, an ancestor 113 00:03:34,430 --> 00:03:36,510 and a descendant, and that 114 00:03:36,640 --> 00:03:39,220 descendant is the parent of a 115 00:03:39,970 --> 00:03:42,220 another person, then the A 116 00:03:42,930 --> 00:03:45,630 and the ancestor, together with 117 00:03:45,760 --> 00:03:46,760 the child from the parent 118 00:03:47,250 --> 00:03:48,770 of, is also an ancestor relationship. 119 00:03:49,390 --> 00:03:51,300 So this is a kind of doing the join. 120 00:03:52,160 --> 00:03:53,110 Not just "kind of", it's actually 121 00:03:53,550 --> 00:03:55,470 joining our ancestor as its 122 00:03:55,790 --> 00:03:57,750 being created and extending that 123 00:03:58,160 --> 00:04:00,350 relationship by joining with another instance of parent. 124 00:04:00,620 --> 00:04:02,120 So you can kind of think of 125 00:04:02,290 --> 00:04:04,560 going down the ancestor tree 126 00:04:04,960 --> 00:04:06,630 adding relationships as we go down. 127 00:04:07,380 --> 00:04:08,570 Again, I really can't encourage 128 00:04:08,940 --> 00:04:10,430 you enough to download this query 129 00:04:11,160 --> 00:04:12,080 and play with it yourself 130 00:04:12,480 --> 00:04:14,160 to fully understand what's going on. 131 00:04:14,900 --> 00:04:16,010 Let's go ahead and run it. 132 00:04:16,120 --> 00:04:17,110 It's going to be rather anticlimatic. 133 00:04:18,200 --> 00:04:19,210 When we run it we do 134 00:04:19,360 --> 00:04:20,760 discover that these five people 135 00:04:21,580 --> 00:04:23,300 are Mary's ancestors, and if 136 00:04:23,470 --> 00:04:24,690 you've drawn the little tree of 137 00:04:24,830 --> 00:04:27,170 the data you can verify that, that's the correct answer. 138 00:04:28,740 --> 00:04:29,460 Let's play around a little bit. 139 00:04:29,540 --> 00:04:31,120 Let me try a few other people's ancestors. 140 00:04:31,820 --> 00:04:32,390 Let's try Frank. 141 00:04:33,060 --> 00:04:34,430 We don't see Frank here because 142 00:04:34,810 --> 00:04:35,900 Frank actually happens to be 143 00:04:35,990 --> 00:04:37,050 a child of Mary's, so we 144 00:04:37,170 --> 00:04:39,500 should get even more ancestors when 145 00:04:40,170 --> 00:04:41,090 we run this one, especially 146 00:04:41,520 --> 00:04:43,240 Mary should be included, and in 147 00:04:43,440 --> 00:04:45,910 fact, she is, there she is, and these are Frank's ancestors. 148 00:04:47,440 --> 00:04:48,910 Let's try George, I think 149 00:04:49,200 --> 00:04:50,160 George was somewhere in the 150 00:04:50,440 --> 00:04:51,670 middle of the tree there. 151 00:04:52,050 --> 00:04:53,740 Yes, George has three ancestors, 152 00:04:55,170 --> 00:04:56,320 and finally let's try Bob. 153 00:04:57,080 --> 00:04:58,070 Bob is at the root so 154 00:04:58,150 --> 00:04:59,290 we should get an empty result 155 00:04:59,890 --> 00:05:01,260 and we do, because again Bob 156 00:05:01,490 --> 00:05:03,450 has no ancestors in our database. 157 00:05:04,650 --> 00:05:05,900 Now lets take a look at our second example. 158 00:05:06,540 --> 00:05:07,440 That was the one where we had 159 00:05:07,650 --> 00:05:09,540 a hierarchy of management chain 160 00:05:09,750 --> 00:05:11,130 in a company, and then we 161 00:05:11,240 --> 00:05:12,290 were interested in computing the total 162 00:05:12,680 --> 00:05:13,560 salary cost of a project, 163 00:05:13,590 --> 00:05:15,650 so I've set up our three tables. 164 00:05:16,420 --> 00:05:17,340 The first one is the Employee 165 00:05:17,620 --> 00:05:18,840 table, it just gives the IDs 166 00:05:19,200 --> 00:05:21,180 of the employees and the salary of each employee. 167 00:05:22,440 --> 00:05:24,780 The second table is the Manager relationship. 168 00:05:25,650 --> 00:05:26,600 So, again, you might 169 00:05:26,790 --> 00:05:27,700 want to draw the little tree 170 00:05:27,990 --> 00:05:30,080 here although it's pretty simple this time. 171 00:05:30,320 --> 00:05:31,190 123 is at the root 172 00:05:31,550 --> 00:05:33,070 of our little management structure, 173 00:05:33,840 --> 00:05:34,600 as 234 as a subordinate. 174 00:05:35,300 --> 00:05:37,840 234 has two subordinates, 345 and 456 and 345 is another one. 175 00:05:38,460 --> 00:05:40,930 So it's only a 176 00:05:42,500 --> 00:05:43,040 three level tree. 177 00:05:43,370 --> 00:05:46,210 Of course if we knew it was only three levels we wouldn't need recursion at all. 178 00:05:46,490 --> 00:05:49,280 But we're going to write a query that will work for arbitrary numbers of levels. 179 00:05:49,810 --> 00:05:51,110 So that's our management structure. 180 00:05:51,670 --> 00:05:53,150 And finally, our third table, 181 00:05:53,660 --> 00:05:55,270 the Project table says that 182 00:05:55,570 --> 00:05:57,010 employee 123 is the 183 00:05:57,120 --> 00:05:59,150 manager of project X, so 184 00:05:59,400 --> 00:06:00,690 what we want to do is 185 00:06:01,100 --> 00:06:02,170 find the manager of project 186 00:06:02,280 --> 00:06:04,220 Y in the hierarchy,and then take 187 00:06:04,550 --> 00:06:06,660 that manager's salary along with 188 00:06:06,800 --> 00:06:08,030 the salary's of all the 189 00:06:08,190 --> 00:06:10,310 manager's subordinates recursively down 190 00:06:10,540 --> 00:06:13,040 to the management structure, and 191 00:06:13,210 --> 00:06:15,120 of course that's everybody in in our little database. 192 00:06:15,860 --> 00:06:16,830 Again, I can't encourage you 193 00:06:17,010 --> 00:06:19,100 to download the script and run it for yourself. 194 00:06:20,210 --> 00:06:21,200 So here's our query to find 195 00:06:21,600 --> 00:06:23,190 the total salary of project 196 00:06:23,710 --> 00:06:24,620 x and I'm actually going to 197 00:06:24,700 --> 00:06:26,040 give a couple of different ways 198 00:06:26,590 --> 00:06:27,730 of running this for you. 199 00:06:28,230 --> 00:06:29,170 The way we've done it the first 200 00:06:29,450 --> 00:06:30,940 time is to effectively 201 00:06:32,220 --> 00:06:33,740 expand the management structure 202 00:06:34,210 --> 00:06:35,750 into a relation called superior. 203 00:06:36,790 --> 00:06:38,190 So that's really pretty much doing 204 00:06:38,490 --> 00:06:42,110 the ancestor computation, which by the way is a transitive closure. 205 00:06:42,550 --> 00:06:45,220 I should have mentioned that earlier for those of you familiar with transitive closures. 206 00:06:45,600 --> 00:06:46,350 It's basically that operation. 207 00:06:47,000 --> 00:06:48,110 So we're going to compute 208 00:06:48,450 --> 00:06:50,230 these superiors so that 209 00:06:50,530 --> 00:06:51,920 we'll have every manager 210 00:06:52,700 --> 00:06:53,920 and employee relationship with a 211 00:06:54,130 --> 00:06:55,790 manager is arbitrarily above the employee. 212 00:06:56,240 --> 00:06:57,480 And then once we have that 213 00:06:57,780 --> 00:06:59,470 superior relationship computed, then 214 00:06:59,630 --> 00:07:00,700 we write a actually a fairly 215 00:07:00,900 --> 00:07:02,440 complicated query, so this 216 00:07:02,700 --> 00:07:04,210 is the final query of our with statement. 217 00:07:05,160 --> 00:07:07,780 And this one says we've got this recursive relation superior. 218 00:07:08,540 --> 00:07:09,760 We're going to the salaries 219 00:07:10,520 --> 00:07:12,410 from the employee relation where the 220 00:07:12,960 --> 00:07:14,160 ID is either the manager 221 00:07:15,430 --> 00:07:16,800 of the project X, so that's 222 00:07:17,070 --> 00:07:19,250 the first half here, or an 223 00:07:20,420 --> 00:07:22,310 employee, that's managed by the 224 00:07:22,410 --> 00:07:24,880 manager, of project X. Okay. 225 00:07:25,500 --> 00:07:26,380 Now this down here, I just 226 00:07:26,540 --> 00:07:27,690 want to emphasize this is not 227 00:07:27,970 --> 00:07:29,210 recursive, it just so happens 228 00:07:29,620 --> 00:07:30,640 to have that same structure of 229 00:07:30,780 --> 00:07:32,810 union, but there is nothing recursive happening down here. 230 00:07:33,390 --> 00:07:35,050 So this is just a regular sequel query. 231 00:07:35,710 --> 00:07:37,010 Once we have the superior 232 00:07:37,680 --> 00:07:39,650 relation, that's the transitive 233 00:07:40,120 --> 00:07:41,260 closure of the manager relation. 234 00:07:42,120 --> 00:07:42,920 So let's take a look at superior. 235 00:07:43,720 --> 00:07:44,910 Superior, here this is 236 00:07:45,590 --> 00:07:47,070 recursive with the union, says 237 00:07:47,600 --> 00:07:48,960 that if we have a 238 00:07:49,430 --> 00:07:51,200 manager and that's the MID and EID. 239 00:07:52,120 --> 00:07:54,800 Then if somebody is managing someone else then they are their superior. 240 00:07:55,640 --> 00:07:56,480 Notice by the way, I didn't 241 00:07:56,770 --> 00:07:58,010 specify a schema here, so 242 00:07:58,100 --> 00:08:00,000 the schema is implicitly going to be M-I-D-E-I-D. 243 00:08:01,650 --> 00:08:04,040 So we're going to put manager relationships in. 244 00:08:04,190 --> 00:08:05,510 And then, if we have 245 00:08:06,620 --> 00:08:08,460 a superior relationship, so if 246 00:08:08,880 --> 00:08:10,490 we have an MID managing an 247 00:08:10,870 --> 00:08:11,880 EID in the S relationship 248 00:08:12,580 --> 00:08:13,390 then we can add one more 249 00:08:13,520 --> 00:08:14,630 level, because we join 250 00:08:15,110 --> 00:08:16,630 with the managers saying that if 251 00:08:16,810 --> 00:08:17,660 S is a superior 252 00:08:18,280 --> 00:08:19,460 of Y and why is 253 00:08:19,560 --> 00:08:20,980 the manager of Z been X's 254 00:08:21,370 --> 00:08:22,960 superior of Z. This 255 00:08:23,130 --> 00:08:24,400 parallels exactly what we 256 00:08:24,510 --> 00:08:27,300 did with the ancestor computation in the previous example. 257 00:08:28,830 --> 00:08:29,580 Again, it's going to be rather 258 00:08:30,070 --> 00:08:32,550 anti-climactic to run the query, but let's do it. 259 00:08:32,830 --> 00:08:33,860 And, we find out that four 260 00:08:34,080 --> 00:08:35,490 hundred is the total salary 261 00:08:35,910 --> 00:08:37,420 cost of Project X when 262 00:08:37,540 --> 00:08:39,200 we count the manager of project 263 00:08:39,770 --> 00:08:41,440 X together with all of 264 00:08:41,520 --> 00:08:43,800 the people underneath that manager in the hierarchical structure. 265 00:08:45,280 --> 00:08:46,460 So when we think of recursion we 266 00:08:46,600 --> 00:08:48,070 often think of transitive closure 267 00:08:48,550 --> 00:08:51,780 or expanding hierarchies as we've done with our examples so far. 268 00:08:52,450 --> 00:08:53,530 But if we step back for a 269 00:08:53,730 --> 00:08:54,830 second we can see 270 00:08:54,870 --> 00:08:55,980 that there is a quite a bit simpler 271 00:08:56,250 --> 00:08:57,390 way to express the query 272 00:08:57,690 --> 00:08:58,830 that finds the salary burden 273 00:08:59,560 --> 00:09:01,860 of project X. Now 274 00:09:01,980 --> 00:09:03,500 not only is this actually nicer 275 00:09:03,810 --> 00:09:05,340 to look at, it's probably much 276 00:09:05,680 --> 00:09:06,780 more efficient, depending on how 277 00:09:07,140 --> 00:09:08,380 smart the query processor is. 278 00:09:08,860 --> 00:09:10,570 In our previous example, if 279 00:09:10,710 --> 00:09:12,700 the query processor executes the 280 00:09:12,800 --> 00:09:13,780 query in a straight forward way, 281 00:09:14,350 --> 00:09:15,650 it would compute this superior relationship 282 00:09:16,300 --> 00:09:18,320 for the absolute entire company hierarchy 283 00:09:19,290 --> 00:09:20,780 before it figured out 284 00:09:20,970 --> 00:09:22,390 which of those people were involved 285 00:09:22,750 --> 00:09:23,990 in project X. Now, a 286 00:09:24,140 --> 00:09:26,040 really good query processor might actually 287 00:09:26,730 --> 00:09:27,750 figure out to fold in 288 00:09:28,070 --> 00:09:30,000 a project X but, not necessarily. 289 00:09:30,940 --> 00:09:33,950 Here's an example and here's a new formulation of the query. 290 00:09:34,600 --> 00:09:35,510 We're actually going to tie 291 00:09:36,070 --> 00:09:37,790 X specifically to our recursion. 292 00:09:38,800 --> 00:09:40,060 What we're going to compute in 293 00:09:40,220 --> 00:09:41,990 our recursive [xx] statement here. 294 00:09:42,190 --> 00:09:44,100 So this is the temporary relation 295 00:09:44,570 --> 00:09:45,860 we're computing is a relation 296 00:09:46,590 --> 00:09:47,940 containing just a list of 297 00:09:48,220 --> 00:09:49,290 the IDs of the employees 298 00:09:49,800 --> 00:09:51,150 who are involved in project X. 299 00:09:52,130 --> 00:09:53,250 Once we have all the employees 300 00:09:53,520 --> 00:09:56,070 involved in project X, the query down here is trivial. 301 00:09:56,850 --> 00:09:58,450 We just find those employees who 302 00:09:58,600 --> 00:10:01,520 are among the X employees and we sum up their salaries. 303 00:10:02,610 --> 00:10:03,320 So, let's take a look at 304 00:10:03,380 --> 00:10:04,710 the recursive definition here and 305 00:10:05,020 --> 00:10:06,090 again, it's taking the usual 306 00:10:06,390 --> 00:10:07,410 form of a base query, 307 00:10:08,040 --> 00:10:10,550 union and recursive query... and here's what we do. 308 00:10:11,110 --> 00:10:13,100 Well, obviously the manager of 309 00:10:13,330 --> 00:10:15,010 project X is one 310 00:10:15,560 --> 00:10:16,540 of the IDs involved in project 311 00:10:17,030 --> 00:10:18,590 X. So here we find in 312 00:10:18,800 --> 00:10:20,100 the project, the project name text 313 00:10:20,410 --> 00:10:21,440 and we take the manager of that 314 00:10:21,600 --> 00:10:23,870 project and we put that person's ID into Xemps. 315 00:10:24,120 --> 00:10:27,220 That's the first ID that's going to go in there. 316 00:10:27,360 --> 00:10:28,410 That's going to seed the recursion. 317 00:10:29,070 --> 00:10:29,930 That's again the base query. 318 00:10:31,050 --> 00:10:32,270 Then we add in our 319 00:10:32,430 --> 00:10:34,130 recursive step any employee 320 00:10:34,770 --> 00:10:36,280 who is managed by anybody 321 00:10:36,960 --> 00:10:37,950 who's in the X employees. 322 00:10:38,890 --> 00:10:39,980 So, we'll take our manager relationship, 323 00:10:40,580 --> 00:10:42,650 our X employees relationship and 324 00:10:42,760 --> 00:10:45,460 if the employee's manager is an 325 00:10:45,610 --> 00:10:46,710 X, then that employee is 326 00:10:46,840 --> 00:10:48,310 also involved in X. So 327 00:10:48,590 --> 00:10:49,970 we seed the recursion with 328 00:10:50,590 --> 00:10:51,770 the manager of project X 329 00:10:52,110 --> 00:10:53,540 and then we just recursively go 330 00:10:53,810 --> 00:10:55,140 down the tree adding all 331 00:10:55,370 --> 00:10:57,590 of the employees that are underneath one by one. 332 00:10:58,300 --> 00:10:59,100 We don't have to know the depth 333 00:10:59,380 --> 00:11:00,720 of the tree because the recursion will 334 00:11:00,830 --> 00:11:02,780 continue until nobody else is added. 335 00:11:03,130 --> 00:11:05,150 I guess I should have mentioned that earlier in my earlier examples. 336 00:11:06,060 --> 00:11:07,760 Again the recursion sort of 337 00:11:08,300 --> 00:11:09,920 adds a data over and 338 00:11:10,110 --> 00:11:11,190 over again until there's nothing 339 00:11:11,630 --> 00:11:13,270 new to add, and that's when it terminates. 340 00:11:14,170 --> 00:11:15,720 So let's go ahead and run the query. 341 00:11:16,320 --> 00:11:17,690 Anti-climatic again, but we 342 00:11:17,810 --> 00:11:19,540 get the same answer 343 00:11:19,930 --> 00:11:20,740 400 as the salary cost of Project 344 00:11:21,130 --> 00:11:22,520 X. Now we use 345 00:11:22,680 --> 00:11:23,910 the same form of query to 346 00:11:23,940 --> 00:11:25,370 find the total salary cost of 347 00:11:25,470 --> 00:11:26,990 two projects Y and Z. 348 00:11:27,460 --> 00:11:29,110 And that will also demonstrate having two 349 00:11:29,710 --> 00:11:32,390 relations that are defined in the width recursive command. 350 00:11:33,210 --> 00:11:35,140 So I have added project Y 351 00:11:35,270 --> 00:11:36,550 and Z to our project 352 00:11:36,850 --> 00:11:37,960 table, and they're both 353 00:11:38,230 --> 00:11:40,180 managed by employees who 354 00:11:40,260 --> 00:11:42,800 are already in the database, so they're a little lower down the hierarchy. 355 00:11:43,100 --> 00:11:45,590 We should expect those projects have lower total cost. 356 00:11:45,970 --> 00:11:48,390 That's for project X, whose manager was at the root of our hierarchy. 357 00:11:49,650 --> 00:11:51,020 So here's our query, it's a 358 00:11:51,160 --> 00:11:52,330 big one; we're going to 359 00:11:52,440 --> 00:11:54,860 define YM and ZM 360 00:11:55,060 --> 00:11:57,200 exactly as we defined X amps in the previous example. 361 00:11:57,800 --> 00:11:58,740 So Y amps is a table 362 00:11:59,330 --> 00:12:00,730 of a recursively defined relation, 363 00:12:01,160 --> 00:12:02,400 temporary, that's gonna contain 364 00:12:02,830 --> 00:12:04,060 a list of IDs of 365 00:12:04,410 --> 00:12:05,580 the people, the employees that are 366 00:12:05,630 --> 00:12:07,410 involved in Project Y. So 367 00:12:07,550 --> 00:12:08,830 we are going to put the manager of 368 00:12:09,120 --> 00:12:10,290 Project Y as our base 369 00:12:10,640 --> 00:12:11,820 query, and then we're 370 00:12:11,990 --> 00:12:13,480 going to add to it in 371 00:12:13,780 --> 00:12:15,080 the recursion all of the 372 00:12:15,240 --> 00:12:16,710 employees who are managed by 373 00:12:16,910 --> 00:12:18,270 someone who's in the YMs. 374 00:12:19,310 --> 00:12:21,030 And ZM's exactly the same. 375 00:12:21,250 --> 00:12:22,450 We start the manager of project 376 00:12:22,760 --> 00:12:23,830 Z, and then we add 377 00:12:24,320 --> 00:12:25,210 to it, all of the people 378 00:12:25,540 --> 00:12:27,920 are managed transitively down the 379 00:12:27,980 --> 00:12:30,760 tree by someone who's in the ZM's relation. 380 00:12:31,990 --> 00:12:33,770 And then our final query down 381 00:12:34,060 --> 00:12:36,550 here for the statement is a union of two queries. 382 00:12:36,860 --> 00:12:37,710 The first one gets the total 383 00:12:38,130 --> 00:12:39,650 salary for Y and 384 00:12:39,790 --> 00:12:41,190 it labels it as Y total. 385 00:12:41,790 --> 00:12:43,960 So it takes all the Ids that 386 00:12:44,060 --> 00:12:45,860 are in the Y table and from 387 00:12:46,060 --> 00:12:47,780 the employee table get their salaries and sums them up. 388 00:12:47,880 --> 00:12:49,900 And similarly the Z total. 389 00:12:50,700 --> 00:12:51,740 So now we'll run this query, 390 00:12:52,250 --> 00:12:54,150 it will be slightly less is anti-climactic. 391 00:12:55,040 --> 00:12:57,050 We do have now two tuples in our result. 392 00:12:57,870 --> 00:12:58,760 We see that the total salary 393 00:12:59,220 --> 00:13:00,600 for Y is 300 in 394 00:13:00,820 --> 00:13:02,540 the total salaries for Z is 70. 395 00:13:02,730 --> 00:13:04,370 And if you check, cross-check 396 00:13:04,960 --> 00:13:06,110 this result against the data 397 00:13:06,410 --> 00:13:07,430 you'll see that these are 398 00:13:07,540 --> 00:13:08,900 indeed the total salaries 399 00:13:09,770 --> 00:13:10,860 when we take the managers 400 00:13:11,140 --> 00:13:12,410 we specified for projects Y 401 00:13:12,670 --> 00:13:15,490 and Z. And finally our last and most fun example. 402 00:13:16,130 --> 00:13:17,270 The one to find how to 403 00:13:17,370 --> 00:13:18,580 fly from point A to 404 00:13:18,670 --> 00:13:20,000 point be when all we're 405 00:13:20,170 --> 00:13:21,650 concerned about is cost, and 406 00:13:21,710 --> 00:13:23,290 we don't care how many times we have to change planes. 407 00:13:24,440 --> 00:13:25,860 So, here's the little flights table 408 00:13:26,250 --> 00:13:27,770 I've set up, and I 409 00:13:27,940 --> 00:13:28,880 used A and B so 410 00:13:28,990 --> 00:13:30,130 we can literally fly from point 411 00:13:30,360 --> 00:13:31,770 A to point B. All of 412 00:13:31,840 --> 00:13:33,230 our intermediate destinations are actually 413 00:13:33,720 --> 00:13:35,510 real airport codes, and I've 414 00:13:36,050 --> 00:13:37,180 put in some airlines, although they're 415 00:13:37,340 --> 00:13:38,680 not actually going to be used in our query. 416 00:13:39,070 --> 00:13:40,720 And then I've put in the cost of the flights. 417 00:13:41,280 --> 00:13:42,780 You might want to draw 418 00:13:42,930 --> 00:13:44,460 yourself a little graph so 419 00:13:44,610 --> 00:13:45,590 you can see what's going on, 420 00:13:45,590 --> 00:13:46,700 and we can fly from 421 00:13:47,080 --> 00:13:48,200 A to Chicago for 200, 422 00:13:48,540 --> 00:13:50,290 from Chicago to B 423 00:13:50,670 --> 00:13:51,970 for another 100, or we can 424 00:13:52,100 --> 00:13:54,880 go from A to Phoenix and then Phoenix to Las Vegas. 425 00:13:55,190 --> 00:13:57,850 to Las Vegas to oh-oh, I don't remember what this is. 426 00:13:58,280 --> 00:14:01,250 CMH, Detroit, Cincinnati, somewhere in the midwest. 427 00:14:02,260 --> 00:14:03,340 And from there to point 428 00:14:03,600 --> 00:14:04,670 B, or we can take 429 00:14:04,710 --> 00:14:05,870 a non-stop from A to 430 00:14:05,970 --> 00:14:07,370 B on good old Jet Blue for 195. 431 00:14:07,430 --> 00:14:09,220 So clearly we're never 432 00:14:09,490 --> 00:14:10,820 going to be going through Chicago for 433 00:14:10,970 --> 00:14:13,130 a total of 300 with that Jet Blue flight. 434 00:14:13,520 --> 00:14:14,720 but I've set up the 435 00:14:14,790 --> 00:14:16,140 data, as you're probably not 436 00:14:16,350 --> 00:14:17,710 surprised, so that this 437 00:14:18,150 --> 00:14:19,700 long route through Phoenix 438 00:14:20,080 --> 00:14:21,390 and Las Vegas and somewhere in 439 00:14:21,430 --> 00:14:23,690 the midwest is, in fact, gonna be our cheapest way to go. 440 00:14:24,370 --> 00:14:26,180 So now let's take 441 00:14:26,230 --> 00:14:27,250 a look at the recursive query that's 442 00:14:27,510 --> 00:14:28,690 going to find us our 443 00:14:29,010 --> 00:14:30,060 root from point A to 444 00:14:30,140 --> 00:14:30,910 point B, or at least 445 00:14:31,090 --> 00:14:32,330 find us the cheapest way to 446 00:14:32,410 --> 00:14:33,320 get from point A to point 447 00:14:33,620 --> 00:14:34,770 B. So the first 448 00:14:35,030 --> 00:14:36,020 query I'm going to show 449 00:14:36,990 --> 00:14:37,980 actually gives us all the different 450 00:14:38,280 --> 00:14:39,490 costs of getting from A 451 00:14:39,640 --> 00:14:40,610 to B, just so we can see 452 00:14:40,760 --> 00:14:42,040 those enumerated for us, and 453 00:14:42,220 --> 00:14:44,280 then we'll modify the query to give us the cheapest cost. 454 00:14:45,200 --> 00:14:46,800 So here's the recursive query and 455 00:14:47,400 --> 00:14:48,790 we're going to use 456 00:14:48,940 --> 00:14:50,430 a recursively defined relation called root. 457 00:14:51,200 --> 00:14:52,210 And root says that we 458 00:14:52,380 --> 00:14:53,790 can get from an origin to 459 00:14:54,000 --> 00:14:56,370 a destination for a particular total cost. 460 00:14:57,170 --> 00:14:57,170 OK? 461 00:14:57,980 --> 00:14:59,260 So, we again in 462 00:14:59,340 --> 00:15:01,760 our recursion have the base query and the recursive query. 463 00:15:02,630 --> 00:15:04,210 This is exactly what you'd imagine. 464 00:15:05,090 --> 00:15:06,350 We can certainly get from 465 00:15:07,250 --> 00:15:08,250 point X to point Y 466 00:15:08,680 --> 00:15:09,900 for a total cost, if we 467 00:15:10,000 --> 00:15:11,280 can take a direct flight from 468 00:15:11,480 --> 00:15:13,540 point X to point Y for a given cost. 469 00:15:14,290 --> 00:15:15,700 So that's our base query, 470 00:15:16,070 --> 00:15:17,080 we start out with all 471 00:15:17,390 --> 00:15:20,250 of the direct flights in our route relation. 472 00:15:21,300 --> 00:15:22,690 And then we start adding routes 473 00:15:23,110 --> 00:15:26,440 by doing the JOIN of a route with a additional flight. 474 00:15:27,160 --> 00:15:28,530 So basically what this 475 00:15:29,210 --> 00:15:30,750 join here says, "If I 476 00:15:31,000 --> 00:15:32,960 can get from the origin 477 00:15:33,510 --> 00:15:35,390 in a route to the destination and 478 00:15:35,490 --> 00:15:36,880 then that destination is the 479 00:15:37,040 --> 00:15:38,490 origin of another flight, then 480 00:15:38,690 --> 00:15:40,040 I can add that flight. 481 00:15:40,240 --> 00:15:41,400 I can start with my original origin, 482 00:15:42,210 --> 00:15:43,640 final destinationand the cost 483 00:15:44,060 --> 00:15:46,590 of that is going to be the total that I already had. 484 00:15:47,670 --> 00:15:50,610 Plus the cost of the new flight and that's my new total. 485 00:15:51,290 --> 00:15:52,940 So again, this is another 486 00:15:53,350 --> 00:15:55,090 transitive closer like recursion. 487 00:15:55,670 --> 00:15:57,810 It's very similar to the ancestor recursion. 488 00:15:58,750 --> 00:16:01,640 Very similar to expanding the company hierarchy. 489 00:16:01,960 --> 00:16:03,400 The only real difference here 490 00:16:03,590 --> 00:16:06,560 is that we're also accumulating these costs as we do the recursion. 491 00:16:07,540 --> 00:16:08,810 So once we've done this recursion 492 00:16:09,020 --> 00:16:10,550 then we have a 493 00:16:10,910 --> 00:16:12,970 complete specification of all 494 00:16:13,350 --> 00:16:15,380 the roots within our 495 00:16:15,760 --> 00:16:16,710 flights database, all of the 496 00:16:17,800 --> 00:16:18,640 way's we can get from 497 00:16:18,880 --> 00:16:20,400 A to B and the total cost. 498 00:16:20,750 --> 00:16:21,770 Now one thing I should say 499 00:16:22,040 --> 00:16:23,280 is this is not actually giving 500 00:16:23,580 --> 00:16:25,890 us the ways of getting from one place to another. 501 00:16:26,710 --> 00:16:28,100 If we wanted to accumulate 502 00:16:28,620 --> 00:16:30,090 the actual route that we 503 00:16:30,210 --> 00:16:31,650 take so the flights and 504 00:16:31,860 --> 00:16:33,130 the costs and the airlines and 505 00:16:33,270 --> 00:16:34,510 so on, we have to 506 00:16:34,740 --> 00:16:37,210 kind of use a structured structure 507 00:16:37,760 --> 00:16:39,500 inside our data base to accumulate those. 508 00:16:39,650 --> 00:16:41,190 There are ways of doing that 509 00:16:41,330 --> 00:16:42,690 but I'm not going to demonstrate that here. 510 00:16:43,040 --> 00:16:44,270 I'm just going to demonstrate the is 511 00:16:44,360 --> 00:16:46,740 accused of recursion and computing the total cost. 512 00:16:48,190 --> 00:16:49,290 OK, so let's go ahead 513 00:16:49,810 --> 00:16:50,990 and run this query so we've 514 00:16:51,340 --> 00:16:52,490 computed all of the 515 00:16:52,550 --> 00:16:53,470 routes, and then we're just 516 00:16:53,770 --> 00:16:55,080 gonna start by finding the 517 00:16:55,190 --> 00:16:56,130 routes from A to B 518 00:16:56,370 --> 00:16:58,480 and what the total cost of those are. 519 00:16:58,510 --> 00:16:59,730 So we'll run the query and we'll 520 00:17:00,060 --> 00:17:01,110 find out that there are three 521 00:17:01,370 --> 00:17:02,680 ways of getting from A 522 00:17:02,890 --> 00:17:04,190 to B. The first one 523 00:17:04,320 --> 00:17:06,180 happens to be that direct Jet Blue flight for 195. 524 00:17:06,370 --> 00:17:08,020 The second was the 525 00:17:08,150 --> 00:17:10,220 flight through Chicago for a total cost of 300. 526 00:17:10,350 --> 00:17:12,890 You can go back and look at the data and verify these. 527 00:17:13,650 --> 00:17:14,530 And the third one was that 528 00:17:14,620 --> 00:17:15,720 complicated routing where we 529 00:17:15,840 --> 00:17:17,190 stopped several times, but we 530 00:17:17,340 --> 00:17:18,410 save a lot of money, well 531 00:17:18,840 --> 00:17:19,900 twenty dollars over the direct 532 00:17:20,260 --> 00:17:22,210 flight, by going 533 00:17:22,610 --> 00:17:24,490 through those cities because the total sub-cost is 175. 534 00:17:25,130 --> 00:17:26,250 I'll leave it up to 535 00:17:26,380 --> 00:17:27,720 you whether it's worth twenty dollars 536 00:17:28,220 --> 00:17:30,740 to stop several times versus the direct flight. 537 00:17:31,760 --> 00:17:33,530 So now since my actual specification 538 00:17:34,350 --> 00:17:35,200 of what I wanted to know was 539 00:17:35,400 --> 00:17:36,630 the cheapest way to go then 540 00:17:36,810 --> 00:17:38,420 I just say min total instead 541 00:17:38,780 --> 00:17:40,070 of in my final 542 00:17:40,520 --> 00:17:42,720 query and I run 543 00:17:42,910 --> 00:17:43,980 that and my answer is 544 00:17:44,070 --> 00:17:44,850 that 175 is the cheapest way to get 545 00:17:44,990 --> 00:17:46,440 from A to B. Now here 546 00:17:46,740 --> 00:17:48,440 is an alternative formulation of the 547 00:17:48,510 --> 00:17:50,270 same query That essentially parallels 548 00:17:50,990 --> 00:17:52,410 the alternative that we looked 549 00:17:52,830 --> 00:17:54,770 at with our project cross where 550 00:17:54,940 --> 00:17:56,350 we built in project X 551 00:17:56,710 --> 00:17:59,020 into our recursion that simplified 552 00:17:59,610 --> 00:18:00,700 the recursion in that case. 553 00:18:01,350 --> 00:18:04,060 In this case it's not simpler, but it could potentially be more efficient. 554 00:18:04,670 --> 00:18:05,510 We're going to build in the 555 00:18:05,680 --> 00:18:07,400 fact that we're starting from origin 556 00:18:07,960 --> 00:18:09,310 A so instead of 557 00:18:09,560 --> 00:18:11,010 finding all roots from any 558 00:18:11,220 --> 00:18:12,230 point to any other point 559 00:18:12,490 --> 00:18:13,970 in our recursion and then finding 560 00:18:14,440 --> 00:18:15,860 the roots from A to 561 00:18:15,950 --> 00:18:17,550 B, let's create a 562 00:18:17,600 --> 00:18:19,050 relation recursively that says, 563 00:18:19,480 --> 00:18:21,100 "Starting from point A, 564 00:18:21,830 --> 00:18:22,590 I can get to a particular 565 00:18:23,260 --> 00:18:25,150 destination for a particular total cost. 566 00:18:25,960 --> 00:18:26,850 So this is going to build 567 00:18:27,310 --> 00:18:28,940 up roots starting from A 568 00:18:29,590 --> 00:18:30,710 the base query is going 569 00:18:31,030 --> 00:18:32,170 to, of course, start by looking 570 00:18:32,470 --> 00:18:34,130 at direct flights where the 571 00:18:34,330 --> 00:18:35,610 origin is A and is 572 00:18:35,730 --> 00:18:37,050 going to put the destination and the 573 00:18:37,260 --> 00:18:39,230 cost into our relation called 574 00:18:39,640 --> 00:18:40,950 from A. So that starts 575 00:18:41,320 --> 00:18:42,390 with that first gives us 576 00:18:42,580 --> 00:18:43,880 direct start from A where 577 00:18:43,950 --> 00:18:45,120 we can get on the direct, where 578 00:18:45,400 --> 00:18:46,550 we can get to, and how 579 00:18:46,740 --> 00:18:47,740 much it will cost us, and 580 00:18:47,950 --> 00:18:49,350 then our recursion is going 581 00:18:49,650 --> 00:18:51,300 to add flights to that one. 582 00:18:51,670 --> 00:18:52,970 Again, it really parallels what 583 00:18:53,120 --> 00:18:54,520 we did with the Project X. 584 00:18:55,220 --> 00:18:56,330 Our recursion is going to 585 00:18:56,430 --> 00:18:57,610 say, ok we know we 586 00:18:57,760 --> 00:19:00,140 can get a, from particular, 587 00:19:01,260 --> 00:19:02,350 we can get to a particular 588 00:19:02,620 --> 00:19:04,980 place from point A for a certain cost and that's our destination. 589 00:19:06,050 --> 00:19:07,580 If we add 1 more flight so 590 00:19:07,850 --> 00:19:09,000 the origin of that flight 591 00:19:09,370 --> 00:19:10,550 is the destination of where 592 00:19:10,590 --> 00:19:12,040 we can get then that will 593 00:19:12,260 --> 00:19:13,890 also be a destination that 594 00:19:14,060 --> 00:19:15,210 we can get to from point 595 00:19:15,460 --> 00:19:16,700 A and we'll just add the 596 00:19:16,930 --> 00:19:18,540 cost of the additional flight on. 597 00:19:19,360 --> 00:19:21,210 One more time a 598 00:19:21,300 --> 00:19:23,770 strong suggestion that you download and try these things for yourself. 599 00:19:24,870 --> 00:19:26,500 Once we found all the 600 00:19:26,690 --> 00:19:28,090 places we can get from A 601 00:19:28,850 --> 00:19:30,200 then we'll use that to 602 00:19:30,370 --> 00:19:31,450 figure out the cheapest way to 603 00:19:31,620 --> 00:19:32,730 get to Point B. But let's 604 00:19:32,970 --> 00:19:35,010 just start by running the 605 00:19:35,090 --> 00:19:36,010 With statement where all we 606 00:19:36,140 --> 00:19:37,190 do is see the places we can 607 00:19:37,320 --> 00:19:39,570 get to from A and the total cost of getting there. 608 00:19:40,190 --> 00:19:40,700 So here we go. 609 00:19:41,400 --> 00:19:42,480 And we can get to Chicago, 610 00:19:43,180 --> 00:19:44,300 Phoenix or we can get 611 00:19:44,410 --> 00:19:45,490 to B a couple of different 612 00:19:45,760 --> 00:19:46,950 ways, three different ways 613 00:19:47,210 --> 00:19:48,790 actually as we already know. 614 00:19:49,090 --> 00:19:50,070 We can also get to 615 00:19:50,180 --> 00:19:51,170 Las Vegas and this mysterious 616 00:19:51,960 --> 00:19:53,480 CMH I wish I remembered what it were. 617 00:19:54,400 --> 00:19:56,000 So now if we're interested in 618 00:19:56,360 --> 00:19:57,450 finding the cheapest way to get 619 00:19:57,550 --> 00:19:59,200 from A to B then We'll 620 00:19:59,370 --> 00:20:01,390 add where the destination equals 621 00:20:01,870 --> 00:20:03,860 B on here and we'll 622 00:20:04,380 --> 00:20:06,260 add the minimum total cost 623 00:20:06,760 --> 00:20:08,100 and hopefully that will 624 00:20:08,310 --> 00:20:11,020 be our good old 175 and indeed it is. 625 00:20:12,180 --> 00:20:14,880 By the way we can do the same basic idea but backwards. 626 00:20:15,850 --> 00:20:16,970 Instead of finding all the 627 00:20:17,130 --> 00:20:18,310 places that we can get from 628 00:20:18,890 --> 00:20:20,310 city A how about if 629 00:20:20,470 --> 00:20:21,720 we find all the places from 630 00:20:22,170 --> 00:20:23,250 which we can get to city 631 00:20:23,520 --> 00:20:25,980 B. So here's the query that does that. 632 00:20:26,760 --> 00:20:28,180 To B is going to 633 00:20:28,280 --> 00:20:30,110 be our recursively define relation that's 634 00:20:30,330 --> 00:20:31,250 going to give us the origin, 635 00:20:31,830 --> 00:20:33,010 the place from which we 636 00:20:33,150 --> 00:20:34,170 can get to B and the total 637 00:20:34,530 --> 00:20:36,670 cost of getting to B from that place. 638 00:20:37,600 --> 00:20:40,030 So again, the structure is exactly parallel. 639 00:20:40,440 --> 00:20:41,250 We start out with our base 640 00:20:41,530 --> 00:20:42,450 query saying if we have 641 00:20:42,860 --> 00:20:43,810 a direct flight to B 642 00:20:44,500 --> 00:20:45,640 then we can get from the 643 00:20:45,730 --> 00:20:46,740 origin of that direct flight 644 00:20:47,230 --> 00:20:48,090 at the cost of the flight to 645 00:20:48,290 --> 00:20:49,290 B and we then 646 00:20:49,440 --> 00:20:51,670 recursively add flights on 647 00:20:51,980 --> 00:20:52,750 that you can think of if your 648 00:20:53,130 --> 00:20:54,130 going from left to right 649 00:20:54,820 --> 00:20:56,070 adding flights from the left. 650 00:20:56,630 --> 00:20:57,850 So if we know we 651 00:20:58,080 --> 00:20:59,090 can get from a place to B 652 00:20:59,900 --> 00:21:01,430 and then we can go from...take 653 00:21:02,390 --> 00:21:03,450 a direct flight from somewhere 654 00:21:03,710 --> 00:21:06,670 else to that place And we can get from that somewhere else to be. 655 00:21:07,800 --> 00:21:08,990 Anyway so we do that 656 00:21:09,290 --> 00:21:10,400 again by joining so we're 657 00:21:10,640 --> 00:21:13,080 going to take our origin 658 00:21:13,600 --> 00:21:14,390 from which we can get to B 659 00:21:15,000 --> 00:21:16,070 we're going to find flight that take 660 00:21:16,350 --> 00:21:17,510 us to that origin, we're 661 00:21:17,640 --> 00:21:18,560 going to add the cost of that 662 00:21:18,710 --> 00:21:19,490 flight and that gives us 663 00:21:19,570 --> 00:21:20,680 a new way to get 664 00:21:20,950 --> 00:21:22,390 to B. And then let 665 00:21:22,530 --> 00:21:23,680 me start by just writing the 666 00:21:23,810 --> 00:21:25,630 query that all of 667 00:21:25,710 --> 00:21:28,110 the places from which we can get to B and the cost of getting there. 668 00:21:28,520 --> 00:21:29,050 We'll run the query. 669 00:21:29,750 --> 00:21:30,750 And we can see that we 670 00:21:30,870 --> 00:21:32,150 can get to B from point 671 00:21:32,420 --> 00:21:33,960 A in 3 different ways and 672 00:21:34,160 --> 00:21:36,380 from our other cities in our database as well. 673 00:21:37,080 --> 00:21:38,060 Similarly, to what we did 674 00:21:38,230 --> 00:21:39,440 previously, if we're particularly 675 00:21:40,180 --> 00:21:41,730 interested in getting from A 676 00:21:41,880 --> 00:21:43,050 to B, whoops, let's make that 677 00:21:43,180 --> 00:21:45,220 our origin then we add 678 00:21:45,630 --> 00:21:47,450 where origin equals a and 679 00:21:47,820 --> 00:21:49,180 if we want the minimum it would 680 00:21:49,330 --> 00:21:50,740 be our minimum total again paralleling 681 00:21:51,480 --> 00:21:52,250 exactly what we did before. 682 00:21:53,010 --> 00:21:55,400 We run it and good old 175 comes out. 683 00:21:56,420 --> 00:21:57,280 Now we're going to have 684 00:21:57,410 --> 00:21:58,600 some real fun because I 685 00:21:58,860 --> 00:22:00,130 added another flight to our 686 00:22:00,260 --> 00:22:01,550 database, and this flight 687 00:22:01,880 --> 00:22:03,660 takes us from Columbus, I 688 00:22:03,860 --> 00:22:05,640 now know its Columbus, to Phoenix 689 00:22:06,960 --> 00:22:08,710 and creates a loop in our flights. 690 00:22:09,560 --> 00:22:10,540 So that means that we 691 00:22:10,920 --> 00:22:13,110 can fly from A to 692 00:22:13,780 --> 00:22:15,060 B next to Las Vegas, to 693 00:22:15,430 --> 00:22:16,750 Columbus, back to Phoenix 694 00:22:17,530 --> 00:22:18,870 and then to Las Vegas and Columbus again. 695 00:22:19,580 --> 00:22:20,500 So, we're going to have 696 00:22:20,930 --> 00:22:23,190 arbitrarily, actually unbounded, actually 697 00:22:24,070 --> 00:22:26,420 infinite, length routes that we can take now. 698 00:22:26,910 --> 00:22:28,410 Now, obviously those routes aren't 699 00:22:28,620 --> 00:22:29,610 going to ever be the cheapest 700 00:22:30,040 --> 00:22:31,180 way because as we take 701 00:22:31,450 --> 00:22:32,470 those roots it's going to 702 00:22:32,600 --> 00:22:33,410 get more and more expensive, 703 00:22:33,920 --> 00:22:36,940 none of them are negative costs paying us to take flights. 704 00:22:37,220 --> 00:22:38,470 But if we just do 705 00:22:38,730 --> 00:22:40,220 our naive recursion where we 706 00:22:40,380 --> 00:22:41,450 generate all of our roots 707 00:22:41,850 --> 00:22:43,330 before we take a 708 00:22:43,500 --> 00:22:44,930 look at our final query then 709 00:22:45,230 --> 00:22:47,490 we're going to be generating an infinite number of roots. 710 00:22:48,630 --> 00:22:49,870 So here's our original enquiry, the 711 00:22:49,990 --> 00:22:51,200 first one we wrote, where we 712 00:22:51,440 --> 00:22:53,110 were just finding all of the 713 00:22:53,850 --> 00:22:55,160 costs of getting from A 714 00:22:55,370 --> 00:22:56,850 to B by computing all of 715 00:22:56,960 --> 00:22:57,970 the roots in the entire database 716 00:22:58,460 --> 00:22:59,420 and then looking at those from 717 00:22:59,620 --> 00:23:01,210 A to B. Now with 718 00:23:01,400 --> 00:23:02,740 our additional flight that creates 719 00:23:03,230 --> 00:23:04,420 a loop we run this command 720 00:23:05,850 --> 00:23:06,290 and nothing happens. 721 00:23:07,130 --> 00:23:09,350 Actually if we wait long enough we're going to get an error. 722 00:23:10,560 --> 00:23:12,060 Well, okay we waited for a while. 723 00:23:12,920 --> 00:23:14,510 appears that the user 724 00:23:14,890 --> 00:23:16,830 interface we're using isn't going to show us the error. 725 00:23:17,310 --> 00:23:18,400 But if you try running this in 726 00:23:18,610 --> 00:23:20,190 post risk command line interface, 727 00:23:20,630 --> 00:23:21,590 I assure you if you 728 00:23:21,660 --> 00:23:23,070 wait long enough, eventually it 729 00:23:23,180 --> 00:23:25,120 will tell you that the recursion effectively overflowed. 730 00:23:26,040 --> 00:23:27,290 So it is trying to compute 731 00:23:27,660 --> 00:23:29,080 this unbounded number of routes 732 00:23:29,860 --> 00:23:30,980 in the recursive part of the 733 00:23:31,040 --> 00:23:32,420 with statement and never even 734 00:23:32,670 --> 00:23:34,130 gets to the query that we want to execute. 735 00:23:35,530 --> 00:23:37,590 OK, here's my first attempt at fixing the problem. 736 00:23:38,190 --> 00:23:39,360 We know that we're never going 737 00:23:39,600 --> 00:23:41,490 to want to take a arbitrarily long route. 738 00:23:41,610 --> 00:23:42,470 We're never going to want to go 739 00:23:42,570 --> 00:23:43,880 around a cycle lots of 740 00:23:44,020 --> 00:23:45,210 times as our cheapest way 741 00:23:45,450 --> 00:23:46,290 to get from point A to point 742 00:23:46,630 --> 00:23:48,040 B so what I've 743 00:23:48,250 --> 00:23:49,040 done here, and I'm not going 744 00:23:49,230 --> 00:23:49,990 to go into this in great 745 00:23:50,250 --> 00:23:51,060 detail, but I have added 746 00:23:51,370 --> 00:23:53,030 a condition in the recursion 747 00:23:53,890 --> 00:23:55,100 that says "I'm only going 748 00:23:55,360 --> 00:23:57,110 to add a new route, a 749 00:23:57,150 --> 00:23:58,370 new route into my recursively 750 00:23:58,870 --> 00:24:00,400 defined route table when the 751 00:24:00,890 --> 00:24:01,860 total cost of that route, and 752 00:24:02,040 --> 00:24:02,840 that's defined as the cost 753 00:24:03,370 --> 00:24:04,300 plus total here when we 754 00:24:04,400 --> 00:24:06,360 added, is less then 755 00:24:06,690 --> 00:24:07,770 all of the ways we can 756 00:24:07,990 --> 00:24:10,330 already get from that place, 757 00:24:10,650 --> 00:24:11,910 to that origin to that destination. 758 00:24:12,900 --> 00:24:13,530 So in other words, I'm only 759 00:24:13,910 --> 00:24:15,290 going to add cheaper routes 760 00:24:15,770 --> 00:24:17,010 than the ones that are already there. 761 00:24:17,760 --> 00:24:18,600 And by the way, if there 762 00:24:18,770 --> 00:24:20,860 are no routes already from 763 00:24:21,000 --> 00:24:22,150 the origin to the destination then this 764 00:24:22,310 --> 00:24:23,270 will be satisfied and we will 765 00:24:23,550 --> 00:24:26,520 add that first route, then after that only adding cheaper ones. 766 00:24:27,300 --> 00:24:29,170 So let's try running this query and see what happens. 767 00:24:30,410 --> 00:24:31,240 Well, we got an error. 768 00:24:31,420 --> 00:24:33,360 Now this is not a runtime execution error. 769 00:24:33,580 --> 00:24:34,820 This is actually an error 770 00:24:35,070 --> 00:24:36,260 that says we're not allowed 771 00:24:36,690 --> 00:24:38,000 to refer to our recursively 772 00:24:38,800 --> 00:24:40,440 defined relation in a sub 773 00:24:40,820 --> 00:24:42,430 query within our recursion. 774 00:24:43,690 --> 00:24:45,170 The SQL standard actually might 775 00:24:45,560 --> 00:24:47,810 allow this particular use, but 776 00:24:48,170 --> 00:24:50,390 I don't know that any implementation actually handles it. 777 00:24:50,470 --> 00:24:51,660 It can be fairly difficult 778 00:24:52,180 --> 00:24:53,100 to handle a case where 779 00:24:53,250 --> 00:24:54,650 you have the recursively defined 780 00:24:54,960 --> 00:24:57,540 relation in a subquery as well as in the outer query here. 781 00:24:58,210 --> 00:25:00,170 So that's obviously not going to solve our problem. 782 00:25:01,580 --> 00:25:02,850 Now there's actually a feature of 783 00:25:03,170 --> 00:25:05,220 Basic SQL that can help us here with our problem. 784 00:25:06,030 --> 00:25:06,790 There's something called Limit. 785 00:25:07,220 --> 00:25:08,360 We actually didn't discuss this 786 00:25:08,530 --> 00:25:09,530 in the SQL videos, but that 787 00:25:09,750 --> 00:25:11,660 says just give us this number of results. 788 00:25:12,870 --> 00:25:14,430 So let's say that we're 789 00:25:14,650 --> 00:25:16,200 going to have our recursion here 790 00:25:16,450 --> 00:25:17,880 for the roots, but down here 791 00:25:17,930 --> 00:25:18,920 we're going to say I only need 792 00:25:19,280 --> 00:25:21,310 up to 20 results for how 793 00:25:21,540 --> 00:25:22,590 I get from point A to 794 00:25:22,660 --> 00:25:24,020 point B. And the posary 795 00:25:24,240 --> 00:25:25,670 system actually makes use of 796 00:25:25,780 --> 00:25:26,910 the limit command in the final 797 00:25:27,180 --> 00:25:28,750 query to restrict the recursion. 798 00:25:29,350 --> 00:25:30,230 It's a nice feature, and it 799 00:25:30,360 --> 00:25:32,700 was added specifically for this 800 00:25:32,910 --> 00:25:35,100 problem of possibly infinite 801 00:25:35,380 --> 00:25:36,610 recursions where we actually don't 802 00:25:36,900 --> 00:25:39,540 want it to be infinite because we only need a finite number of answers. 803 00:25:40,410 --> 00:25:41,590 Okay, so, let's go with that 804 00:25:41,660 --> 00:25:43,780 here and we'll see that Ah, great. 805 00:25:44,510 --> 00:25:45,260 Everything worked well. 806 00:25:45,850 --> 00:25:47,030 We got our roots from A 807 00:25:47,250 --> 00:25:48,780 to B. And I do have 808 00:25:49,060 --> 00:25:50,660 20 roots, I mean, so they're getting very expensive. 809 00:25:51,260 --> 00:25:52,170 Down here I'm going to go 810 00:25:52,270 --> 00:25:53,220 around and around the mid west 811 00:25:53,630 --> 00:25:54,720 while lots and lots of 812 00:25:54,850 --> 00:25:56,420 times but that did 813 00:25:56,560 --> 00:25:57,570 the limit the recursion it did 814 00:25:57,880 --> 00:25:59,140 stop unlike our query 815 00:25:59,430 --> 00:26:01,670 where we didn't have the limit and it just went on indefinitely. 816 00:26:02,960 --> 00:26:04,550 So that looks pretty good, 817 00:26:04,950 --> 00:26:07,200 with one unfortunate problem, 818 00:26:07,710 --> 00:26:08,870 which is if we still want 819 00:26:09,120 --> 00:26:11,080 the minimum, we're going to 820 00:26:11,480 --> 00:26:14,550 again get a infinite execution. 821 00:26:15,350 --> 00:26:16,800 So the old result 822 00:26:17,160 --> 00:26:18,420 is still sitting here but now 823 00:26:18,820 --> 00:26:20,840 the system is chunking on because 824 00:26:21,460 --> 00:26:22,840 the limit here is applied to 825 00:26:22,980 --> 00:26:24,570 this min to the 826 00:26:24,650 --> 00:26:27,570 number of tupimit the recursion, 827 00:26:28,120 --> 00:26:29,990 we're always going to get only one tuple in our result. 828 00:26:30,350 --> 00:26:31,290 So even if we said limit one 829 00:26:31,610 --> 00:26:33,210 here, we'd still get the 830 00:26:33,490 --> 00:26:34,640 infinite behavior, so we haven't 831 00:26:35,270 --> 00:26:36,610 quite solved our problem. 832 00:26:37,860 --> 00:26:39,130 Okay, so, here's what we're going to do. 833 00:26:40,470 --> 00:26:41,650 Aesthetically, maybe it's not 834 00:26:41,870 --> 00:26:43,400 the absolutely best solution but 835 00:26:43,510 --> 00:26:45,380 I'm going to argue that it's a pretty reasonable one. 836 00:26:46,240 --> 00:26:48,070 We tried limiting our 837 00:26:48,330 --> 00:26:49,600 recursion to only add 838 00:26:50,160 --> 00:26:51,230 new routes that were cheaper 839 00:26:51,750 --> 00:26:54,780 than existing routes to and from the same place. 840 00:26:55,380 --> 00:26:56,270 We weren't allowed to do that 841 00:26:56,550 --> 00:26:58,380 syntactically the recursive with 842 00:26:58,630 --> 00:26:59,950 statement didn't allow the sub 843 00:27:00,190 --> 00:27:02,150 query with the recursively defined relation in it. 844 00:27:02,830 --> 00:27:03,750 So we're going to do a 845 00:27:03,980 --> 00:27:05,570 different change here where 846 00:27:05,840 --> 00:27:07,020 we're not going to add 847 00:27:07,250 --> 00:27:08,460 new roots to our flight 848 00:27:09,140 --> 00:27:10,910 when the length of the 849 00:27:11,060 --> 00:27:11,900 root, in other words the number 850 00:27:12,200 --> 00:27:13,430 of flights contributing to that 851 00:27:13,740 --> 00:27:15,380 root, is greater than 852 00:27:16,080 --> 00:27:16,920 or equal to ten. 853 00:27:17,680 --> 00:27:18,770 So how do we do that? 854 00:27:19,330 --> 00:27:20,470 We're going to add to our 855 00:27:20,600 --> 00:27:22,350 recursively defined relation route 856 00:27:23,010 --> 00:27:25,970 the origin, destination and total cost of that. 857 00:27:26,320 --> 00:27:27,660 And then we are going to add the length. 858 00:27:28,570 --> 00:27:29,450 And so that's going to put 859 00:27:29,630 --> 00:27:30,890 in each root tupple, how 860 00:27:31,310 --> 00:27:33,030 many flights were involved in the root. 861 00:27:33,750 --> 00:27:35,200 So let's see how we do that with our recursion. 862 00:27:35,690 --> 00:27:36,720 We still have the base case 863 00:27:37,090 --> 00:27:39,600 here and the recursively defined union. 864 00:27:40,520 --> 00:27:41,700 In our base case, we're going 865 00:27:41,950 --> 00:27:43,010 to be adding to our route 866 00:27:43,370 --> 00:27:45,320 the non-stop flights, so 867 00:27:45,640 --> 00:27:46,970 we'll have exactly what thought 868 00:27:47,060 --> 00:27:48,010 we had before, and then we'll 869 00:27:48,080 --> 00:27:49,540 have the constant one to say 870 00:27:49,830 --> 00:27:51,670 that this non-stop flight is just one flight. 871 00:27:52,550 --> 00:27:53,370 Then when we do our recursion, 872 00:27:54,130 --> 00:27:55,950 we're joining our route relation 873 00:27:56,390 --> 00:27:57,540 that we're building up by extending 874 00:27:58,130 --> 00:27:59,460 it with an additional flight exactly 875 00:28:00,120 --> 00:28:02,970 as before, but there is two changes here. 876 00:28:03,780 --> 00:28:04,490 One of them is that we're 877 00:28:04,590 --> 00:28:05,730 going to compute our new length 878 00:28:06,240 --> 00:28:07,450 by adding one to 879 00:28:07,560 --> 00:28:08,870 the existing length of the 880 00:28:09,130 --> 00:28:11,210 root for our new root because we're adding one flight. 881 00:28:11,930 --> 00:28:13,190 And then we're only going to 882 00:28:13,720 --> 00:28:15,040 add tupples to the root 883 00:28:15,260 --> 00:28:16,970 relation when the length of 884 00:28:17,270 --> 00:28:18,460 the route that we're adding is 885 00:28:18,630 --> 00:28:20,120 less than ten. 886 00:28:20,260 --> 00:28:21,220 So now let's see what happens. 887 00:28:21,680 --> 00:28:22,720 I'm going to start again by 888 00:28:22,940 --> 00:28:24,190 looking at all of the 889 00:28:24,280 --> 00:28:25,400 costs of getting from point 890 00:28:25,700 --> 00:28:28,520 A to point B and then we'll take to look at finding the least. 891 00:28:29,290 --> 00:28:30,450 So, we'll go ahead and execute 892 00:28:30,920 --> 00:28:32,160 the query and we see 893 00:28:32,510 --> 00:28:33,970 that we have, one, 894 00:28:34,190 --> 00:28:35,530 two, three, four, five ways of 895 00:28:35,720 --> 00:28:36,610 getting from A to B, 896 00:28:36,800 --> 00:28:38,380 where the length of the number 897 00:28:38,650 --> 00:28:40,770 of flights involved is less than or equal to 10. 898 00:28:41,480 --> 00:28:43,100 We see our friends here. 899 00:28:43,400 --> 00:28:44,700 This was the nonstop flight. 900 00:28:45,060 --> 00:28:45,760 This was the one through Boston. 901 00:28:46,660 --> 00:28:47,910 Here's our favorite one and 902 00:28:48,050 --> 00:28:49,100 there's a few more so these 903 00:28:49,340 --> 00:28:50,450 are going to go through that 904 00:28:50,820 --> 00:28:52,120 cycle a couple of times. 905 00:28:52,730 --> 00:28:53,810 But once we get to 906 00:28:53,850 --> 00:28:54,790 the length of ten, we're not going 907 00:28:55,030 --> 00:28:56,070 to add any more, so we've 908 00:28:56,350 --> 00:28:58,130 got termination and if we 909 00:28:58,210 --> 00:28:59,040 want to change it to 910 00:28:59,220 --> 00:29:00,720 find the minimum cost flight, 911 00:29:01,130 --> 00:29:02,030 then it's just the min total 912 00:29:02,440 --> 00:29:05,000 as before, and we'll find good old 175. 913 00:29:05,870 --> 00:29:07,750 Now what 's unaesthetic 914 00:29:08,000 --> 00:29:08,930 about this, is that we're 915 00:29:09,230 --> 00:29:11,320 actually limiting the amount of recursion. 916 00:29:11,910 --> 00:29:13,160 So, the whole point of writing 917 00:29:13,430 --> 00:29:14,650 recursive queries is when we 918 00:29:14,860 --> 00:29:16,870 don't know the minimum 919 00:29:17,270 --> 00:29:19,520 number of computations that we need to do to get our answer. 920 00:29:20,270 --> 00:29:21,530 So maybe it would so 921 00:29:21,760 --> 00:29:24,000 happen to turn out that 922 00:29:24,140 --> 00:29:25,300 more than ten flights were required 923 00:29:25,860 --> 00:29:27,120 to get the cheapest and, if 924 00:29:27,400 --> 00:29:29,100 that was the case, then we wouldn't get our right answer. 925 00:29:29,880 --> 00:29:30,760 Of course, we could change it to 926 00:29:31,300 --> 00:29:32,310 a hundred and we'd still 927 00:29:32,590 --> 00:29:33,830 get the one seventy five. 928 00:29:34,520 --> 00:29:35,490 And, you know, honestly we could 929 00:29:35,820 --> 00:29:37,060 change it to 10,000 and you 930 00:29:37,340 --> 00:29:38,560 can't see it happening here 931 00:29:38,670 --> 00:29:40,490 but it is actually recomputing that 932 00:29:40,680 --> 00:29:42,330 175 even when I put in 10,000. 933 00:29:42,450 --> 00:29:44,800 I can even do a 934 00:29:44,820 --> 00:29:47,100 100,000 and still going to work for me. 935 00:29:47,600 --> 00:29:49,230 So, if we 936 00:29:49,390 --> 00:29:50,900 presume that nobody wants to 937 00:29:51,000 --> 00:29:52,380 take more than a 100,000 938 00:29:52,560 --> 00:29:54,470 flights in order to get 939 00:29:54,620 --> 00:29:55,580 from Point A to Point B 940 00:29:55,600 --> 00:29:56,740 in the cheapest fashion then, this 941 00:29:57,020 --> 00:29:58,370 would be a reasonable way to 942 00:29:58,580 --> 00:30:00,890 bound the recursion and get the answer that we want. 943 00:30:01,220 --> 00:30:03,240 Even in the presence of cycles in our relation.