1 00:00:00,570 --> 00:00:03,330 In this video, we'll show how recursion has been added to the SQL language. 2 00:00:04,310 --> 00:00:06,910 We'll show the WITH statement which is a regular part of SQL. 3 00:00:07,260 --> 00:00:10,020 And then we'll show how WITH can be used to write recursive queries. 4 00:00:11,060 --> 00:00:12,270 We'll describe a few examples 5 00:00:12,880 --> 00:00:14,150 that can't be written without recursion 6 00:00:14,850 --> 00:00:15,990 in SQL, and then in a 7 00:00:16,270 --> 00:00:17,120 follow-up video, we'll give a 8 00:00:17,180 --> 00:00:18,880 demo that will show the examples in action. 9 00:00:20,650 --> 00:00:22,260 So SQL is not what is 10 00:00:22,420 --> 00:00:23,980 known as a turing complete language. 11 00:00:24,710 --> 00:00:25,990 For those of you who are 12 00:00:26,320 --> 00:00:27,610 familiar with the idea of 13 00:00:27,750 --> 00:00:29,270 turing completeness, it says 14 00:00:29,570 --> 00:00:30,900 that pretty much any computation 15 00:00:31,330 --> 00:00:32,440 can be performed by a language, 16 00:00:32,660 --> 00:00:34,020 and that's simply not true in SQL. 17 00:00:34,890 --> 00:00:35,910 So SQL has some nice features, 18 00:00:36,420 --> 00:00:38,270 it's simple, convenient, declarative--meaning 19 00:00:39,590 --> 00:00:41,010 we don't have to say how 20 00:00:41,150 --> 00:00:43,300 to execute queries, just what we want out of the queries. 21 00:00:44,020 --> 00:00:45,840 We've talked about that many times throughout these videos. 22 00:00:46,780 --> 00:00:47,900 We find that SQL 23 00:00:48,240 --> 00:00:49,640 is expressive enough for most 24 00:00:50,170 --> 00:00:51,580 all database queries we want 25 00:00:51,880 --> 00:00:53,730 to do, except for one type. 26 00:00:54,420 --> 00:00:56,830 And that's the type that involves unbounded computations. 27 00:00:57,930 --> 00:00:59,400 The basic SQL language does 28 00:00:59,770 --> 00:01:01,840 not have features that allow us to do those. 29 00:01:02,160 --> 00:01:03,530 And, I'll motivate those with a few examples. 30 00:01:04,240 --> 00:01:05,210 I'll just say up front, that 31 00:01:05,390 --> 00:01:08,550 when unbounded computations need to be performed, use a database. 32 00:01:09,320 --> 00:01:10,880 Typically, there's some programming in 33 00:01:11,030 --> 00:01:12,290 a programming language, that will 34 00:01:12,540 --> 00:01:14,910 be accessing the database over and over, to do those computations. 35 00:01:15,830 --> 00:01:16,780 But we're also going to see 36 00:01:17,080 --> 00:01:18,730 that SQL has added a notion 37 00:01:19,090 --> 00:01:21,180 of recursion that allows us to perform unbounded computations. 38 00:01:22,980 --> 00:01:24,080 So, in each of my examples I'm 39 00:01:24,280 --> 00:01:25,330 going to give a relational schema 40 00:01:25,700 --> 00:01:26,770 and then a query we'd like 41 00:01:27,010 --> 00:01:29,040 to write over that schema but we will see that we can't. 42 00:01:29,960 --> 00:01:32,470 The first is a simple ancestors computation. 43 00:01:33,750 --> 00:01:34,890 So, for example, we might 44 00:01:35,210 --> 00:01:36,810 have that Sue is 45 00:01:36,960 --> 00:01:38,350 a parent of Mary and maybe 46 00:01:38,600 --> 00:01:40,000 Bob is also a parent of Mary. 47 00:01:40,990 --> 00:01:42,740 And maybe Fred is 48 00:01:43,120 --> 00:01:45,760 a parent of Bob and Jane 49 00:01:46,290 --> 00:01:47,870 is also a parent of Bob, and so on. 50 00:01:47,980 --> 00:01:48,850 So we're just listing the parent-child 51 00:01:49,440 --> 00:01:52,290 relationships in our relation called parent of. 52 00:01:52,870 --> 00:01:54,280 And then our goal is to use 53 00:01:54,550 --> 00:01:56,800 that relation to compute, say, all of Mary's ancestors. 54 00:01:58,260 --> 00:01:59,370 So we can certainly write a 55 00:01:59,410 --> 00:02:01,670 SQL query that finds Mary's parents. 56 00:02:02,590 --> 00:02:04,070 I'll let you do that as an exercise. 57 00:02:04,580 --> 00:02:05,310 It's very straight forward. 58 00:02:06,390 --> 00:02:08,490 We can even write a query to find 59 00:02:09,270 --> 00:02:09,740 Mary's grandparents. 60 00:02:10,850 --> 00:02:11,830 A query to do that, and 61 00:02:11,920 --> 00:02:13,280 again I'm not gonna write it 62 00:02:13,380 --> 00:02:15,330 here, would involve two instances 63 00:02:16,270 --> 00:02:17,230 of parent of, so you'd 64 00:02:17,410 --> 00:02:19,240 be joining parent of with itself. 65 00:02:19,940 --> 00:02:21,040 We wrote queries of that form 66 00:02:21,810 --> 00:02:23,550 when we were working with basic SQL in our videos. 67 00:02:24,850 --> 00:02:26,260 And you know we could even find 68 00:02:26,770 --> 00:02:29,330 the great grandparents by using 69 00:02:29,810 --> 00:02:31,580 say, three instances in joining them. 70 00:02:32,310 --> 00:02:33,150 The problem is that we 71 00:02:33,260 --> 00:02:34,020 might not know in advance 72 00:02:34,790 --> 00:02:37,240 how many steps there are to get all of Mary's ancestors. 73 00:02:37,890 --> 00:02:38,730 And each one of those 74 00:02:38,850 --> 00:02:40,350 steps does involve an 75 00:02:40,460 --> 00:02:41,640 instance of parent of and 76 00:02:41,800 --> 00:02:43,070 adjoin, so this query 77 00:02:43,560 --> 00:02:45,440 can not be executed using standard SQL. 78 00:02:45,500 --> 00:02:47,670 Here's our second example. 79 00:02:48,380 --> 00:02:48,840 A little more complicated. 80 00:02:49,560 --> 00:02:50,400 It involves three relations. 81 00:02:51,400 --> 00:02:53,210 And they represent a company hierarchy. 82 00:02:53,900 --> 00:02:55,840 We're going to have manager-employee relationships. 83 00:02:56,570 --> 00:02:58,050 We're going to have projects and we're going to have salaries. 84 00:02:58,850 --> 00:03:00,360 So, our first relation, employee, just 85 00:03:00,550 --> 00:03:02,600 gives us the salary of each employee. 86 00:03:03,480 --> 00:03:04,430 Our second one gives us 87 00:03:04,620 --> 00:03:06,880 the manager relationship and what 88 00:03:07,080 --> 00:03:09,180 this is saying is say, that 89 00:03:09,700 --> 00:03:11,250 the employee with ID 90 00:03:11,420 --> 00:03:13,330 123 is a manager. 91 00:03:13,710 --> 00:03:14,500 I'll draw it like a tree here 92 00:03:15,360 --> 00:03:16,610 of two, three, four, who, 93 00:03:16,930 --> 00:03:18,170 himself, might manage a few 94 00:03:18,430 --> 00:03:20,380 other employees and, so on. 95 00:03:20,830 --> 00:03:22,510 So, we have a hierarchical structure 96 00:03:23,100 --> 00:03:24,190 of the management in the company. 97 00:03:25,070 --> 00:03:26,330 And you'll see, you can 98 00:03:26,480 --> 00:03:27,630 already probably recognize that this 99 00:03:27,710 --> 00:03:28,700 is similar to the parent 100 00:03:29,070 --> 00:03:31,130 relationship that we had in our previous example. 101 00:03:32,160 --> 00:03:33,680 And finally we have a 102 00:03:33,800 --> 00:03:35,300 set of projects that gives the 103 00:03:35,470 --> 00:03:38,050 name of the project, and the manager of that project. 104 00:03:38,550 --> 00:03:39,920 And the query that we would 105 00:03:40,110 --> 00:03:41,230 like to run over this 106 00:03:41,540 --> 00:03:42,710 database is to find 107 00:03:42,960 --> 00:03:44,750 the total salary cost of a given project. 108 00:03:45,200 --> 00:03:47,090 Let's say project X. So for 109 00:03:47,330 --> 00:03:49,320 example if 123 happens 110 00:03:50,230 --> 00:03:51,260 to be the manager of project 111 00:03:51,800 --> 00:03:53,330 X, then the total salary 112 00:03:54,140 --> 00:03:55,430 class would be 123's salary 113 00:03:56,490 --> 00:03:58,160 plus 234's and so on down the hierarchy. 114 00:03:59,190 --> 00:04:01,740 So I'm not even going to try to write a SQL query here. 115 00:04:03,600 --> 00:04:04,430 Again you can do that as 116 00:04:04,560 --> 00:04:05,780 an exercise, if we knew 117 00:04:06,000 --> 00:04:07,630 precisely how deep the 118 00:04:07,820 --> 00:04:09,560 tree was below the manager 119 00:04:10,080 --> 00:04:11,500 of project X. Then we 120 00:04:11,670 --> 00:04:12,870 could again use these self 121 00:04:13,250 --> 00:04:14,180 joins, that would be on 122 00:04:14,350 --> 00:04:15,700 the manager relation this time, 123 00:04:16,430 --> 00:04:17,540 to find all the employees 124 00:04:18,420 --> 00:04:19,980 that are in that sub-tree of 125 00:04:20,090 --> 00:04:21,800 project X and add up their salaries. 126 00:04:22,830 --> 00:04:24,190 But if we don't know the 127 00:04:24,270 --> 00:04:25,990 depth of that hierarchy, 128 00:04:27,140 --> 00:04:28,430 then it's impossible for us to 129 00:04:28,530 --> 00:04:29,800 write a SQL query that 130 00:04:29,950 --> 00:04:32,020 goes to arbitrary depth to add up the costs. 131 00:04:33,540 --> 00:04:34,490 By the way, let me mention that 132 00:04:34,600 --> 00:04:35,790 one of the reasons I'm not bothering 133 00:04:36,280 --> 00:04:37,630 to write the exact SQL right 134 00:04:37,850 --> 00:04:38,770 now as I motivate these examples 135 00:04:39,520 --> 00:04:40,380 is that we are going to see 136 00:04:40,620 --> 00:04:42,390 the SQLs shortly when we do the live demo. 137 00:04:43,510 --> 00:04:44,900 And the third example and 138 00:04:45,050 --> 00:04:47,240 my personal favorite is finding airline 139 00:04:47,720 --> 00:04:49,170 flights from a starting point 140 00:04:49,400 --> 00:04:51,000 to an ending point at the cheapest cost. 141 00:04:51,800 --> 00:04:52,930 So let's say we have a 142 00:04:53,090 --> 00:04:54,320 relation here that lists all 143 00:04:54,580 --> 00:04:57,140 flights, the start of the flight, the end of the flight, 144 00:04:57,380 --> 00:04:58,640 the airline and the cost 145 00:04:58,850 --> 00:05:00,060 of the flight, and we're interested 146 00:05:00,480 --> 00:05:01,710 in flying from point A 147 00:05:01,900 --> 00:05:04,020 to point B. Now if 148 00:05:04,130 --> 00:05:05,540 I happen to be a very 149 00:05:05,830 --> 00:05:07,180 conservative traveler, and I 150 00:05:07,250 --> 00:05:08,320 don't want to change planes 151 00:05:08,550 --> 00:05:10,070 more than twice then I'm in 152 00:05:10,160 --> 00:05:11,140 good shape and I could still write 153 00:05:11,400 --> 00:05:12,730 this using SQL, because I'd 154 00:05:12,870 --> 00:05:13,970 be only willing to take 3 155 00:05:14,240 --> 00:05:15,110 flights, and I could just 156 00:05:15,440 --> 00:05:17,570 join three instances of 157 00:05:17,780 --> 00:05:19,970 the flight relation, matching the 158 00:05:20,040 --> 00:05:21,580 destination of a flight 159 00:05:21,740 --> 00:05:22,570 with the origin of the next 160 00:05:22,850 --> 00:05:24,140 one, and then adding up the 161 00:05:24,200 --> 00:05:26,650 costs, and finding the cheapest one. 162 00:05:27,150 --> 00:05:28,860 It's not a trivial SQL query to write. 163 00:05:29,180 --> 00:05:30,740 Again, I'll leave you that as an exercise. 164 00:05:31,360 --> 00:05:32,310 And by the way, I often give 165 00:05:32,520 --> 00:05:34,560 that very query as an exercise in my class. 166 00:05:35,350 --> 00:05:36,890 But when we don't know 167 00:05:37,290 --> 00:05:38,230 the number of flights we want 168 00:05:38,360 --> 00:05:39,170 to take, so if we are 169 00:05:39,300 --> 00:05:41,670 a very frugal traveler 170 00:05:42,510 --> 00:05:43,820 whose willing to spend arbitrary amounts 171 00:05:44,170 --> 00:05:44,920 of time to get the cheapest 172 00:05:45,290 --> 00:05:46,470 flight, then we can't 173 00:05:46,860 --> 00:05:48,720 with regular SQL write a 174 00:05:48,830 --> 00:05:50,850 query that explores arbitrary numbers 175 00:05:51,190 --> 00:05:52,440 of flights to finds the 176 00:05:52,610 --> 00:05:53,460 cheapest way to get from point 177 00:05:53,700 --> 00:05:55,680 A to point B. So as 178 00:05:55,820 --> 00:05:57,200 you've probably surmised, all three 179 00:05:57,380 --> 00:05:58,980 of these examples are going to 180 00:05:59,070 --> 00:06:02,040 be expressible once we have recursion in the SQL language. 181 00:06:03,080 --> 00:06:05,860 So next I'll introduce the with construct in SQL. 182 00:06:06,130 --> 00:06:07,120 The with construct is actually 183 00:06:07,440 --> 00:06:09,000 present even without recursion, 184 00:06:09,800 --> 00:06:11,090 but it is the construct that was 185 00:06:11,280 --> 00:06:12,750 used to add recursion to the language. 186 00:06:13,720 --> 00:06:15,350 So here is the width statement in SQL. 187 00:06:16,480 --> 00:06:18,580 We give the keyword "with", and 188 00:06:18,680 --> 00:06:19,940 then we list one or more 189 00:06:20,230 --> 00:06:21,720 new relation names. 190 00:06:22,560 --> 00:06:24,070 So, R1-RN would be 191 00:06:24,200 --> 00:06:25,430 relations that don't exist in the 192 00:06:25,510 --> 00:06:27,170 database and each one 193 00:06:27,380 --> 00:06:29,370 of those is tied to a query. 194 00:06:29,630 --> 00:06:31,300 So what we're effectively saying 195 00:06:31,650 --> 00:06:32,930 is that R1 is going 196 00:06:33,090 --> 00:06:34,170 to contain the results of query 197 00:06:34,440 --> 00:06:36,850 one or two of the results of query two and so on. 198 00:06:37,480 --> 00:06:39,030 Once we've set that up, 199 00:06:39,510 --> 00:06:40,740 then the final part of 200 00:06:40,800 --> 00:06:42,280 the with is a final query 201 00:06:42,880 --> 00:06:44,410 that can involve any tables in 202 00:06:44,520 --> 00:06:47,250 a database and can also involve these new tables R1 through RN. 203 00:06:47,270 --> 00:06:49,800 In some ways, you can think 204 00:06:49,950 --> 00:06:51,260 of the with statement as setting 205 00:06:51,570 --> 00:06:52,990 up temporary views. 206 00:06:53,500 --> 00:06:54,160 So it sets up a view 207 00:06:54,500 --> 00:06:55,260 for each one of these R's 208 00:06:56,140 --> 00:06:57,350 then runs a query involving the 209 00:06:57,430 --> 00:06:58,410 views, and then the views go 210 00:06:58,630 --> 00:07:01,140 away. Or if you want to think of them as materialized views, 211 00:07:01,630 --> 00:07:04,580 you could think of these as assignment statements where we have the as. 212 00:07:04,970 --> 00:07:06,090 So we create the table R1, 213 00:07:06,250 --> 00:07:08,850 we put the data in, and then we run the query. 214 00:07:09,890 --> 00:07:11,330 In reality most systems implement 215 00:07:11,790 --> 00:07:13,400 the with statement like for generalized 216 00:07:14,220 --> 00:07:15,400 virtual views and they'll rewrite 217 00:07:16,070 --> 00:07:17,530 the query down here to be 218 00:07:18,350 --> 00:07:20,640 expressed over the tables that are used in these queries. 219 00:07:21,940 --> 00:07:23,230 We'll be seeing examples of the 220 00:07:23,380 --> 00:07:24,390 with statement when we get to 221 00:07:24,520 --> 00:07:26,810 our demo, it will of course be the recursive with statement. 222 00:07:27,580 --> 00:07:29,830 Just one more notational point, just like as with 223 00:07:30,100 --> 00:07:30,100 views, 224 00:07:30,960 --> 00:07:32,120 what normally happens is the 225 00:07:32,210 --> 00:07:33,110 schema of one of these 226 00:07:33,380 --> 00:07:34,770 Rs is inherited as the 227 00:07:34,810 --> 00:07:36,290 schema that's the result of 228 00:07:36,510 --> 00:07:37,940 the query that's associated with 229 00:07:38,110 --> 00:07:39,290 the R. But if the user, 230 00:07:39,650 --> 00:07:40,770 I mean the designer, the one 231 00:07:40,970 --> 00:07:42,520 writing the "with" statement, wishes to 232 00:07:42,610 --> 00:07:44,150 have a different schema, different 233 00:07:44,440 --> 00:07:45,620 names for the attributes, those can 234 00:07:45,750 --> 00:07:47,290 be specified explicitly, and then 235 00:07:47,440 --> 00:07:48,330 those would be used in this 236 00:07:48,520 --> 00:07:49,860 query when R1 is referenced. 237 00:07:51,560 --> 00:07:52,700 So now here's the fun part. 238 00:07:53,360 --> 00:07:55,800 We can specify recursive as 239 00:07:55,910 --> 00:07:57,770 a key word after "with" and 240 00:07:58,010 --> 00:07:59,490 that let's us write recursive 241 00:08:00,160 --> 00:08:02,770 queries in this first portion of the "with" statement. 242 00:08:03,460 --> 00:08:05,120 Very specifically, in query 243 00:08:05,490 --> 00:08:06,690 one here, we an actually 244 00:08:07,210 --> 00:08:08,910 reference relation R1 which 245 00:08:09,480 --> 00:08:12,330 is the relation we're defining with Query 1, in a recursive fashion. 246 00:08:13,570 --> 00:08:15,040 In Query 2, we can reference R2. 247 00:08:15,160 --> 00:08:16,670 We can actually also have 248 00:08:16,870 --> 00:08:19,250 a mutual recursion, which I'm going to focus on in a separate video. 249 00:08:19,610 --> 00:08:21,740 But, that would be saying, not this exactly. 250 00:08:22,490 --> 00:08:23,760 That would be saying Query 2 251 00:08:23,840 --> 00:08:25,310 could reference R1 and Query 1 252 00:08:25,510 --> 00:08:26,510 would reference R2, but again, 253 00:08:26,760 --> 00:08:28,210 we'll be talking about that in a separate video. 254 00:08:28,570 --> 00:08:29,410 For now, we'll just talk about 255 00:08:29,640 --> 00:08:31,830 recursion within each specification 256 00:08:32,760 --> 00:08:33,760 of one of these Rs. 257 00:08:35,190 --> 00:08:35,830 By the way, one thing that I wanted 258 00:08:36,220 --> 00:08:38,750 to mention is a sort of syntactic inconsistency. 259 00:08:39,960 --> 00:08:41,780 The SQL standard actually says 260 00:08:42,110 --> 00:08:44,030 that this recursive modifier here 261 00:08:44,660 --> 00:08:46,740 goes with the relation specification. 262 00:08:47,440 --> 00:08:48,650 So if R1 is recursive 263 00:08:49,070 --> 00:08:50,440 we would say recursive R1, if 264 00:08:50,680 --> 00:08:52,370 RN was recursive we'd say recursive RN. 265 00:08:53,790 --> 00:08:55,630 The implementation that we're 266 00:08:55,940 --> 00:08:57,060 going to be using which is 267 00:08:57,130 --> 00:09:00,420 the postgres implementation, actually says that recursive modifies the with. 268 00:09:00,960 --> 00:09:02,070 So when you say with recursive 269 00:09:02,680 --> 00:09:03,510 then any of the RI's 270 00:09:04,420 --> 00:09:05,750 are allowed to be recursive. 271 00:09:06,940 --> 00:09:08,010 Now let me show what the 272 00:09:08,090 --> 00:09:09,460 typical form is of one 273 00:09:09,680 --> 00:09:11,730 of these recursive specifications in the with statement. 274 00:09:12,410 --> 00:09:13,620 So this example I'm giving has 275 00:09:13,930 --> 00:09:16,430 just one R specified in 276 00:09:16,670 --> 00:09:18,710 the with statement and then the query at the bottom. 277 00:09:19,180 --> 00:09:20,100 Again, this is the final result 278 00:09:20,770 --> 00:09:22,880 of the recursive with statement, involves 279 00:09:23,370 --> 00:09:25,030 R and possibly other tables of the database. 280 00:09:25,830 --> 00:09:26,890 The typical way to define 281 00:09:27,560 --> 00:09:29,360 a recursive query is to 282 00:09:29,540 --> 00:09:30,960 have a base query and that 283 00:09:31,110 --> 00:09:33,040 would be over non, over tables 284 00:09:33,610 --> 00:09:35,840 other than R so not R in this base query. 285 00:09:36,160 --> 00:09:37,160 Sort of to get the recursion 286 00:09:37,590 --> 00:09:38,890 started, and then R 287 00:09:39,160 --> 00:09:40,430 will be the result of that 288 00:09:40,790 --> 00:09:43,050 base query together with the recursive query. 289 00:09:43,370 --> 00:09:45,170 So here we will reference R. 290 00:09:46,160 --> 00:09:47,690 And the idea is that the 291 00:09:47,940 --> 00:09:49,170 result of R, the 292 00:09:49,380 --> 00:09:50,650 R that's seen when we 293 00:09:50,840 --> 00:09:52,780 run the query down here, is 294 00:09:52,980 --> 00:09:54,170 what's known as the fixed point 295 00:09:54,410 --> 00:09:56,780 of running this union over 296 00:09:57,220 --> 00:09:58,650 and over again until we 297 00:09:58,900 --> 00:10:01,330 add no additional tuples to 298 00:10:01,800 --> 00:10:03,420 R. One other thing 299 00:10:03,570 --> 00:10:04,650 that I wanted to mention is that 300 00:10:04,910 --> 00:10:06,390 this form of the recursion, 301 00:10:07,120 --> 00:10:08,530 this idea here that we 302 00:10:08,730 --> 00:10:10,160 have the base query and 303 00:10:10,540 --> 00:10:11,880 a union with the recursive query 304 00:10:12,140 --> 00:10:14,200 is not enforced in the SQL standard. 305 00:10:15,110 --> 00:10:16,540 The SQL standard merely says 306 00:10:16,890 --> 00:10:18,320 that we can specify R here 307 00:10:19,230 --> 00:10:20,490 and then any query inside 308 00:10:20,970 --> 00:10:22,260 the parenthesis that reference R. 309 00:10:22,700 --> 00:10:23,760 Although, there are a 310 00:10:23,840 --> 00:10:25,790 number of restrictions, this division 311 00:10:26,310 --> 00:10:28,570 into a base query and recursive query isn't required. 312 00:10:29,320 --> 00:10:30,300 We'll be talking about some of 313 00:10:30,370 --> 00:10:32,610 those restrictions late on, actually in a separate video. 314 00:10:33,850 --> 00:10:34,900 In the implementation that we're using, 315 00:10:35,370 --> 00:10:36,860 the implementation, it actually 316 00:10:37,160 --> 00:10:38,560 does require this form where 317 00:10:38,610 --> 00:10:39,720 you have a base query, union, and 318 00:10:39,960 --> 00:10:41,130 recursive query, but that's not really 319 00:10:41,320 --> 00:10:43,050 a problem because all natural recursive 320 00:10:44,370 --> 00:10:45,560 queries, at least the 321 00:10:45,900 --> 00:10:46,840 ones I've seen and for the 322 00:10:46,880 --> 00:10:48,570 examples we're going to look at, do take this form. 323 00:10:49,950 --> 00:10:52,650 And one last thing I want to mention is about the UNION operator. 324 00:10:53,700 --> 00:10:55,110 First a reminder that in 325 00:10:55,470 --> 00:10:56,910 SQL when we say UNION, as 326 00:10:57,000 --> 00:10:58,260 opposed to UNION ALL, we're 327 00:10:58,410 --> 00:11:00,480 talking about duplicate eliminating union, 328 00:11:01,290 --> 00:11:02,280 that means that when we 329 00:11:02,530 --> 00:11:03,710 add a tuple to our 330 00:11:03,830 --> 00:11:06,110 union that's already there, we don't actually add a tuple. 331 00:11:06,850 --> 00:11:08,050 And that's really key to this 332 00:11:08,300 --> 00:11:09,800 notion of reaching a fixed 333 00:11:09,940 --> 00:11:11,920 point or with the recursion terminating, because 334 00:11:12,300 --> 00:11:13,730 we might be running the recursive 335 00:11:14,290 --> 00:11:15,450 query that over and over 336 00:11:15,670 --> 00:11:17,270 gives us additional tuples, but 337 00:11:17,370 --> 00:11:18,450 if those are all tuples that 338 00:11:18,590 --> 00:11:19,720 we already have in R, 339 00:11:20,310 --> 00:11:21,360 then we won't actually be adding 340 00:11:21,670 --> 00:11:22,510 anything new, because the union 341 00:11:22,880 --> 00:11:25,670 eliminates duplicates, and that will tell us our recursion is done. 342 00:11:26,710 --> 00:11:27,810 As you can imagine, if I 343 00:11:27,870 --> 00:11:29,820 wrote UNION ALL instead, then, 344 00:11:30,000 --> 00:11:31,330 I'm continuously adding new tuples 345 00:11:32,160 --> 00:11:33,590 and my recursion will typically 346 00:11:34,350 --> 00:11:35,840 not terminate, so it's not 347 00:11:36,220 --> 00:11:37,950 common, maybe never, to 348 00:11:38,050 --> 00:11:39,630 see UNION ALL used in 349 00:11:39,800 --> 00:11:41,130 recursion, but rather the 350 00:11:41,440 --> 00:11:42,930 duplicate eliminating UNION operator. 351 00:11:44,040 --> 00:11:45,030 So now that we've seen a 352 00:11:45,170 --> 00:11:46,340 number of examples that need recursion, 353 00:11:47,120 --> 00:11:48,300 and we've seen how recursion has 354 00:11:48,370 --> 00:11:49,890 been introduced into SQL, the 355 00:11:49,950 --> 00:11:51,060 next video will give a 356 00:11:51,110 --> 00:11:52,780 demo of these queries in action.