1 00:00:00,150 --> 00:00:01,300 In this demo we're going 2 00:00:01,540 --> 00:00:03,800 to run a number of queries in the X query language. 3 00:00:04,730 --> 00:00:07,570 We'll be using the same XML data we've used for previous demos. 4 00:00:08,280 --> 00:00:09,530 Here we can see the book 5 00:00:09,720 --> 00:00:11,420 store database, with a number 6 00:00:11,690 --> 00:00:13,890 of books, four books, and a few magazines. 7 00:00:15,740 --> 00:00:17,510 X query language is quite complex. 8 00:00:18,540 --> 00:00:19,930 Certainly more complex than SQL. 9 00:00:20,870 --> 00:00:21,980 We'll be covering a number of 10 00:00:22,080 --> 00:00:23,370 the constructs, but we won't 11 00:00:23,600 --> 00:00:24,920 be able to cover everything, and 12 00:00:25,110 --> 00:00:27,630 even the ones we cover will take a little getting used to. 13 00:00:28,360 --> 00:00:29,540 I highly suggest that you 14 00:00:29,670 --> 00:00:30,800 take time to pause the video 15 00:00:31,120 --> 00:00:31,780 to take a look at the 16 00:00:31,840 --> 00:00:33,030 queries, or even better, 17 00:00:33,320 --> 00:00:35,830 download the data and the queries and try them for yourself. 18 00:00:37,290 --> 00:00:38,420 Our first query is a fairly 19 00:00:38,590 --> 00:00:40,230 simple one, demonstrating a FLWOR, 20 00:00:40,510 --> 00:00:42,820 or F-L-W-O-R, expression. 21 00:00:43,420 --> 00:00:45,630 That's the basic expression of the X query language. 22 00:00:46,370 --> 00:00:47,480 Here we only have the F, 23 00:00:47,730 --> 00:00:50,000 W and R part, "for" "where" and "return". 24 00:00:51,160 --> 00:00:52,220 This query is going to 25 00:00:52,310 --> 00:00:53,580 return the titles of books 26 00:00:53,790 --> 00:00:54,810 that cost less than ninety 27 00:00:55,080 --> 00:00:56,410 dollars where Ullman is an author. 28 00:00:57,290 --> 00:00:59,280 Let's go ahead and take a look at the constructs of the query. 29 00:01:00,240 --> 00:01:01,810 The four construct, as we 30 00:01:01,920 --> 00:01:03,160 described, has an expression 31 00:01:03,980 --> 00:01:05,070 in this case it's an X Path 32 00:01:05,340 --> 00:01:07,760 expression that returns the books in our document. 33 00:01:08,740 --> 00:01:10,010 It binds the variable dollar 34 00:01:10,460 --> 00:01:11,590 B to each book one 35 00:01:11,950 --> 00:01:13,090 at a time and run the 36 00:01:13,200 --> 00:01:14,310 rest of the query for that 37 00:01:14,570 --> 00:01:16,130 finding, the next thing 38 00:01:16,230 --> 00:01:17,210 it does is check whether the 39 00:01:17,390 --> 00:01:18,360 price of the book is less 40 00:01:18,680 --> 00:01:20,230 than ninety, and whether there 41 00:01:20,450 --> 00:01:22,320 exists an author slash 42 00:01:22,710 --> 00:01:24,190 author the last name under 43 00:01:24,430 --> 00:01:26,770 the book element where the value of that is Ullman. 44 00:01:27,730 --> 00:01:28,650 In the case where both of 45 00:01:28,720 --> 00:01:31,550 these conditions are true, it will return the title of the book. 46 00:01:32,140 --> 00:01:32,810 Let's run the query. 47 00:01:33,170 --> 00:01:35,080 When we run the 48 00:01:35,150 --> 00:01:36,080 query, we see that we get 49 00:01:36,230 --> 00:01:37,520 a result that's formatted as 50 00:01:37,630 --> 00:01:39,010 XML with two titles 51 00:01:39,720 --> 00:01:42,150 of the books that cost less than ninety and Ullman is an author. 52 00:01:43,790 --> 00:01:45,060 Our next query is a bit more complicated. 53 00:01:46,130 --> 00:01:47,080 It still consists of a 54 00:01:47,190 --> 00:01:48,690 for, where, and return clause, 55 00:01:49,390 --> 00:01:50,450 but in the where clause we have 56 00:01:50,980 --> 00:01:52,680 existential quantification and then 57 00:01:52,830 --> 00:01:54,090 in the return clause we're constructing 58 00:01:54,780 --> 00:01:55,680 our result in XML. 59 00:01:57,070 --> 00:01:58,180 What this query looks for is 60 00:01:58,360 --> 00:01:59,400 books where the title of 61 00:01:59,540 --> 00:02:01,500 the book contains one of the author's first names. 62 00:02:02,330 --> 00:02:03,390 For each of those books it 63 00:02:03,510 --> 00:02:06,100 returns the book title, and the first name of the author. 64 00:02:07,160 --> 00:02:08,250 So we again in our 65 00:02:08,500 --> 00:02:10,710 for clause, bind dollar 66 00:02:11,050 --> 00:02:12,330 B to each each book in the database. 67 00:02:13,410 --> 00:02:15,030 Then in our where condition, what 68 00:02:15,180 --> 00:02:16,170 we look for is a 69 00:02:16,410 --> 00:02:18,100 first name subelement of the 70 00:02:18,170 --> 00:02:19,380 book which gets bound to 71 00:02:19,530 --> 00:02:21,610 FN, such that, that 72 00:02:22,350 --> 00:02:24,690 the title of the book contains that first name. 73 00:02:25,410 --> 00:02:26,810 So these as a existential 74 00:02:27,330 --> 00:02:29,950 quantified expression some in 75 00:02:30,280 --> 00:02:31,560 and satisfies our keywords. 76 00:02:32,580 --> 00:02:33,740 So some binds to a variable 77 00:02:34,030 --> 00:02:35,230 which we bind in the 78 00:02:35,420 --> 00:02:36,470 set, and then we 79 00:02:36,550 --> 00:02:37,370 check if any of those 80 00:02:37,590 --> 00:02:38,950 bindings satisfy this condition. 81 00:02:40,850 --> 00:02:41,870 Again, I urge you to take a close look at this. 82 00:02:42,370 --> 00:02:43,250 You may want to pause the video. 83 00:02:44,810 --> 00:02:46,400 Once we've determined that our 84 00:02:46,620 --> 00:02:48,660 book satisfies the condition, then 85 00:02:48,840 --> 00:02:50,740 in return clause, we're constructing 86 00:02:51,720 --> 00:02:52,880 an XML element where the 87 00:02:53,020 --> 00:02:54,650 opening and closing tags are book. 88 00:02:55,480 --> 00:02:56,790 And then within that, we return 89 00:02:57,130 --> 00:02:58,250 the title of the book and 90 00:02:58,500 --> 00:02:59,770 we return the first name 91 00:03:00,570 --> 00:03:02,180 of all authors of the book. 92 00:03:02,840 --> 00:03:02,840 Whew. 93 00:03:03,250 --> 00:03:03,900 So let's run the query. 94 00:03:05,300 --> 00:03:06,340 We find that there are two 95 00:03:06,480 --> 00:03:08,690 books satisfying the conditions where 96 00:03:08,900 --> 00:03:09,720 there is a first name in 97 00:03:09,800 --> 00:03:11,000 the title, Hector and Jeff's 98 00:03:11,280 --> 00:03:14,060 Database Hints and Jennifer's Economical Database Hints. 99 00:03:14,870 --> 00:03:16,300 Now there is tricky thing 100 00:03:16,560 --> 00:03:18,160 here which is that in 101 00:03:18,490 --> 00:03:19,990 Hector and Jeff's Database Hints 102 00:03:20,320 --> 00:03:21,760 that title does in fact 103 00:03:22,030 --> 00:03:23,460 contain a first name contains 104 00:03:23,750 --> 00:03:25,450 Hector, but it does not contain Jeffrey. 105 00:03:26,510 --> 00:03:27,640 Our query returned the title 106 00:03:28,040 --> 00:03:29,940 of books satisfying the condition with 107 00:03:30,280 --> 00:03:33,340 every first name of an author of that book. 108 00:03:34,210 --> 00:03:34,970 What we're going to do next 109 00:03:35,780 --> 00:03:36,990 is restrict the result 110 00:03:37,410 --> 00:03:38,480 of the query to only return 111 00:03:38,870 --> 00:03:40,980 the first names that are actually part of the book title. 112 00:03:42,490 --> 00:03:43,520 What we're going to do is 113 00:03:43,730 --> 00:03:45,300 modify the second portion 114 00:03:45,580 --> 00:03:47,530 of our return statement to be a little more complicated. 115 00:03:48,290 --> 00:03:49,080 In fact, we're gonna put 116 00:03:49,470 --> 00:03:52,300 an entire for return a 117 00:03:52,700 --> 00:03:54,270 query right inside the braces here. 118 00:03:55,340 --> 00:03:56,300 In XQuery we can mix 119 00:03:56,640 --> 00:03:59,240 and match queries and expressions as we wish. 120 00:03:59,970 --> 00:04:01,110 So what I've done now, 121 00:04:01,730 --> 00:04:02,960 is I'm again returning the title 122 00:04:03,760 --> 00:04:04,940 but in the second clause, 123 00:04:05,360 --> 00:04:06,450 instead of returning all the 124 00:04:06,590 --> 00:04:07,840 first names, I'm going 125 00:04:07,970 --> 00:04:09,320 to find the first names that 126 00:04:09,500 --> 00:04:10,980 are within the book and 127 00:04:11,070 --> 00:04:12,500 when the title contains the 128 00:04:12,630 --> 00:04:14,000 first name then I'll return that one. 129 00:04:14,280 --> 00:04:15,680 So effectively I'm restricting the 130 00:04:15,820 --> 00:04:18,730 first names returned to just be the ones that appear in the title. 131 00:04:19,650 --> 00:04:20,820 Let's run the query and 132 00:04:21,050 --> 00:04:23,340 we can see that correctly Jeffrey 133 00:04:23,810 --> 00:04:25,970 disappeared from the first book element returned. 134 00:04:27,400 --> 00:04:29,670 Our next query again demonstrates a few new constructs. 135 00:04:31,020 --> 00:04:31,780 First of all, we'll be using 136 00:04:32,050 --> 00:04:34,700 the let clause in this query rather than the for clause. 137 00:04:35,620 --> 00:04:37,380 Second of all, we'll be showing some aggregation. 138 00:04:38,490 --> 00:04:41,160 And finally, we've embedded the full query inside XML. 139 00:04:42,200 --> 00:04:43,810 So we've put averages, opening 140 00:04:44,180 --> 00:04:45,540 and closing tags for our 141 00:04:45,750 --> 00:04:48,820 result and within that, we're putting our entire query. 142 00:04:49,680 --> 00:04:50,800 Our query says that we're 143 00:04:50,990 --> 00:04:52,260 going to assign the variable 144 00:04:52,870 --> 00:04:54,690 key list to the result of this expression. 145 00:04:55,250 --> 00:04:56,980 So remember the for clause 146 00:04:57,380 --> 00:04:59,560 is an iterator, while the let clause is an assignment. 147 00:05:00,570 --> 00:05:02,020 So this expression is going 148 00:05:02,190 --> 00:05:03,150 to find all of the 149 00:05:03,260 --> 00:05:04,740 price attributes in the 150 00:05:04,800 --> 00:05:06,310 database, assign it the 151 00:05:06,690 --> 00:05:08,200 P-List as a list, and 152 00:05:08,300 --> 00:05:09,430 then it will return the average 153 00:05:10,000 --> 00:05:11,300 of the elements in that list. 154 00:05:11,880 --> 00:05:12,760 Let's go ahead and run the 155 00:05:12,840 --> 00:05:14,090 query, and we see that 156 00:05:14,250 --> 00:05:16,020 our average book price is sixty five. 157 00:05:17,380 --> 00:05:18,520 We can actually run this query 158 00:05:19,020 --> 00:05:20,440 in an even more compact fashion, 159 00:05:21,210 --> 00:05:23,570 we can assign variable 'A' 160 00:05:24,000 --> 00:05:25,780 to be the average of 161 00:05:25,900 --> 00:05:27,300 this entire expression here, and 162 00:05:28,500 --> 00:05:30,050 then we can just return 'A'. 163 00:05:31,360 --> 00:05:34,020 Not much more compact, but perhaps a little bit more intuitive. 164 00:05:35,030 --> 00:05:37,700 Let's do that, let's erase 165 00:05:38,140 --> 00:05:39,180 the answer to run the query 166 00:05:39,830 --> 00:05:41,190 again, and again we get the same result. 167 00:05:43,200 --> 00:05:46,000 Now let's see a query where we use both the "let" and the "for" expressions. 168 00:05:47,260 --> 00:05:48,270 In this query, we're going to 169 00:05:48,340 --> 00:05:49,900 start by assigning dollar A 170 00:05:50,140 --> 00:05:51,560 to the average book price, just 171 00:05:51,750 --> 00:05:52,920 like we did in the previous query, 172 00:05:53,250 --> 00:05:54,440 and then we're going to 173 00:05:54,640 --> 00:05:57,110 find all books where the price of the book is below average. 174 00:05:57,650 --> 00:05:59,210 So we'll, again, as in 175 00:05:59,300 --> 00:06:00,950 previous queries, assign dollar 176 00:06:01,250 --> 00:06:02,290 B one at a time 177 00:06:02,390 --> 00:06:03,160 to the books in our database, 178 00:06:04,130 --> 00:06:05,100 and then for each one we'll 179 00:06:05,290 --> 00:06:06,360 check if the price is 180 00:06:06,480 --> 00:06:08,780 less than the average price that we had from our "let" clause. 181 00:06:09,110 --> 00:06:10,700 If it is, we'll return 182 00:06:11,050 --> 00:06:12,480 the book element, and we'll return 183 00:06:12,850 --> 00:06:13,910 the title of the book and 184 00:06:14,180 --> 00:06:15,880 we'll put, as a subelement 185 00:06:16,420 --> 00:06:17,740 in this book element, the price. 186 00:06:18,650 --> 00:06:19,570 And here we can see where we're 187 00:06:19,740 --> 00:06:22,030 taking an attribute and we're turning it into an element. 188 00:06:22,440 --> 00:06:23,580 So we have our attribute the 189 00:06:23,730 --> 00:06:24,730 price, we obtain the data, 190 00:06:25,600 --> 00:06:26,690 as we saw we needed to 191 00:06:26,780 --> 00:06:28,000 do in the X demo, and 192 00:06:28,210 --> 00:06:30,130 we place it inside the price element. 193 00:06:31,100 --> 00:06:32,530 Let's run the query, and we 194 00:06:32,640 --> 00:06:33,690 see indeed that we have 195 00:06:33,990 --> 00:06:35,410 two prices whose books are 196 00:06:35,610 --> 00:06:36,980 below average, and here we've 197 00:06:37,210 --> 00:06:39,920 converted the price from an attribute to a sub element. 198 00:06:41,490 --> 00:06:43,060 As a reminder, the FLWOR expression 199 00:06:44,060 --> 00:06:46,310 in XQuery has a 200 00:06:46,340 --> 00:06:48,430 For, Let, Wear, Order by, and Return. 201 00:06:49,490 --> 00:06:51,250 We see in all of the clauses except the "order by". 202 00:06:51,480 --> 00:06:53,290 So let's see the "order by" in this example. 203 00:06:54,280 --> 00:06:55,370 We're going to find the title 204 00:06:55,560 --> 00:06:57,460 and prices of books, just as we found before. 205 00:06:57,760 --> 00:06:59,260 We'll convert the price to a sub element. 206 00:06:59,990 --> 00:07:00,880 But in this case we want 207 00:07:01,200 --> 00:07:03,460 to order the result by the price of the book. 208 00:07:04,060 --> 00:07:05,060 So we do so by adding 209 00:07:05,390 --> 00:07:06,630 this clause here between the 210 00:07:06,720 --> 00:07:07,970 for and return saying we 211 00:07:08,070 --> 00:07:10,190 want to order by the price easy enough. 212 00:07:10,700 --> 00:07:12,600 We run the query and we see out result. 213 00:07:13,800 --> 00:07:15,620 Actually it doesn't look quite right does it? 214 00:07:15,680 --> 00:07:17,130 We have one hundred before 2550 215 00:07:17,510 --> 00:07:18,760 and eighty five. 216 00:07:19,580 --> 00:07:20,820 Well, that's because price is 217 00:07:21,050 --> 00:07:22,650 actually a string, and so 218 00:07:22,970 --> 00:07:25,790 it was doing a lexical graphic ordering of the string. 219 00:07:26,610 --> 00:07:27,890 We can fix that easily by 220 00:07:28,170 --> 00:07:30,770 calling a built-in function called 221 00:07:31,280 --> 00:07:32,780 XS:INT that converts that value 222 00:07:33,810 --> 00:07:34,370 to an integer. 223 00:07:35,250 --> 00:07:36,160 When we run the query now, 224 00:07:36,670 --> 00:07:39,610 we get the correct ordering based on the values of the price. 225 00:07:39,830 --> 00:07:42,380 Now that we've seen 226 00:07:42,720 --> 00:07:44,730 ordering, let's take a look at duplicate elimination. 227 00:07:46,320 --> 00:07:49,170 Let's a query that finds all the last names in our database. 228 00:07:50,070 --> 00:07:51,260 So we write a simple query that 229 00:07:51,450 --> 00:07:52,950 says for all names in 230 00:07:53,370 --> 00:07:54,980 the X path expression that 231 00:07:55,100 --> 00:07:56,470 finds the last names and, just 232 00:07:56,620 --> 00:07:57,570 as a reminder, here we use 233 00:07:57,780 --> 00:07:59,240 the double slash that looks 234 00:07:59,450 --> 00:08:00,380 at any depth in the XML 235 00:08:00,820 --> 00:08:01,680 tree and picks out the last 236 00:08:02,030 --> 00:08:03,250 names, will return those 237 00:08:03,450 --> 00:08:05,830 last names, and we 238 00:08:05,980 --> 00:08:06,840 know that these are our last 239 00:08:07,150 --> 00:08:08,080 names, and we've got many 240 00:08:08,490 --> 00:08:09,810 repeats of them because these 241 00:08:09,970 --> 00:08:11,670 last names appear several times in the database. 242 00:08:12,800 --> 00:08:14,540 So let's see about getting rid of those repeats. 243 00:08:15,820 --> 00:08:17,900 There is a built-in function in 244 00:08:18,340 --> 00:08:19,890 XQuery called "distinct values". 245 00:08:20,860 --> 00:08:21,720 So what we can do is 246 00:08:22,080 --> 00:08:27,510 add distinct values here in our for clause. 247 00:08:29,640 --> 00:08:30,680 We can apply it to the 248 00:08:30,780 --> 00:08:32,380 last name, and now our 249 00:08:32,660 --> 00:08:33,880 dollar N will be bound to 250 00:08:34,310 --> 00:08:36,730 each value only once and then we'll return the result. 251 00:08:38,520 --> 00:08:39,900 We run the query, and we 252 00:08:40,170 --> 00:08:41,210 find that we have our 253 00:08:41,680 --> 00:08:42,940 three last names appearing 254 00:08:43,330 --> 00:08:45,230 only once, but it's probably not quite what we wanted. 255 00:08:46,200 --> 00:08:47,150 This time when we run distinct 256 00:08:47,540 --> 00:08:48,770 values it just turns these 257 00:08:48,960 --> 00:08:50,090 three values into three separate 258 00:08:50,630 --> 00:08:51,820 strings and the returns 259 00:08:52,130 --> 00:08:53,580 one at a time rather than 260 00:08:53,770 --> 00:08:54,900 embedding them with the last 261 00:08:55,320 --> 00:08:56,250 name tag, which was what 262 00:08:56,360 --> 00:08:57,890 we got when we didn't use distinct values. 263 00:08:59,030 --> 00:09:00,230 So if we want the last name 264 00:09:00,630 --> 00:09:01,920 tag, then we can add 265 00:09:02,440 --> 00:09:04,030 the last name here to our 266 00:09:04,280 --> 00:09:06,290 return clause, let's just 267 00:09:06,780 --> 00:09:08,700 put in the opening tag and 268 00:09:08,860 --> 00:09:10,790 the closing tag was put in for us. 269 00:09:12,050 --> 00:09:13,660 So let's put dollar N here. 270 00:09:14,010 --> 00:09:15,390 So now we've added opening and 271 00:09:15,610 --> 00:09:16,380 closing tags to our previous 272 00:09:16,900 --> 00:09:18,110 query, we run it 273 00:09:19,170 --> 00:09:20,290 and whoops what do we 274 00:09:20,390 --> 00:09:21,730 get, we got dollar N. 275 00:09:22,700 --> 00:09:24,390 This is a reminder that when 276 00:09:24,650 --> 00:09:26,920 we write a return clause if 277 00:09:27,170 --> 00:09:28,390 we want actually have an 278 00:09:28,450 --> 00:09:31,790 expression evaluated in the curly brackets. 279 00:09:32,750 --> 00:09:33,940 So by putting dollar N in 280 00:09:34,060 --> 00:09:36,320 curly brackets here, now 281 00:09:36,470 --> 00:09:37,780 the execution engine will actually 282 00:09:37,850 --> 00:09:39,000 evaluate this query and 283 00:09:39,140 --> 00:09:40,590 put the result within the opening 284 00:09:41,020 --> 00:09:42,260 and closing tags, we run 285 00:09:42,480 --> 00:09:44,000 the query and now finally we 286 00:09:44,110 --> 00:09:45,540 got exactly what we were looking for. 287 00:09:46,100 --> 00:09:47,420 The three last names that appear 288 00:09:47,600 --> 00:09:49,290 in the database with the duplicates eliminated. 289 00:09:50,890 --> 00:09:51,960 In a previous query we 290 00:09:52,120 --> 00:09:54,730 saw existential quantification in the where clause. 291 00:09:55,570 --> 00:09:56,680 We were using the some keyword. 292 00:09:57,190 --> 00:09:57,190 S-O-M-E. 293 00:09:57,370 --> 00:09:59,040 Now we're going to see 294 00:09:59,240 --> 00:10:01,150 universal quantification in the where clause. 295 00:10:01,560 --> 00:10:04,350 So for all, and for that we use the every keyword. 296 00:10:04,550 --> 00:10:06,100 What this query finds 297 00:10:06,700 --> 00:10:08,020 is books where every author's 298 00:10:08,820 --> 00:10:10,150 first name includes the letter 299 00:10:10,560 --> 00:10:11,990 J. Once again, we 300 00:10:12,070 --> 00:10:13,380 use the for clause to bind 301 00:10:13,630 --> 00:10:14,660 $B to every book in 302 00:10:14,750 --> 00:10:16,190 the database, and then for 303 00:10:16,430 --> 00:10:18,140 each book, we check our where clause. 304 00:10:18,660 --> 00:10:20,040 Our where clause requires that 305 00:10:20,340 --> 00:10:22,030 every first name that's a 306 00:10:22,370 --> 00:10:23,940 sub-element in that book satisfies 307 00:10:24,600 --> 00:10:25,710 the condition where the first 308 00:10:26,030 --> 00:10:27,650 name contains the letter "J". 309 00:10:28,790 --> 00:10:31,480 If we have such a book, then we return it as a result of the query. 310 00:10:32,320 --> 00:10:33,170 So now we can see that 311 00:10:33,240 --> 00:10:34,330 our result is "A First 312 00:10:34,680 --> 00:10:36,200 Course in Database Systems" and "Jennifer's 313 00:10:36,650 --> 00:10:38,030 Economical Database Hints", because 314 00:10:38,330 --> 00:10:40,980 for both of those, all of the first names include the letter "J". 315 00:10:42,500 --> 00:10:44,170 This query demonstrates what I 316 00:10:44,320 --> 00:10:45,650 like to call a self join, 317 00:10:46,260 --> 00:10:48,900 because it's combining two copies of the book data. 318 00:10:49,690 --> 00:10:50,630 Specifically, what the query 319 00:10:50,970 --> 00:10:53,190 is looking for are titles of 320 00:10:53,430 --> 00:10:54,630 books that share a common 321 00:10:54,900 --> 00:10:55,890 author, and we'll assume 322 00:10:56,210 --> 00:10:57,710 we're just looking for common last names. 323 00:10:58,360 --> 00:10:59,560 This is also the first query 324 00:10:59,790 --> 00:11:02,040 we're seeing where we have two copies of the FOR clause. 325 00:11:02,700 --> 00:11:03,880 So we're setting up two iterator 326 00:11:04,400 --> 00:11:05,680 variables, essentially, one called 327 00:11:05,900 --> 00:11:07,020 B1 and one called B2, 328 00:11:07,440 --> 00:11:09,710 that are both iterating over the books in the database. 329 00:11:10,790 --> 00:11:11,740 If you'd like to think of 330 00:11:11,840 --> 00:11:13,380 this in relational terms, it would 331 00:11:13,540 --> 00:11:14,970 be similar to having two 332 00:11:15,200 --> 00:11:16,330 copies of the book table--one 333 00:11:17,040 --> 00:11:19,230 of them bound to variable B1 and one to variable B2. 334 00:11:21,120 --> 00:11:22,120 Then what we look for 335 00:11:22,160 --> 00:11:23,590 in the WHERE clause is whether 336 00:11:23,960 --> 00:11:26,790 B1 and B2 have an author last name in common. 337 00:11:27,350 --> 00:11:28,340 And we do that by checking 338 00:11:28,720 --> 00:11:31,290 these two XPath expressions and seeing if they're equal. 339 00:11:32,210 --> 00:11:33,250 Now we saw this behavior in 340 00:11:33,380 --> 00:11:35,070 XPath and we're seeing it again 341 00:11:35,380 --> 00:11:36,320 here, where the WHERE 342 00:11:36,660 --> 00:11:38,690 clause has effectively an existential 343 00:11:39,390 --> 00:11:41,030 quantification occurring automatically. 344 00:11:42,360 --> 00:11:43,670 What it's looking for is if 345 00:11:43,860 --> 00:11:45,480 there is some last name under 346 00:11:45,930 --> 00:11:47,180 B1 that is equal to 347 00:11:47,310 --> 00:11:48,750 some last name under "B2", 348 00:11:49,080 --> 00:11:51,400 even if "B1" or "B2" have many last names. 349 00:11:52,470 --> 00:11:53,770 In the case where that equality 350 00:11:53,900 --> 00:11:55,380 is satisfied, then we 351 00:11:55,550 --> 00:11:56,850 will construct a book pair, 352 00:11:57,600 --> 00:12:00,170 where we return the first title and the second title. 353 00:12:01,150 --> 00:12:03,690 And here, we're taking the titles which were originally attributes. 354 00:12:04,270 --> 00:12:05,620 Oh no, I'm sorry, those were originally sub-elements. 355 00:12:06,270 --> 00:12:07,920 We're taking those sub-element titles and 356 00:12:08,160 --> 00:12:10,320 we're calling them title 1 and title 2. 357 00:12:10,790 --> 00:12:11,990 So let's go ahead and 358 00:12:12,140 --> 00:12:13,590 run the query, and we did 359 00:12:13,810 --> 00:12:14,910 get a bunch of book pairs, 360 00:12:15,810 --> 00:12:16,960 probably more than we expected. 361 00:12:18,160 --> 00:12:19,380 Now all these books do in 362 00:12:19,550 --> 00:12:20,920 fact have a common common 363 00:12:21,180 --> 00:12:22,330 last name in their authors, 364 00:12:23,220 --> 00:12:24,420 but what we forgot to do 365 00:12:24,610 --> 00:12:26,160 was eliminate books with themselves. 366 00:12:27,080 --> 00:12:28,320 Because of course every book has 367 00:12:28,590 --> 00:12:30,150 in common, a last name with itself. 368 00:12:31,330 --> 00:12:32,970 So let's modify our 369 00:12:33,090 --> 00:12:34,480 query so that we 370 00:12:34,750 --> 00:12:36,150 return only books where 371 00:12:36,340 --> 00:12:37,740 they are two different books, and 372 00:12:37,840 --> 00:12:38,880 we can do that fairly easily 373 00:12:39,280 --> 00:12:44,210 by just saying $b1/title is not equal to $b2/title. 374 00:12:45,860 --> 00:12:49,720 Okay, here we we go. 375 00:12:50,030 --> 00:12:51,850 We run the query and we 376 00:12:52,000 --> 00:12:54,050 got an error because I 377 00:12:54,370 --> 00:12:55,950 typed BT instead of B2. 378 00:12:56,340 --> 00:12:57,230 Let's run that again. 379 00:12:57,990 --> 00:12:57,990 Alright. 380 00:12:58,870 --> 00:13:00,110 And now we see that 381 00:13:00,330 --> 00:13:03,030 we now are not getting the pairs of books with themselves. 382 00:13:03,890 --> 00:13:06,230 We're still, perhaps, getting more than we expected, however. 383 00:13:06,670 --> 00:13:07,440 What's going on? 384 00:13:08,230 --> 00:13:09,520 Well, one thing we're seeing 385 00:13:10,010 --> 00:13:11,290 is that we're getting every pair 386 00:13:11,770 --> 00:13:14,500 twice in both orders, if you take a look at the data. 387 00:13:15,460 --> 00:13:16,460 So this is a same old 388 00:13:16,710 --> 00:13:17,950 trick that you may remember 389 00:13:18,360 --> 00:13:20,080 back from the relational query examples, 390 00:13:21,270 --> 00:13:22,330 instead of doing not equals, 391 00:13:23,050 --> 00:13:24,310 let's try doing less than. 392 00:13:25,140 --> 00:13:25,940 In other words, we're going to 393 00:13:26,030 --> 00:13:27,650 return each pair once, and 394 00:13:27,750 --> 00:13:29,180 we're going to have the lexical 395 00:13:29,720 --> 00:13:31,910 graphically lesser book listed first. 396 00:13:32,700 --> 00:13:33,760 Now we've run the query again, 397 00:13:34,020 --> 00:13:36,970 and now we see we actually got what we were expecting. 398 00:13:37,580 --> 00:13:39,810 As our grand finale, 399 00:13:40,900 --> 00:13:42,690 we're going to invert our bookstore database. 400 00:13:43,860 --> 00:13:44,980 The way we set up our XML 401 00:13:45,280 --> 00:13:46,510 data, we had our books, 402 00:13:47,000 --> 00:13:49,300 and then underneath our books we had the authors of the books. 403 00:13:50,160 --> 00:13:51,080 What if we want to instead 404 00:13:51,790 --> 00:13:52,940 our data constructed where we 405 00:13:53,090 --> 00:13:54,350 have the authors as the 406 00:13:54,540 --> 00:13:56,010 outer elements, and within each 407 00:13:56,390 --> 00:13:58,410 author we list the books that they've written. 408 00:13:59,410 --> 00:14:00,440 Now to understand this query, 409 00:14:00,760 --> 00:14:02,790 you're absolutely going to need to pause the video. 410 00:14:03,400 --> 00:14:05,520 Let me just briefly show what's going on. 411 00:14:06,100 --> 00:14:07,540 But again it would 412 00:14:07,710 --> 00:14:08,680 take quite some time to go 413 00:14:08,880 --> 00:14:10,650 through every detail. 414 00:14:11,720 --> 00:14:12,870 The way we are 415 00:14:12,980 --> 00:14:14,280 going to construct our inverted bookstore 416 00:14:14,960 --> 00:14:16,760 is to find the authors 417 00:14:17,900 --> 00:14:20,480 by last name as the outermost portion of our query. 418 00:14:21,450 --> 00:14:23,040 For each author's last name, 419 00:14:23,480 --> 00:14:24,430 we're going to get their first 420 00:14:24,730 --> 00:14:25,870 names, and then we're 421 00:14:26,020 --> 00:14:27,240 going to return the author 422 00:14:27,740 --> 00:14:29,040 with the first name, the 423 00:14:29,160 --> 00:14:31,060 last name, and next find 424 00:14:31,690 --> 00:14:32,890 all of the books that 425 00:14:33,010 --> 00:14:34,020 were written by that author, 426 00:14:35,020 --> 00:14:36,200 return the book with their 427 00:14:36,360 --> 00:14:37,750 ISBN price title. 428 00:14:38,620 --> 00:14:38,620 Okay? 429 00:14:39,510 --> 00:14:40,760 Again, I urge you to 430 00:14:40,990 --> 00:14:41,920 pause the video and look 431 00:14:42,160 --> 00:14:43,430 very closely at what's going on. 432 00:14:44,010 --> 00:14:45,410 Or even better, download the 433 00:14:45,470 --> 00:14:46,620 data in the query, run it 434 00:14:46,700 --> 00:14:48,810 yourself, and play with it until you understand it. 435 00:14:49,600 --> 00:14:52,170 Let's of the query and we'll see our result. 436 00:14:53,150 --> 00:14:54,420 Just scroll up a little bit here. 437 00:14:54,650 --> 00:14:55,850 And we'll see how the 438 00:14:55,990 --> 00:14:57,720 query did effectively invert the bookstore. 439 00:14:58,530 --> 00:15:01,190 Now we have our authors as the outermost elements. 440 00:15:02,100 --> 00:15:03,450 Within each author, the first 441 00:15:03,700 --> 00:15:05,060 one Jeffrey Ullman, we have 442 00:15:05,320 --> 00:15:06,350 the books that were written by 443 00:15:06,550 --> 00:15:08,280 that author, including their ISBN 444 00:15:08,760 --> 00:15:11,260 and price as attributes and the title as a sub-element. 445 00:15:13,200 --> 00:15:14,390 Now in my original version of 446 00:15:14,510 --> 00:15:15,610 the query, I didn't include 447 00:15:16,120 --> 00:15:17,190 the edition or the remark 448 00:15:17,930 --> 00:15:19,830 since those didn't appear in every book. 449 00:15:20,360 --> 00:15:23,320 But in fact, it doesn't cause any problem at all to add those in. 450 00:15:23,930 --> 00:15:26,110 So let me just add right here the edition. 451 00:15:27,950 --> 00:15:29,270 So we're using dollar B as 452 00:15:29,380 --> 00:15:32,500 our variable for the book and the edition is an attribute. 453 00:15:34,220 --> 00:15:35,340 Again, not every book has 454 00:15:35,490 --> 00:15:36,480 an edition, but we'll see that 455 00:15:36,590 --> 00:15:38,240 it's not going to cause a problem when it doesn't. 456 00:15:39,000 --> 00:15:40,410 And similarly, down here we'll 457 00:15:40,670 --> 00:15:43,190 include the remark, again, 458 00:15:43,520 --> 00:15:44,790 for those books that include them. 459 00:15:45,750 --> 00:15:47,290 Let's run the query 460 00:15:48,380 --> 00:15:49,300 and let's take a look at 461 00:15:49,370 --> 00:15:51,000 our results and see what happened. 462 00:15:51,470 --> 00:15:53,010 So for Jeffrey Ullman's first 463 00:15:53,280 --> 00:15:54,900 book, we got the edition, for 464 00:15:54,980 --> 00:15:55,990 the second book we got the 465 00:15:56,050 --> 00:15:58,460 remark, and everything just worked out just fine. 466 00:15:58,950 --> 00:16:00,570 So, again, there is our inverted bookstore. 467 00:16:02,160 --> 00:16:04,480 That concludes our demonstration of the XQuery language. 468 00:16:05,160 --> 00:16:07,880 I'll again emphasize that it's a very large language, very powerful. 469 00:16:08,440 --> 00:16:10,190 We've seen a number of different constructs. 470 00:16:10,700 --> 00:16:12,260 We've written some pretty interesting queries. 471 00:16:13,000 --> 00:16:14,460 But to fully understand the language, 472 00:16:15,010 --> 00:16:16,090 you'll need to run a bunch 473 00:16:16,290 --> 00:16:18,680 of queries for yourself, and experiment with how it works.