1 00:00:00,280 --> 00:00:01,370 in this video, we'll be 2 00:00:01,460 --> 00:00:02,910 giving a demo of constraint of 3 00:00:03,000 --> 00:00:04,540 several types, as a 4 00:00:04,580 --> 00:00:06,270 reminder constraints also known 5 00:00:06,450 --> 00:00:08,370 as integrity constraints impose restrictions 6 00:00:08,820 --> 00:00:10,010 on the allowable states of a 7 00:00:10,040 --> 00:00:11,410 database, beyond those that 8 00:00:11,540 --> 00:00:12,510 are imposed by the schema 9 00:00:12,960 --> 00:00:14,510 that's been defined and the types of the attributes. 10 00:00:15,890 --> 00:00:17,050 We have a number of different types 11 00:00:17,310 --> 00:00:18,700 of constraints, we have non 12 00:00:19,130 --> 00:00:20,920 null constraints which specified that 13 00:00:21,060 --> 00:00:22,440 a particular attribute cannot have 14 00:00:22,620 --> 00:00:24,090 no values, we have 15 00:00:24,250 --> 00:00:25,320 key constraints that talk about 16 00:00:25,470 --> 00:00:27,180 uniqueness in columns or 17 00:00:27,240 --> 00:00:28,900 sets of columns, we have 18 00:00:29,470 --> 00:00:30,810 attribute base and tuple base 19 00:00:31,090 --> 00:00:32,830 constraints which specify, a 20 00:00:33,290 --> 00:00:34,250 restrictions on the values 21 00:00:34,580 --> 00:00:35,490 and attributes or the values 22 00:00:35,890 --> 00:00:37,140 across attributes in particular 23 00:00:37,570 --> 00:00:38,980 tuples and finally we have 24 00:00:39,270 --> 00:00:40,630 general insertions which are 25 00:00:40,740 --> 00:00:42,370 quite powerful, they allow you 26 00:00:42,500 --> 00:00:44,570 to specify constraints across an entire database. 27 00:00:45,720 --> 00:00:46,930 As we'll see in the demo, not 28 00:00:47,260 --> 00:00:49,170 all of these constraint types are fully implemented. 29 00:00:49,800 --> 00:00:51,570 There are some limits on the 30 00:00:52,470 --> 00:00:53,640 attribute base and tuple base constraints 31 00:00:54,110 --> 00:00:55,470 in systems as compared to 32 00:00:55,530 --> 00:00:57,140 the sequel standard and general 33 00:00:57,540 --> 00:00:58,810 assertions have not been implemented 34 00:00:59,380 --> 00:01:00,590 yet in any database system, 35 00:01:01,000 --> 00:01:04,230 but we will give examples what they look like, had they been implemented. 36 00:01:05,940 --> 00:01:07,230 A very important type of 37 00:01:07,360 --> 00:01:09,220 constraint, is referential integrity or 38 00:01:09,420 --> 00:01:12,280 foreign key constraints and those will be covered in the next video. 39 00:01:13,590 --> 00:01:15,620 For a demonstration of constraints we'll 40 00:01:15,720 --> 00:01:17,200 be returning to the same simple 41 00:01:17,450 --> 00:01:18,700 college admissions database that we 42 00:01:18,800 --> 00:01:20,610 use for our SQL demos, we 43 00:01:20,760 --> 00:01:21,980 have three tables one with 44 00:01:22,140 --> 00:01:23,470 a few colleges, one with 45 00:01:23,610 --> 00:01:24,830 a number of students and finally 46 00:01:25,240 --> 00:01:27,480 a table that has information about students applying to colleges. 47 00:01:28,860 --> 00:01:31,320 Let's start by creating a table with a non-null constraint. 48 00:01:31,930 --> 00:01:33,660 So non-null is a pretty simple type of constraint. 49 00:01:34,170 --> 00:01:35,370 If we decide that our GPA 50 00:01:35,880 --> 00:01:37,000 values in our database must 51 00:01:37,230 --> 00:01:38,090 not take on the null value 52 00:01:38,910 --> 00:01:40,020 when we create the table we 53 00:01:40,180 --> 00:01:41,040 just add the key words 54 00:01:41,380 --> 00:01:43,870 not null in the declaration with that attribute. 55 00:01:45,080 --> 00:01:46,350 Let's run the creation of the 56 00:01:46,420 --> 00:01:47,890 table, let me mention 57 00:01:48,360 --> 00:01:49,260 right up front we're going to 58 00:01:49,320 --> 00:01:50,210 be seeing a lot of this 59 00:01:50,400 --> 00:01:52,240 word affected, this misspelling here 60 00:01:52,360 --> 00:01:54,300 which gets on my nerve but I'm not going to mention it again. 61 00:01:55,320 --> 00:01:56,640 Okay, so let's do some 62 00:01:56,820 --> 00:01:59,540 insertions and updates just to experiment with a not null constraint. 63 00:02:01,080 --> 00:02:02,550 We'll start by asserting three tuples, 64 00:02:03,310 --> 00:02:04,620 the first one has no null 65 00:02:04,890 --> 00:02:06,490 values at all, the second 66 00:02:06,610 --> 00:02:07,850 one has a null value for the 67 00:02:07,950 --> 00:02:09,080 high school size which should 68 00:02:09,310 --> 00:02:10,520 be allowed, and the third 69 00:02:10,860 --> 00:02:11,750 one has a null value 70 00:02:12,110 --> 00:02:13,750 for the GPA which should not be allowed. 71 00:02:14,310 --> 00:02:15,310 Let's run these three insert 72 00:02:15,640 --> 00:02:16,870 commands together and we 73 00:02:17,100 --> 00:02:18,260 see in fact the first two 74 00:02:18,430 --> 00:02:20,570 succeeded where the third one generated an error. 75 00:02:21,340 --> 00:02:22,110 If we go and look at the 76 00:02:22,180 --> 00:02:23,430 table, we'll see that 77 00:02:23,590 --> 00:02:24,800 indeed we got our first 78 00:02:25,110 --> 00:02:26,200 two tuples including the null 79 00:02:26,530 --> 00:02:28,020 for the high school size but there 80 00:02:28,090 --> 00:02:29,540 was no third tuple inserted. 81 00:02:31,000 --> 00:02:32,410 Now we'll try a couple of update commands. 82 00:02:33,790 --> 00:02:34,540 Both of them are going 83 00:02:34,610 --> 00:02:35,840 to set the GPA to null, the 84 00:02:35,980 --> 00:02:37,430 first one for the student with 85 00:02:37,690 --> 00:02:40,530 ID 123 and the second for the student with ID 456. 86 00:02:41,080 --> 00:02:41,980 If we look at our 87 00:02:42,190 --> 00:02:43,240 data, we see that we 88 00:02:43,360 --> 00:02:44,500 do have a student with ID 89 00:02:44,660 --> 00:02:46,390 123, so when we try 90 00:02:46,680 --> 00:02:48,770 to update that GPA to null, we should get an error. 91 00:02:49,200 --> 00:02:50,090 But we don't have a student 92 00:02:50,550 --> 00:02:51,480 whose ID is 456, so 93 00:02:52,900 --> 00:02:53,730 even though we're going to run a 94 00:02:53,760 --> 00:02:55,380 command that tries to set 95 00:02:55,470 --> 00:02:56,740 GPAs to null, because there's no 96 00:02:56,900 --> 00:02:58,470 matching data, no data will 97 00:02:58,710 --> 00:03:00,300 attempt to be updated and we won't get an error. 98 00:03:00,780 --> 00:03:02,210 Let's run the query, the two 99 00:03:02,450 --> 00:03:03,420 updates, and we see 100 00:03:03,680 --> 00:03:04,720 indeed that the first one caused 101 00:03:05,010 --> 00:03:07,220 the constraint violation and the second one did not. 102 00:03:08,620 --> 00:03:09,630 Now let's take a look at key constraints. 103 00:03:10,760 --> 00:03:12,090 I've dropped the previous version of 104 00:03:12,180 --> 00:03:13,360 the student table and now 105 00:03:13,590 --> 00:03:14,530 we will create a new one 106 00:03:14,970 --> 00:03:15,960 where we're declaring the student 107 00:03:16,480 --> 00:03:18,400 ID to be what's called a primary key. 108 00:03:19,180 --> 00:03:20,940 As you may remember, a key 109 00:03:21,350 --> 00:03:22,640 constraint specifies that the 110 00:03:22,710 --> 00:03:23,620 values in the column 111 00:03:24,000 --> 00:03:25,760 that's declared as a key must be unique. 112 00:03:26,590 --> 00:03:27,370 So let's go ahead and create 113 00:03:27,650 --> 00:03:29,090 the table and now let's 114 00:03:29,280 --> 00:03:31,300 experiment with inserting and updating some data. 115 00:03:32,660 --> 00:03:33,970 We'll attempt to insert three students: 116 00:03:34,760 --> 00:03:36,990 first one, 123 Amy; second, 117 00:03:37,460 --> 00:03:39,640 234 Bob and third one, 123 Craig. 118 00:03:40,720 --> 00:03:42,110 Since the third insert will generate 119 00:03:42,670 --> 00:03:44,170 a key violation because there will 120 00:03:44,250 --> 00:03:45,340 be two copies of 123 in 121 00:03:45,520 --> 00:03:48,440 the ID column, that one should generate an error. 122 00:03:49,270 --> 00:03:50,380 We run the queries, the inserts, 123 00:03:51,120 --> 00:03:52,250 and indeed the first two 124 00:03:52,680 --> 00:03:54,570 are fine and the third one has a key error. 125 00:03:55,230 --> 00:03:55,970 If we go and look at the 126 00:03:56,040 --> 00:03:57,400 data itself, we'll see that 127 00:03:57,520 --> 00:03:59,540 the first two are inserted and the third one wasn't. 128 00:04:00,710 --> 00:04:01,800 Now let's take a look at updates. 129 00:04:03,290 --> 00:04:04,610 The first update is very simple. 130 00:04:04,890 --> 00:04:06,620 It tries to set Bob's ID to 123. 131 00:04:06,710 --> 00:04:09,640 Since Amy already has ID 132 00:04:09,940 --> 00:04:11,170 12 three that should generate and 133 00:04:11,390 --> 00:04:12,410 error and when we run 134 00:04:12,640 --> 00:04:14,110 the update command indeed it does. 135 00:04:15,510 --> 00:04:17,260 Now we're going to do something a little bit trickier. 136 00:04:18,860 --> 00:04:19,950 We're gonna run an update command 137 00:04:20,720 --> 00:04:23,050 that subtracts 111 from each 138 00:04:23,240 --> 00:04:24,240 student ID, now you might 139 00:04:24,830 --> 00:04:25,690 wonder why did I choose 140 00:04:26,010 --> 00:04:27,260 111, let's take a 141 00:04:27,360 --> 00:04:28,970 look, if we subtract 142 00:04:29,450 --> 00:04:31,540 111 from Bob's ID; two, 143 00:04:31,720 --> 00:04:32,720 three, four will turn into one, 144 00:04:32,950 --> 00:04:33,970 two, three and will have a 145 00:04:34,010 --> 00:04:35,480 key violation, on the 146 00:04:35,640 --> 00:04:36,460 other hand if the command 147 00:04:37,040 --> 00:04:38,480 first updates Amy's student ID 148 00:04:39,530 --> 00:04:40,760 to 12, then we won't 149 00:04:41,000 --> 00:04:42,120 have a key violation when Bob's 150 00:04:42,510 --> 00:04:44,220 in turned into two three, into one two three. 151 00:04:45,110 --> 00:04:45,950 So whether we get a key 152 00:04:46,460 --> 00:04:47,800 violation in this case could 153 00:04:48,100 --> 00:04:49,420 depend on what order 154 00:04:49,850 --> 00:04:51,790 the system chooses to execute the update. 155 00:04:51,990 --> 00:04:53,780 So let's just run it and let's see what happens. 156 00:04:54,700 --> 00:04:55,620 Well, things look good. 157 00:04:55,840 --> 00:04:56,560 We didn't get an error. 158 00:04:57,240 --> 00:04:59,230 Let's go look back and refresh 159 00:04:59,740 --> 00:05:00,880 the table and we see indeed 160 00:05:01,440 --> 00:05:02,780 that both of the update 161 00:05:03,380 --> 00:05:04,750 succeeded without a violation, 162 00:05:05,740 --> 00:05:06,820 so now let's set the 163 00:05:07,070 --> 00:05:08,810 as the student ID 164 00:05:09,040 --> 00:05:10,020 back to what they were by 165 00:05:10,170 --> 00:05:12,170 adding 111, let's run 166 00:05:12,540 --> 00:05:15,850 it see what happens, well this time we got an error. 167 00:05:16,690 --> 00:05:18,270 So we got a constraint violation 168 00:05:18,830 --> 00:05:20,950 error, a key violation, and nothing was updated. 169 00:05:21,970 --> 00:05:23,600 That's presumably because the 170 00:05:24,020 --> 00:05:26,370 system again updated Amy's 171 00:05:26,760 --> 00:05:27,970 ID first and that generated 172 00:05:28,310 --> 00:05:29,870 an error with the one two three for Amy. 173 00:05:30,770 --> 00:05:31,690 So this sort of demonstrates 174 00:05:32,250 --> 00:05:33,150 it one it can be pretty 175 00:05:33,450 --> 00:05:35,270 tricky when key violations 176 00:05:35,950 --> 00:05:37,600 or other types of constraint violations are 177 00:05:37,670 --> 00:05:38,860 detected and when they 178 00:05:39,060 --> 00:05:40,160 aren't, now we did 179 00:05:40,450 --> 00:05:41,740 mention earlier that there's 180 00:05:41,890 --> 00:05:43,380 a notion of for constraint checking 181 00:05:43,800 --> 00:05:45,160 so if an application have knowledge 182 00:05:45,780 --> 00:05:47,260 that it would rather have constraints checked 183 00:05:47,620 --> 00:05:48,600 after a bunch of changes 184 00:05:49,080 --> 00:05:50,340 rather than in the middle, the 185 00:05:50,510 --> 00:05:51,690 for constraint checking can be 186 00:05:51,770 --> 00:05:53,260 use for that purpose and this 187 00:05:53,480 --> 00:05:55,070 demo we're doing immediate constraint checking. 188 00:05:56,770 --> 00:05:57,740 You might have noticed in the 189 00:05:57,810 --> 00:05:59,040 previous example that I 190 00:05:59,130 --> 00:06:00,940 use the term primary key when 191 00:06:01,070 --> 00:06:02,440 I declared the student ID as a key. 192 00:06:03,590 --> 00:06:04,900 In the SQL standard and in 193 00:06:04,970 --> 00:06:06,410 every database system only one 194 00:06:06,910 --> 00:06:08,110 primary key is allowed per 195 00:06:08,250 --> 00:06:09,260 table, that's why it's called primary, 196 00:06:09,810 --> 00:06:11,180 and often the table will 197 00:06:11,240 --> 00:06:13,040 be organized based on that 198 00:06:13,210 --> 00:06:14,470 key, making it efficient to 199 00:06:14,550 --> 00:06:16,830 do look ups on that, for values for that particular key. 200 00:06:17,540 --> 00:06:18,820 So if we decided 201 00:06:19,200 --> 00:06:20,560 we wanted to declare two primary 202 00:06:21,040 --> 00:06:22,470 keys in our table, the student 203 00:06:22,870 --> 00:06:24,450 ID and the student name, we 204 00:06:24,570 --> 00:06:26,560 would get an error, now 205 00:06:26,980 --> 00:06:27,830 that's not to say we're 206 00:06:27,990 --> 00:06:29,120 not allow to have multiple keys 207 00:06:29,400 --> 00:06:30,310 in a table, in fact we can 208 00:06:30,440 --> 00:06:31,260 have as many as we want, 209 00:06:31,730 --> 00:06:32,620 only one of them can 210 00:06:32,830 --> 00:06:34,420 be declared as primary but we 211 00:06:34,580 --> 00:06:35,750 can declare any number of 212 00:06:35,990 --> 00:06:37,020 attributes or sets of attributes 213 00:06:37,810 --> 00:06:39,180 to be unique and that's again 214 00:06:39,560 --> 00:06:41,250 declaring a key constraint, it 215 00:06:41,380 --> 00:06:42,330 says we can only have one, 216 00:06:43,260 --> 00:06:44,700 we must have unique values 217 00:06:45,300 --> 00:06:46,460 in that column, so let's 218 00:06:46,670 --> 00:06:48,190 create our table with the 219 00:06:48,360 --> 00:06:49,770 student name now also a 220 00:06:49,810 --> 00:06:50,730 key along with the student 221 00:06:51,350 --> 00:06:52,640 ID and we'll do 222 00:06:52,860 --> 00:06:54,730 a few updates just to check that. 223 00:06:56,380 --> 00:06:57,820 So we'll attempt to insert five 224 00:06:58,140 --> 00:06:59,840 students, 1-2-3 Amy, 2-3-4 225 00:07:00,340 --> 00:07:01,590 Bob, so far so good. 226 00:07:02,440 --> 00:07:04,050 When we try 3,4,5 Amy we 227 00:07:04,180 --> 00:07:05,160 should get an error because we 228 00:07:05,260 --> 00:07:06,340 have now declared that the name 229 00:07:06,750 --> 00:07:07,700 must be a key as well 230 00:07:07,980 --> 00:07:10,290 as the student ID so we won't be allowed to have 2 Amy. 231 00:07:11,220 --> 00:07:12,580 4,5,6 door should be good. 232 00:07:12,910 --> 00:07:14,680 5,6,7 Amy should again 233 00:07:14,950 --> 00:07:16,390 generate an error, we ran 234 00:07:16,570 --> 00:07:18,470 the query and indeed we get two errors. 235 00:07:19,850 --> 00:07:21,100 So far we seen only 236 00:07:21,310 --> 00:07:22,870 keys that are one attribute but 237 00:07:23,100 --> 00:07:24,210 as you know we can have keys 238 00:07:24,490 --> 00:07:26,100 that spans several attributes, that's 239 00:07:26,340 --> 00:07:27,380 not to say that each attribute 240 00:07:27,780 --> 00:07:29,160 is the key individually but rather 241 00:07:29,520 --> 00:07:31,030 the combination of values for 242 00:07:31,200 --> 00:07:33,140 all of the attributes must be unique in each tuple. 243 00:07:33,630 --> 00:07:35,200 So let's suppose that our 244 00:07:35,330 --> 00:07:36,640 college name is not unique 245 00:07:36,990 --> 00:07:38,540 on its own but college name 246 00:07:38,780 --> 00:07:40,060 and state together are expected 247 00:07:40,570 --> 00:07:42,580 to be unique, now syntactically we 248 00:07:42,720 --> 00:07:44,180 can't put the primary key 249 00:07:44,330 --> 00:07:45,630 statement with the attribute anymore 250 00:07:46,010 --> 00:07:48,150 because it involves multiple attributes, so 251 00:07:48,230 --> 00:07:49,290 the syntax is to list 252 00:07:49,670 --> 00:07:50,900 the attributes first in the 253 00:07:50,960 --> 00:07:52,620 Create Table command, then use 254 00:07:52,820 --> 00:07:54,200 the keywords primary key and 255 00:07:54,320 --> 00:07:57,210 put the list of attributes constituting the key in parentheses. 256 00:07:58,260 --> 00:07:59,130 So let's create the table. 257 00:08:00,340 --> 00:08:01,250 Now let's insert some data. 258 00:08:02,830 --> 00:08:03,630 I've tried to pick a college 259 00:08:04,050 --> 00:08:05,250 name that's kind of generic Mason, 260 00:08:05,700 --> 00:08:06,490 I don't know if I've succeeded 261 00:08:07,010 --> 00:08:08,010 but we'll try to answer the 262 00:08:08,100 --> 00:08:09,760 Mason college in California, a 263 00:08:10,170 --> 00:08:11,290 Mason college in New York 264 00:08:11,930 --> 00:08:13,280 those should succeed because the 265 00:08:13,350 --> 00:08:14,760 two columns together need to 266 00:08:14,840 --> 00:08:15,930 be unique but not the individual 267 00:08:16,260 --> 00:08:17,270 column and then we should 268 00:08:17,430 --> 00:08:18,310 get an error when we try 269 00:08:18,570 --> 00:08:19,940 to generate a third tuple 270 00:08:20,480 --> 00:08:22,150 with Mason California. We run 271 00:08:22,370 --> 00:08:24,820 the query, we run the inserts and indeed we do. 272 00:08:26,740 --> 00:08:29,190 Now lets use multi-attribute keys to declare some interesting constraints. 273 00:08:30,360 --> 00:08:31,430 We're going to create our apply 274 00:08:31,900 --> 00:08:34,050 table and we're going to have two key constraints. 275 00:08:35,160 --> 00:08:36,430 The first one says that 276 00:08:36,590 --> 00:08:38,010 the combination of student ID 277 00:08:38,180 --> 00:08:40,570 and college name must be unique in each tuple. 278 00:08:41,360 --> 00:08:42,390 What that's really saying is that 279 00:08:42,710 --> 00:08:45,320 each student can apply to each college only one time. 280 00:08:46,330 --> 00:08:47,470 We're also going to say that 281 00:08:47,620 --> 00:08:48,760 the combination of student ID 282 00:08:49,160 --> 00:08:51,770 and major must be unique in each tuple. 283 00:08:51,920 --> 00:08:53,190 That means that each student can 284 00:08:53,400 --> 00:08:55,320 apply to each major only once. 285 00:08:55,870 --> 00:08:56,870 Now, a student can still apply 286 00:08:57,160 --> 00:08:58,300 to several colleges and several 287 00:08:58,620 --> 00:09:00,540 majors, but only one time for each. 288 00:09:01,340 --> 00:09:04,220 So let's create the table and then let's try inserting some data. 289 00:09:06,250 --> 00:09:07,350 We'll insert quite a number of tuples 290 00:09:07,860 --> 00:09:09,620 and lets take a look at what we expect to happen. 291 00:09:10,510 --> 00:09:12,110 Our first tuple says 123 292 00:09:12,240 --> 00:09:14,070 applies to Stanford and CS 293 00:09:14,610 --> 00:09:16,570 and then 123 also applies to Berkeley and EE, no problem. 294 00:09:17,000 --> 00:09:18,590 123 tries to apply 295 00:09:18,970 --> 00:09:20,460 again to Stanford and that 296 00:09:20,650 --> 00:09:21,860 should be an error because 297 00:09:22,210 --> 00:09:23,210 that's the second instance of 298 00:09:23,300 --> 00:09:25,350 the combination of 123 Stanford. 299 00:09:26,160 --> 00:09:28,480 On the other hand, 234 should be able to apply to Stanford. 300 00:09:29,020 --> 00:09:30,810 123 comes back and 301 00:09:30,920 --> 00:09:32,170 wants to go to MIT but 302 00:09:32,330 --> 00:09:34,650 tries once again to major in EE. 303 00:09:34,860 --> 00:09:35,910 That should generate an error because 304 00:09:36,170 --> 00:09:37,370 the combination of 123 and 305 00:09:37,730 --> 00:09:39,810 EE already appears in our second tuple. 306 00:09:40,560 --> 00:09:41,940 And finally 123 applies to 307 00:09:42,120 --> 00:09:44,540 MIT but in biology and that should work just fine. 308 00:09:45,260 --> 00:09:46,420 So we'll run the query and 309 00:09:46,610 --> 00:09:47,750 we'll find indeed the first 310 00:09:48,040 --> 00:09:49,510 two tuples and the fourth 311 00:09:49,770 --> 00:09:51,210 and the sixth were fine but 312 00:09:51,390 --> 00:09:52,840 the third tuple generated an error 313 00:09:53,050 --> 00:09:54,420 because of the second application to 314 00:09:54,490 --> 00:09:55,720 Stanford and the fifth 315 00:09:55,920 --> 00:09:57,400 because of the second application to EE. 316 00:09:58,360 --> 00:09:59,310 Let's go take a look at the data. 317 00:09:59,940 --> 00:10:00,960 And here we see in the 318 00:10:01,180 --> 00:10:02,390 apply relation that we did 319 00:10:02,600 --> 00:10:04,000 indeed insert the four tuples, 320 00:10:04,530 --> 00:10:05,520 but not the two tuples 321 00:10:05,820 --> 00:10:07,060 that generated the key error. 322 00:10:07,790 --> 00:10:09,400 Now we'll try a sneaky update command. 323 00:10:10,570 --> 00:10:11,590 We'll try to take our fourth 324 00:10:12,010 --> 00:10:13,390 tuple, and we'll identify it 325 00:10:13,500 --> 00:10:14,490 by having the college name equal 326 00:10:14,560 --> 00:10:16,060 to MIT, and we'll try to be 327 00:10:16,380 --> 00:10:19,050 sneaky and change the the biology major to CS. 328 00:10:20,070 --> 00:10:21,570 That will then violate the 329 00:10:21,640 --> 00:10:23,020 constraint of the uniqueness 330 00:10:23,590 --> 00:10:25,440 of 123 CS so if 331 00:10:25,600 --> 00:10:26,820 all goes well that update will 332 00:10:26,960 --> 00:10:28,630 be disallow, here is 333 00:10:28,730 --> 00:10:29,960 the update command, setting the 334 00:10:30,040 --> 00:10:31,410 major to CS with the 335 00:10:31,510 --> 00:10:33,070 college name is MIT, rerun 336 00:10:33,420 --> 00:10:35,560 the command and indeed it generates an error. 337 00:10:36,920 --> 00:10:37,880 The last thing we'll show in 338 00:10:37,960 --> 00:10:39,400 this example is how NULL 339 00:10:39,770 --> 00:10:41,520 values work with keys, so 340 00:10:41,630 --> 00:10:42,850 we'll try to insert two tuples 341 00:10:44,040 --> 00:10:44,950 again using 1,2,3, where both 342 00:10:45,440 --> 00:10:47,550 the college name and the major are null. 343 00:10:48,410 --> 00:10:49,790 So, as a reminder, the first 344 00:10:50,020 --> 00:10:51,140 and second attributes need to 345 00:10:51,200 --> 00:10:52,530 be unique in the first and third 346 00:10:52,960 --> 00:10:54,140 attributes need to be 347 00:10:54,240 --> 00:10:55,880 unique, so if NULLs 348 00:10:56,610 --> 00:10:57,860 counts for keys so it 349 00:10:58,030 --> 00:10:59,330 will generate an error, what we'll 350 00:10:59,610 --> 00:11:00,790 see is that we actually don't get 351 00:11:01,000 --> 00:11:02,180 an error and we in 352 00:11:02,470 --> 00:11:04,130 fact do have the data 353 00:11:04,540 --> 00:11:06,350 in the table with 354 00:11:06,560 --> 00:11:08,350 the NULL values, so the 355 00:11:08,470 --> 00:11:10,290 sequel standard and most database 356 00:11:10,720 --> 00:11:12,510 systems do allow repeated 357 00:11:13,000 --> 00:11:14,360 NULL values even in column 358 00:11:14,820 --> 00:11:16,800 that are declared as unique, for 359 00:11:16,930 --> 00:11:18,240 primary key declared columns 360 00:11:18,690 --> 00:11:20,070 most systems though not all, 361 00:11:20,610 --> 00:11:22,680 do not permit repeated NULL values in them. 362 00:11:22,850 --> 00:11:25,410 That completes our demonstration of 363 00:11:25,590 --> 00:11:27,440 key constraints, now let's 364 00:11:27,620 --> 00:11:30,250 look at attribute base check constraints. 365 00:11:31,020 --> 00:11:31,880 Lets create our table again with four students 366 00:11:32,500 --> 00:11:35,450 and this time we'll add two constraints to two of the attributes. 367 00:11:36,460 --> 00:11:37,960 For the GPA we're going to 368 00:11:38,130 --> 00:11:39,950 add the keyword check and a 369 00:11:40,010 --> 00:11:42,600 condition that looks kinda like the where clause in the SQL query. 370 00:11:43,370 --> 00:11:45,420 This condition specifies that GPAs 371 00:11:46,020 --> 00:11:46,790 must be less than or equal 372 00:11:47,690 --> 00:11:49,030 to 4.0 and greater than zero. 373 00:11:50,230 --> 00:11:51,200 We'll also put a check 374 00:11:51,410 --> 00:11:52,430 constraint on the high school 375 00:11:52,730 --> 00:11:53,670 size, saying that the size 376 00:11:53,950 --> 00:11:55,450 of the high school must be less than five thousand. 377 00:11:56,030 --> 00:11:57,660 So these are examples of, 378 00:11:58,100 --> 00:11:59,410 sort of, sanity checks, that are 379 00:11:59,450 --> 00:12:00,990 mostly use for catching data 380 00:12:01,280 --> 00:12:02,530 entry errors, saying that the 381 00:12:02,580 --> 00:12:04,980 attribute values must be within the expected range. 382 00:12:06,210 --> 00:12:09,010 Lets create the table and now we'll take a look at some data. 383 00:12:10,480 --> 00:12:11,760 This time we'll insert two tuples. 384 00:12:12,280 --> 00:12:14,410 It will be pretty easy to see how these constraints work. 385 00:12:15,110 --> 00:12:16,340 The first one inserts Amy 386 00:12:16,540 --> 00:12:17,990 with a reasonable GPA and a 387 00:12:18,440 --> 00:12:20,080 reasonable high school size; the 388 00:12:20,290 --> 00:12:21,720 second one inserts Bob with 389 00:12:22,270 --> 00:12:24,620 a reasonable high school size 390 00:12:24,920 --> 00:12:26,580 but his GPA looks a little out of whack. 391 00:12:27,080 --> 00:12:28,420 We run the query and the 392 00:12:28,610 --> 00:12:30,410 first row is inserted but the second one isn't. 393 00:12:31,400 --> 00:12:34,300 We take a look at the data and we see that Amy has been inserted. 394 00:12:35,500 --> 00:12:36,940 Now to test the constraints on 395 00:12:37,060 --> 00:12:38,510 the size of high school, we'll 396 00:12:38,830 --> 00:12:39,790 try to run an update command 397 00:12:40,270 --> 00:12:42,400 that multiplies all high school sizes by six. 398 00:12:43,840 --> 00:12:46,310 Here's the command, and when we run it, we get an error. 399 00:12:47,490 --> 00:12:49,060 So attribute based constraints allow 400 00:12:49,460 --> 00:12:51,050 us to associate a condition with 401 00:12:51,220 --> 00:12:52,440 a specific attribute and that 402 00:12:52,640 --> 00:12:54,080 condition is checked whenever we 403 00:12:54,270 --> 00:12:55,690 insert a tuple or update 404 00:12:56,040 --> 00:12:57,040 a tuple to make sure 405 00:12:57,400 --> 00:12:59,950 that all of the values in that attribute satisfy the constraint. 406 00:13:01,720 --> 00:13:03,710 A slightly more general notion is tuple based constraints. 407 00:13:04,970 --> 00:13:06,040 Tuple based constraints are also 408 00:13:06,530 --> 00:13:07,550 checked when ever a tuple 409 00:13:07,770 --> 00:13:09,580 is inserted or updated, but 410 00:13:09,770 --> 00:13:10,680 they're allowed to talk about 411 00:13:10,930 --> 00:13:12,740 relationships between different values 412 00:13:13,190 --> 00:13:14,620 in each tuple. 413 00:13:14,920 --> 00:13:16,190 And because we don't associate them 414 00:13:16,360 --> 00:13:18,050 with a specific attribute, the check 415 00:13:18,400 --> 00:13:19,980 itself is put at 416 00:13:20,130 --> 00:13:22,080 the end of the declaration of of the table. 417 00:13:22,500 --> 00:13:23,610 So, we start by declaring all 418 00:13:23,780 --> 00:13:25,500 of the attributes and then afterwards 419 00:13:25,990 --> 00:13:27,090 we put the keyword check again, 420 00:13:27,840 --> 00:13:29,090 and then the condition inside parentheses. 421 00:13:30,350 --> 00:13:31,690 Now this condition may look 422 00:13:32,120 --> 00:13:33,380 at first a little bit odd to you. 423 00:13:33,630 --> 00:13:34,600 It says that for each 424 00:13:35,020 --> 00:13:36,960 apply tuple either the 425 00:13:37,210 --> 00:13:38,480 decision is null or the 426 00:13:38,750 --> 00:13:39,950 college name is not Stanford 427 00:13:40,960 --> 00:13:42,310 or the major is not CS. 428 00:13:43,280 --> 00:13:45,740 Why don't you think about that for a second and think about what it might be saying. 429 00:13:47,900 --> 00:13:48,700 Now if you're good in Boolean 430 00:13:49,160 --> 00:13:50,000 Logic, you might have written 431 00:13:50,200 --> 00:13:52,580 this down using logical expressions and 432 00:13:53,220 --> 00:13:54,270 use some of De Morgan's 433 00:13:54,470 --> 00:13:57,140 laws and turned your or's and not's into implications. 434 00:13:58,420 --> 00:13:59,640 If not, I'll just tell 435 00:13:59,770 --> 00:14:00,610 you that what this is saying 436 00:14:01,020 --> 00:14:01,970 is that there are no 437 00:14:02,560 --> 00:14:03,460 people who have applied 438 00:14:03,870 --> 00:14:06,190 to Stanford and been admitted to CS at Stanford. 439 00:14:07,080 --> 00:14:08,490 Specifically, either they haven't been 440 00:14:08,680 --> 00:14:09,640 admitted or the college is 441 00:14:09,750 --> 00:14:11,620 not Stanford or the major is not CS. 442 00:14:12,740 --> 00:14:15,720 We'll create the table and then we'll experiment with some data. 443 00:14:16,650 --> 00:14:18,270 First we'll try to insert three tuples. 444 00:14:19,250 --> 00:14:20,360 The first one has a student 445 00:14:20,530 --> 00:14:22,000 applying to Stanford CS but 446 00:14:22,140 --> 00:14:23,750 not being admitted, second they 447 00:14:23,870 --> 00:14:25,910 apply to CS but 448 00:14:26,070 --> 00:14:27,340 it says MIT and they are 449 00:14:27,410 --> 00:14:28,730 admitted and then finally will 450 00:14:28,900 --> 00:14:30,680 generate a constraint violation by 451 00:14:30,810 --> 00:14:33,240 having the student apply to Stanford CS and be admitted. 452 00:14:33,910 --> 00:14:35,000 We run the query and, as 453 00:14:35,090 --> 00:14:36,480 expected the first two 454 00:14:36,760 --> 00:14:38,870 tuples are inserted and the third generates a violation. 455 00:14:40,150 --> 00:14:41,240 Now let's try some update statements. 456 00:14:42,100 --> 00:14:43,180 So we have a student who 457 00:14:43,510 --> 00:14:45,530 applied to Stanford CS and was not admitted. 458 00:14:45,850 --> 00:14:47,010 And with a student who 459 00:14:47,160 --> 00:14:49,150 applied to MITCS and was admitted. 460 00:14:50,080 --> 00:14:51,420 So first we'll try to 461 00:14:51,590 --> 00:14:53,080 take that Standford student and 462 00:14:53,190 --> 00:14:54,230 change the decision to yes, 463 00:14:55,520 --> 00:14:56,290 that's not going to work. 464 00:14:56,790 --> 00:14:58,200 So then we'll try taking the 465 00:14:58,830 --> 00:15:00,300 students admission to MIT and 466 00:15:00,920 --> 00:15:01,740 converting that to be an 467 00:15:01,840 --> 00:15:03,970 admission to Stanford and that shouldn't work either. 468 00:15:04,810 --> 00:15:05,880 We try all of those and 469 00:15:06,000 --> 00:15:07,370 neither of them succeed and 470 00:15:07,510 --> 00:15:08,620 both cases are tuple based 471 00:15:08,910 --> 00:15:11,180 constraintless check and the check condition was violated. 472 00:15:12,700 --> 00:15:13,400 Before I do my last 473 00:15:13,650 --> 00:15:14,690 set of examples, I did want 474 00:15:14,900 --> 00:15:17,750 to explain one thing in case you're trying these constraints at home. 475 00:15:17,990 --> 00:15:19,510 The constraints that I've shown 476 00:15:19,820 --> 00:15:21,220 so far were perfectly well in 477 00:15:21,380 --> 00:15:22,720 SQLite and in post risks. 478 00:15:23,520 --> 00:15:24,730 In my SQL as of 479 00:15:24,860 --> 00:15:26,570 the time of this video the 480 00:15:26,750 --> 00:15:28,260 check constraints both the attribute 481 00:15:28,600 --> 00:15:29,720 based and tuple based check constraints 482 00:15:30,870 --> 00:15:31,690 are accepted syntactically[sp?] 483 00:15:32,390 --> 00:15:34,700 by the MySQL system, but they're not enforced. 484 00:15:35,490 --> 00:15:36,170 So it can be a bit 485 00:15:36,330 --> 00:15:37,700 deceptive because you may create 486 00:15:38,110 --> 00:15:39,450 the tables exactly as I've 487 00:15:39,650 --> 00:15:41,160 done in my SQL, but then 488 00:15:41,340 --> 00:15:43,930 you will be allowed to insert enough data and violate the constraints. 489 00:15:45,060 --> 00:15:46,150 So again I recommend for trying 490 00:15:46,450 --> 00:15:47,930 check constraints for now SQLite 491 00:15:48,200 --> 00:15:50,560 or post[xx]. If you've been 492 00:15:50,660 --> 00:15:52,170 a shrewd observer of what 493 00:15:52,340 --> 00:15:53,850 we've done so far, it might 494 00:15:54,030 --> 00:15:55,550 have occurred to you that we had some redundancy. 495 00:15:56,430 --> 00:15:57,850 Specifically, the attribute base 496 00:15:58,280 --> 00:15:59,680 check constraints that we' ve 497 00:15:59,850 --> 00:16:02,380 showed can be used to enforce some other types of constraints. 498 00:16:03,640 --> 00:16:04,880 Very specifically if we want 499 00:16:05,120 --> 00:16:07,830 to have a not null constraint we can just write not null. 500 00:16:08,310 --> 00:16:09,040 That's a built in type of constraint. 501 00:16:09,890 --> 00:16:11,160 But that's equivalent to adding 502 00:16:11,350 --> 00:16:12,480 an attribute based check constraint 503 00:16:13,090 --> 00:16:14,390 that for the GPA for example 504 00:16:14,910 --> 00:16:16,680 checks that the GPA is not null. 505 00:16:17,180 --> 00:16:19,640 As a reminder is not null is a key word in the SQL language. 506 00:16:21,070 --> 00:16:22,340 Let's create this table and 507 00:16:22,990 --> 00:16:25,580 let's try to insert a tuple with a null value. 508 00:16:27,270 --> 00:16:28,640 We have student Amy again with 509 00:16:28,800 --> 00:16:30,840 a null GPA and that generates an error. 510 00:16:31,570 --> 00:16:33,330 A little more challenging and 511 00:16:33,710 --> 00:16:34,790 interesting is to try to 512 00:16:34,880 --> 00:16:37,900 implement key constraints using attribute based check constraints. 513 00:16:38,930 --> 00:16:40,300 So here's an attempt at doing so. 514 00:16:40,930 --> 00:16:42,850 Let's just consider a very simple table. 515 00:16:43,070 --> 00:16:44,070 We'll call it T and it 516 00:16:44,130 --> 00:16:45,610 will have one attribute A. And 517 00:16:46,060 --> 00:16:46,920 we'll try to write a check 518 00:16:47,200 --> 00:16:49,100 constraint that specifies that 519 00:16:49,350 --> 00:16:50,260 A is a key for T. 520 00:16:51,280 --> 00:16:53,260 So, here is my attempt at doing so. 521 00:16:54,000 --> 00:16:55,410 I declare the attribute and 522 00:16:55,530 --> 00:16:56,760 then in at my check I say 523 00:16:57,310 --> 00:16:58,620 that the value of A is 524 00:16:58,760 --> 00:17:00,660 not in select A 525 00:17:00,830 --> 00:17:01,970 from T. In other words 526 00:17:02,340 --> 00:17:03,580 the value of that or say 527 00:17:03,750 --> 00:17:05,280 attempting to insert or update 528 00:17:05,840 --> 00:17:06,960 is unique in table T. 529 00:17:07,750 --> 00:17:08,540 Well first I'm gonna tell you 530 00:17:08,630 --> 00:17:09,650 that I'm not allowed to execute 531 00:17:09,960 --> 00:17:12,370 that, there's various reasons that I can't execute it. 532 00:17:12,450 --> 00:17:13,590 One simple one is that 533 00:17:13,690 --> 00:17:14,770 I'm trying to declare a table 534 00:17:15,140 --> 00:17:16,910 T and refer to it before it has been declared. 535 00:17:18,520 --> 00:17:19,710 Another issue with declaring 536 00:17:20,210 --> 00:17:21,760 it is the sub query in 537 00:17:21,980 --> 00:17:23,730 the check constraint, we'll talk about that in a moment. 538 00:17:24,260 --> 00:17:25,560 There's actually a third problem 539 00:17:25,910 --> 00:17:27,390 with this constraint, which is 540 00:17:27,510 --> 00:17:29,180 we need to think about when it's being checked. 541 00:17:29,900 --> 00:17:31,360 If we say first attempt 542 00:17:31,850 --> 00:17:33,150 to insert the value A 543 00:17:33,430 --> 00:17:35,050 and then check the constraint, then 544 00:17:35,180 --> 00:17:37,740 the constraint will be violated based on the existence of itself. 545 00:17:38,080 --> 00:17:40,760 So this is clearly not going to work. 546 00:17:41,550 --> 00:17:42,420 There is in fact a different 547 00:17:42,810 --> 00:17:44,180 expression that might work 548 00:17:44,620 --> 00:17:46,990 if it weren't for a couple of other obstacles. 549 00:17:48,470 --> 00:17:50,070 Here's an expression that doesn't 550 00:17:50,290 --> 00:17:51,710 have the problem of whether we 551 00:17:51,880 --> 00:17:52,840 check it before or after 552 00:17:53,000 --> 00:17:53,870 we insert A. This is 553 00:17:53,890 --> 00:17:56,390 an expression of a 554 00:17:56,530 --> 00:17:59,220 key Constraint in a way you might not have thought of. 555 00:17:59,760 --> 00:18:01,010 What this says is that 556 00:18:01,220 --> 00:18:02,820 the number of distinct values 557 00:18:03,510 --> 00:18:05,510 for an attribute A must be 558 00:18:05,730 --> 00:18:07,560 equal to the number of tuples in the table. 559 00:18:08,290 --> 00:18:09,800 In other words every tuple has 560 00:18:10,000 --> 00:18:11,520 a distinct value for A. Now 561 00:18:11,650 --> 00:18:12,890 there was one small issue here 562 00:18:13,110 --> 00:18:14,650 which is null values because, as 563 00:18:14,750 --> 00:18:15,970 we mentioned, unique key constraints 564 00:18:16,590 --> 00:18:17,870 allow multiple instances of null. 565 00:18:18,240 --> 00:18:19,400 But, if we don't worry about 566 00:18:19,650 --> 00:18:20,640 nulls then this is expression 567 00:18:21,170 --> 00:18:23,580 really is a different way of saying that A is a key. 568 00:18:24,580 --> 00:18:26,250 We run the query and it doesn't allow it. 569 00:18:26,490 --> 00:18:27,280 Again we have the same 570 00:18:27,590 --> 00:18:28,880 problem that we're referring to 571 00:18:28,970 --> 00:18:30,690 table T within the 572 00:18:31,490 --> 00:18:32,510 check constraint that we're putting 573 00:18:32,760 --> 00:18:34,010 in the definition of table T. 574 00:18:34,860 --> 00:18:36,050 By the way, that can be overcome. 575 00:18:36,550 --> 00:18:37,770 Some systems do allow 576 00:18:38,330 --> 00:18:39,730 constraints to be declared or 577 00:18:39,900 --> 00:18:42,310 added to tables after the table has been specified. 578 00:18:42,910 --> 00:18:43,600 So that would go away. 579 00:18:44,300 --> 00:18:45,560 But no systems that I 580 00:18:45,730 --> 00:18:46,870 know of allow sub queries and 581 00:18:46,970 --> 00:18:49,310 especially not aggregation within check constraints. 582 00:18:50,970 --> 00:18:53,720 Let's pursue a little further the question of subqueries and check constraints. 583 00:18:54,800 --> 00:18:55,960 The key example's a little 584 00:18:56,100 --> 00:18:57,250 bit contrived because of course 585 00:18:57,490 --> 00:18:58,980 we can declare key constraints directly. 586 00:18:59,640 --> 00:19:01,060 But in some cases are 587 00:19:01,200 --> 00:19:02,970 very natural constraint that 588 00:19:03,090 --> 00:19:04,200 we might want to express a 589 00:19:04,320 --> 00:19:05,740 check constraint using sub query. 590 00:19:06,010 --> 00:19:08,320 And I've set up a situation right here. 591 00:19:09,240 --> 00:19:10,650 We create our student table as 592 00:19:10,770 --> 00:19:11,900 usual but when we 593 00:19:12,010 --> 00:19:13,720 create our apply table we 594 00:19:13,820 --> 00:19:14,810 want to have a constraint 595 00:19:15,090 --> 00:19:16,370 that says that any student ID 596 00:19:16,510 --> 00:19:19,250 that appears in the apply table is a valid student. 597 00:19:19,830 --> 00:19:22,780 In other words, there is a student coupled with that student ID. 598 00:19:23,670 --> 00:19:25,730 Now, we can write that as a check constraint. 599 00:19:26,530 --> 00:19:28,400 This is syntactically valid in the SQL standard. 600 00:19:29,220 --> 00:19:30,140 We specify that the student 601 00:19:30,600 --> 00:19:31,740 ID here in the apply table 602 00:19:32,280 --> 00:19:33,200 is in the set of student 603 00:19:33,620 --> 00:19:34,740 IDs in the student table 604 00:19:35,460 --> 00:19:37,290 but currently no SQL system 605 00:19:37,920 --> 00:19:39,650 actually supports sub queries 606 00:19:40,010 --> 00:19:41,440 and check constraints. 607 00:19:42,140 --> 00:19:43,090 Now for this, the civic type of constraint 608 00:19:43,740 --> 00:19:44,740 it happens to fall into 609 00:19:44,970 --> 00:19:45,870 a class that is known as 610 00:19:46,060 --> 00:19:47,890 referential integrity where we 611 00:19:48,110 --> 00:19:49,280 say that this student ID is 612 00:19:49,540 --> 00:19:51,210 referencing a student in the 613 00:19:51,270 --> 00:19:52,710 other, is referencing the 614 00:19:52,780 --> 00:19:53,650 student id in the student 615 00:19:53,990 --> 00:19:55,790 table and therefore any 616 00:19:56,020 --> 00:19:57,200 student id and apply must also 617 00:19:57,560 --> 00:19:58,990 exist in the student and 618 00:19:59,050 --> 00:20:02,050 in another video we will referential integrity in some detail. 619 00:20:02,850 --> 00:20:04,800 But, not every check 620 00:20:05,020 --> 00:20:06,140 constraint with a subquery falls 621 00:20:06,490 --> 00:20:08,110 in the class of referential integrity constraints. 622 00:20:08,680 --> 00:20:10,000 The example I gave for 623 00:20:10,250 --> 00:20:12,440 keys doesn't and neither does the one here. 624 00:20:13,080 --> 00:20:15,270 Now, this is admittedly a little contrived. 625 00:20:16,200 --> 00:20:17,640 But what this says is that 626 00:20:18,160 --> 00:20:20,400 every college's enrollment must be 627 00:20:20,550 --> 00:20:22,040 bigger than any high school and 628 00:20:22,370 --> 00:20:23,590 so we write that by writing 629 00:20:23,910 --> 00:20:24,890 the check constraint in the college 630 00:20:25,430 --> 00:20:26,780 table that the enrollment is 631 00:20:26,920 --> 00:20:28,250 greater than the maximum high 632 00:20:28,470 --> 00:20:29,830 school size from the student table. 633 00:20:30,990 --> 00:20:33,730 Now, again, no system currently will support this. 634 00:20:34,260 --> 00:20:35,880 However, it is in the SQL standard. 635 00:20:36,920 --> 00:20:38,000 Now one thing I want to 636 00:20:38,220 --> 00:20:39,590 mention about check constraints with 637 00:20:39,770 --> 00:20:41,640 subqueries is that they can be kind of deceptive. 638 00:20:42,330 --> 00:20:44,390 And we can take a look at the apply table again. 639 00:20:45,320 --> 00:20:47,520 Supposing this was in fact supported by a system. 640 00:20:48,120 --> 00:20:49,620 It would check whenever we inserted 641 00:20:50,260 --> 00:20:51,590 a tuple into apply or it 642 00:20:51,860 --> 00:20:53,350 updated a student ID in apply 643 00:20:53,740 --> 00:20:55,160 that the constraint holds. 644 00:20:56,040 --> 00:20:57,290 But what it will not check 645 00:20:57,740 --> 00:20:59,150 is when things change in student. 646 00:20:59,950 --> 00:21:01,240 So we could write this constraint 647 00:21:01,950 --> 00:21:02,850 and every time we do 648 00:21:02,950 --> 00:21:03,950 an insert or update and apply 649 00:21:04,240 --> 00:21:05,510 it could be verified but somebody 650 00:21:05,940 --> 00:21:07,030 could go and change the student 651 00:21:07,420 --> 00:21:08,570 table and delete a 652 00:21:08,700 --> 00:21:10,070 student ID and then what 653 00:21:10,270 --> 00:21:11,540 we feel as the constraint here 654 00:21:11,980 --> 00:21:13,440 is no longer actually holding. 655 00:21:13,970 --> 00:21:15,820 So it can be tricky to use those subqueries. 656 00:21:16,830 --> 00:21:17,650 Again, when we do referential 657 00:21:18,350 --> 00:21:19,440 integrity as in this example, 658 00:21:20,350 --> 00:21:21,770 the referential integrity system will 659 00:21:21,910 --> 00:21:23,380 take care of making sure the constraint holds. 660 00:21:24,030 --> 00:21:24,780 But when we have an example 661 00:21:25,260 --> 00:21:26,390 like the one with the enrollments, 662 00:21:27,180 --> 00:21:28,810 if we say change the 663 00:21:28,940 --> 00:21:30,210 high school size in the 664 00:21:30,290 --> 00:21:31,700 student table, it would 665 00:21:31,910 --> 00:21:33,540 not activate this constraint checking 666 00:21:34,180 --> 00:21:35,650 that's specified with the college table. 667 00:21:36,860 --> 00:21:39,400 The last type of constraint I am going to show are general assertions. 668 00:21:40,650 --> 00:21:41,950 General assertions are very powerful 669 00:21:42,710 --> 00:21:43,840 and they are in the SQL standard, 670 00:21:44,600 --> 00:21:46,190 but unfortunately they currently are 671 00:21:46,340 --> 00:21:47,870 not supported by any database system. 672 00:21:48,890 --> 00:21:50,000 The first assertion I'm going to 673 00:21:50,110 --> 00:21:51,370 write is coming back to 674 00:21:51,440 --> 00:21:52,910 the issue of trying to 675 00:21:52,990 --> 00:21:53,950 declare or trying to enforce 676 00:21:54,490 --> 00:21:56,360 a key constraint without using the built-in facilities. 677 00:21:57,860 --> 00:21:59,670 Let me just write the command here. 678 00:21:59,950 --> 00:22:02,060 It says we're going to create a assertion called key. 679 00:22:03,120 --> 00:22:06,060 Notice that this assertion is not associated with a specific table. 680 00:22:06,790 --> 00:22:08,020 Although this assertion only talks 681 00:22:08,360 --> 00:22:10,090 about table T Assertions can 682 00:22:10,280 --> 00:22:12,040 refer to any number of tables in the database. 683 00:22:13,420 --> 00:22:14,480 The way an assertion works is 684 00:22:14,570 --> 00:22:15,830 we create an assertion and 685 00:22:15,890 --> 00:22:16,920 we give it a name, and the 686 00:22:17,070 --> 00:22:19,110 reason for this is so we can delete it later if we wish. 687 00:22:19,880 --> 00:22:20,940 Then the keyword check appears. 688 00:22:21,740 --> 00:22:22,610 And then we write a condition. 689 00:22:23,180 --> 00:22:24,410 And the conditions can be quite complicated. 690 00:22:25,290 --> 00:22:26,920 The assertion is saying that 691 00:22:27,160 --> 00:22:28,810 this condition, written in SQL-like 692 00:22:29,260 --> 00:22:31,630 language must always be true on the database. 693 00:22:32,040 --> 00:22:33,970 So the particular condition that 694 00:22:34,100 --> 00:22:35,230 I've put in this first example 695 00:22:35,570 --> 00:22:36,880 is a condition we use to 696 00:22:37,080 --> 00:22:38,850 check whether attribute A is 697 00:22:38,970 --> 00:22:40,480 a key in table T. It 698 00:22:40,620 --> 00:22:41,770 says that the number of 699 00:22:41,890 --> 00:22:43,340 distinct values in attribute 700 00:22:43,670 --> 00:22:44,700 A must be equal 701 00:22:45,050 --> 00:22:46,190 to the number tuples in T. 702 00:22:46,900 --> 00:22:47,670 Now I can try to run 703 00:22:47,920 --> 00:22:49,890 this, but I guarantee you that it's not supported. 704 00:22:51,190 --> 00:22:51,990 Let's look at some other example 705 00:22:52,460 --> 00:22:54,310 assertions we might write if they were supported. 706 00:22:55,290 --> 00:22:56,500 Here's an example that implements 707 00:22:57,050 --> 00:22:59,470 this referential integrity that I was describing earlier. 708 00:23:00,850 --> 00:23:03,220 This referential integrity constraint is 709 00:23:03,320 --> 00:23:04,640 saying that the student IDs 710 00:23:05,070 --> 00:23:07,350 in the apply table must also exist in the student table. 711 00:23:08,160 --> 00:23:09,270 Now when we write an assertion 712 00:23:09,990 --> 00:23:11,190 of that form, we tend to 713 00:23:11,550 --> 00:23:12,350 often write it in the negative 714 00:23:12,980 --> 00:23:14,910 form, specifically we say that 715 00:23:15,070 --> 00:23:16,920 it's not the case that something bad happens. 716 00:23:17,800 --> 00:23:19,270 It's not the case that 717 00:23:19,400 --> 00:23:20,490 there's some tuple in apply 718 00:23:21,330 --> 00:23:24,110 where the student ID is not in the student table. 719 00:23:25,010 --> 00:23:26,040 You can try on your 720 00:23:26,160 --> 00:23:27,090 own to write this in a 721 00:23:27,230 --> 00:23:29,120 more positive fashion, but you'll 722 00:23:29,300 --> 00:23:31,740 actually find that using SQL constructs it's not possible. 723 00:23:32,080 --> 00:23:33,730 It's actually very common 724 00:23:34,390 --> 00:23:35,850 for assertions to specify the 725 00:23:35,940 --> 00:23:38,580 bad thing in a subquery and then write not exists. 726 00:23:39,530 --> 00:23:42,030 As a final example, let's 727 00:23:42,270 --> 00:23:43,700 suppose that we require that 728 00:23:43,890 --> 00:23:45,280 the average GPA of students 729 00:23:45,770 --> 00:23:47,660 who are accepted to college is greater than 3. 730 00:23:47,820 --> 00:23:47,820 0. 731 00:23:48,360 --> 00:23:49,270 We can write that as 732 00:23:49,370 --> 00:23:51,380 an assertion pretty much exactly 733 00:23:51,960 --> 00:23:52,900 as I just described it. 734 00:23:53,500 --> 00:23:55,100 We take the average GPA of 735 00:23:55,170 --> 00:23:56,870 students where their ID is 736 00:23:57,030 --> 00:23:58,170 among the IDs in 737 00:23:58,330 --> 00:23:59,420 the apply relation where the 738 00:23:59,530 --> 00:24:01,280 decision was yes, and our 739 00:24:01,450 --> 00:24:02,440 assertion states that that average 740 00:24:03,070 --> 00:24:04,000 must be greater than 3.0 741 00:24:05,660 --> 00:24:08,180 So so far I've described how assertions are created. 742 00:24:08,680 --> 00:24:09,960 Let me just briefly mention how 743 00:24:10,280 --> 00:24:12,660 they are checked or how they would be checked if they were implemented. 744 00:24:13,940 --> 00:24:15,250 Any system that implements this 745 00:24:15,400 --> 00:24:16,620 very general form of assertion 746 00:24:17,630 --> 00:24:19,650 must determine every possible change 747 00:24:20,030 --> 00:24:21,910 to the database that could violate the assertion. 748 00:24:22,890 --> 00:24:24,340 In this case, modifying a GPA, 749 00:24:25,440 --> 00:24:26,950 modifying a student ID, inserting 750 00:24:27,930 --> 00:24:29,270 or deleting from students or 751 00:24:29,380 --> 00:24:31,830 apply could all potentially violate the constraint. 752 00:24:33,020 --> 00:24:34,430 And in that case, after each of 753 00:24:34,560 --> 00:24:36,710 those types of modifications to 754 00:24:36,780 --> 00:24:37,940 the database, the system would 755 00:24:38,060 --> 00:24:39,010 need to check the constraint, 756 00:24:39,580 --> 00:24:40,660 make sure that it's still satisfied, 757 00:24:41,350 --> 00:24:42,510 and if not, generate an error 758 00:24:42,890 --> 00:24:44,120 and disallow the database change. 759 00:24:45,940 --> 00:24:47,480 So I've only talked about creating assertions. 760 00:24:48,050 --> 00:24:49,050 Let me just talk very briefly 761 00:24:49,580 --> 00:24:50,760 about how a system would 762 00:24:51,080 --> 00:24:52,580 enforce general assertions of 763 00:24:52,690 --> 00:24:54,110 this form if it supported them. 764 00:24:55,340 --> 00:24:56,680 What the system needs to do, is 765 00:24:56,860 --> 00:24:58,640 monitor every possible change 766 00:24:59,210 --> 00:25:01,730 to the database that could cause the assertion to become violated. 767 00:25:02,860 --> 00:25:03,580 So we take a look at 768 00:25:03,640 --> 00:25:05,200 this particular assertion, it 769 00:25:05,290 --> 00:25:08,060 could become violated if we changed the student GPA. 770 00:25:08,770 --> 00:25:10,070 If we inserted a student, 771 00:25:10,300 --> 00:25:11,520 even if we deleted a student, 772 00:25:12,380 --> 00:25:13,430 or if we inserted an application 773 00:25:14,450 --> 00:25:15,230 that was now having a decision 774 00:25:15,700 --> 00:25:17,640 of yes, or updated the application status. 775 00:25:18,450 --> 00:25:19,450 So all of those changes 776 00:25:19,700 --> 00:25:20,690 have to be monitored by the 777 00:25:20,770 --> 00:25:22,000 system, the constraint has to 778 00:25:22,130 --> 00:25:23,420 be checked after the change, and 779 00:25:23,650 --> 00:25:24,660 if the constraint is no longer 780 00:25:24,920 --> 00:25:26,090 satisfied, an error is 781 00:25:26,210 --> 00:25:27,640 generated, and the change is undone. 782 00:25:29,940 --> 00:25:31,160 That concludes our discussion of constraints 783 00:25:31,580 --> 00:25:32,630 with the exception of referential 784 00:25:33,200 --> 00:25:34,420 integrity which is covered in 785 00:25:34,530 --> 00:25:35,810 a separate video, and the 786 00:25:36,010 --> 00:25:38,710 related topic of triggers, which is also covered in a separate video.