1 00:00:00,390 --> 00:00:01,420 This video continues the topic 2 00:00:01,800 --> 00:00:03,300 of relational design, talking specifically 3 00:00:03,920 --> 00:00:06,400 about multivalued dependencies and Fourth Normal Form. 4 00:00:07,170 --> 00:00:08,420 I know I've reminded you many times 5 00:00:08,820 --> 00:00:10,100 about relational designed by decomposition, 6 00:00:10,870 --> 00:00:11,780 so let me do it very quickly. 7 00:00:12,650 --> 00:00:14,140 The designer specifies mega relations 8 00:00:14,670 --> 00:00:15,720 with all the information they 9 00:00:15,800 --> 00:00:17,390 want to store and properties of the data. 10 00:00:18,140 --> 00:00:19,470 The system decomposes the mega 11 00:00:19,690 --> 00:00:21,570 relations into smaller relations that 12 00:00:21,690 --> 00:00:24,210 have good properties -- no anomalies and no lost information. 13 00:00:25,370 --> 00:00:27,220 When we have functional dependencies, as 14 00:00:27,350 --> 00:00:28,760 are properties of the data, we 15 00:00:28,940 --> 00:00:30,830 get Boyce-Codd Normal form, and 16 00:00:30,940 --> 00:00:32,110 then when we add to the 17 00:00:32,200 --> 00:00:33,760 functional dependencies multi-value dependencies 18 00:00:34,670 --> 00:00:35,650 we get fourth normal form. 19 00:00:36,380 --> 00:00:38,180 And the specification of multi- 20 00:00:38,510 --> 00:00:40,270 value dependencies and decomposition into 21 00:00:40,530 --> 00:00:42,330 Fourth Normal Form is the topic of this video. 22 00:00:43,430 --> 00:00:44,250 As a reminder from earlier, 23 00:00:44,990 --> 00:00:46,080 Fourth Normal Form is stronger 24 00:00:46,860 --> 00:00:48,330 than Boyce-Codd Normal Form so 25 00:00:48,590 --> 00:00:50,630 if we have here all of 26 00:00:50,800 --> 00:00:51,620 the relations that are in Boyce- 27 00:00:51,990 --> 00:00:53,640 Codd normal form, some subset 28 00:00:54,230 --> 00:00:56,710 of those are also in fourth normal form. 29 00:00:59,980 --> 00:01:01,130 When we have functional dependencies, we can guarantee Boyce Codd normal 30 00:01:01,570 --> 00:01:02,560 form and then when we add 31 00:01:03,030 --> 00:01:04,780 multi-value dependencies that's what 32 00:01:04,890 --> 00:01:06,100 allows us to narrow down to 33 00:01:06,140 --> 00:01:07,750 the stronger property of fourth normal form. 34 00:01:08,630 --> 00:01:09,310 So let's start with an example. 35 00:01:10,080 --> 00:01:12,140 We have information about students applying to colleges. 36 00:01:13,170 --> 00:01:15,280 The student is identified by their social security number. 37 00:01:15,990 --> 00:01:17,120 They may apply to several colleges 38 00:01:17,650 --> 00:01:18,640 and in this video we're not going 39 00:01:18,800 --> 00:01:20,530 to have college states, just college names. 40 00:01:20,900 --> 00:01:21,490 We'll assume they're unique. 41 00:01:22,560 --> 00:01:23,220 And then the student may have hobbies. 42 00:01:24,160 --> 00:01:25,150 And they may apply, as I've said, 43 00:01:25,410 --> 00:01:26,590 to several colleges and have several 44 00:01:26,960 --> 00:01:28,720 hobbies, but let's assume for now those are independent. 45 00:01:29,910 --> 00:01:32,430 So do we have any functional dependencies for this relation? 46 00:01:33,590 --> 00:01:34,770 Actually we don't have any all. 47 00:01:35,370 --> 00:01:36,800 The social security number does 48 00:01:37,090 --> 00:01:38,250 not determine the college name 49 00:01:38,500 --> 00:01:40,170 or the hobby or anything in the other direction. 50 00:01:41,480 --> 00:01:43,190 With no functional dependencies, the 51 00:01:43,360 --> 00:01:44,200 only key for the relation 52 00:01:44,690 --> 00:01:46,380 then is all attributes of the relation. 53 00:01:47,510 --> 00:01:49,470 So is this relation in Boyce-Codd Normal Form? 54 00:01:50,160 --> 00:01:51,070 As you might remember from the 55 00:01:51,140 --> 00:01:52,590 previous video, Boyce-Codd 56 00:01:52,820 --> 00:01:54,050 Normal Form says all functional 57 00:01:54,410 --> 00:01:56,150 dependencies have a key on the left hand side. 58 00:01:56,430 --> 00:01:59,290 Well, since we have no functional dependencies then the answer is yes. 59 00:01:59,940 --> 00:02:01,370 It is in Boyce-Codd normal form. 60 00:02:02,670 --> 00:02:04,280 On the other hand do we think this is a good design? 61 00:02:04,750 --> 00:02:07,720 I'm going to say no this is not a good design Why not? 62 00:02:08,070 --> 00:02:09,470 Well, let's suppose that somebody 63 00:02:09,980 --> 00:02:11,220 applies to five colleges 64 00:02:12,590 --> 00:02:14,350 and they have, say, six hobbies. 65 00:02:14,920 --> 00:02:16,010 Then to have all 66 00:02:16,210 --> 00:02:17,930 combinations of colleges and 67 00:02:18,200 --> 00:02:19,530 hobbies that would yield 30 68 00:02:19,970 --> 00:02:22,520 tuples in the relation and clearly that's not a good idea. 69 00:02:22,860 --> 00:02:24,270 We'd rather separate the college 70 00:02:24,650 --> 00:02:26,110 and hobbies if they are independent. 71 00:02:27,530 --> 00:02:29,300 So the separation of independent facts 72 00:02:29,990 --> 00:02:31,420 is what fourth normal form is about. 73 00:02:31,700 --> 00:02:33,150 And now let's get a little bit more formal. 74 00:02:34,190 --> 00:02:36,120 Like functional dependencies, multivalued dependencies 75 00:02:36,900 --> 00:02:38,260 are specified based on knowledge 76 00:02:38,710 --> 00:02:40,080 of the real world constraints on 77 00:02:40,220 --> 00:02:41,810 the data being captured and all 78 00:02:42,100 --> 00:02:43,340 instances of a relation 79 00:02:43,610 --> 00:02:46,040 with a multivalued dependency must adhere to the dependency. 80 00:02:47,070 --> 00:02:49,690 Now let's define exactly what a multi value dependency is. 81 00:02:50,850 --> 00:02:52,240 For relation R we write 82 00:02:52,480 --> 00:02:54,060 a multi value dependency using 83 00:02:54,340 --> 00:02:56,280 a double headed arrow and 84 00:02:56,460 --> 00:02:58,430 we say 'A' multi determines 'B'. 85 00:02:58,810 --> 00:03:00,420 In this case, again, with 'A' 86 00:03:00,710 --> 00:03:02,400 and 'B' possibly being sets of attributes, so that would be A 87 00:03:03,400 --> 00:03:04,410 one through A N 88 00:03:05,380 --> 00:03:06,440 and B one through B 89 00:03:06,650 --> 00:03:08,850 M, which I'm abbreviating with A bar and B bar. 90 00:03:10,740 --> 00:03:12,180 So let me write the formal definition of A multi 91 00:03:12,620 --> 00:03:14,200 determines B. Again using first 92 00:03:14,440 --> 00:03:15,750 order logic similarly to what 93 00:03:15,910 --> 00:03:18,150 we did with functional dependencies but this one's a bit more complicated. 94 00:03:19,730 --> 00:03:21,200 It says for all tuples T and 95 00:03:21,340 --> 00:03:22,470 U that are in relation 96 00:03:23,150 --> 00:03:25,110 R, if T with 97 00:03:25,590 --> 00:03:27,650 the attributes A of T equal U 98 00:03:28,450 --> 00:03:30,990 for the attributes A of U. Again these are lists of attributes. 99 00:03:31,220 --> 00:03:33,180 So if the two tuples agree on 100 00:03:33,750 --> 00:03:35,870 their A values then, and 101 00:03:36,140 --> 00:03:37,550 remember for functional dependencies it was 102 00:03:37,730 --> 00:03:39,490 simple we just said they agreed on their B values. 103 00:03:40,260 --> 00:03:41,170 But now it gets more complicated. 104 00:03:41,820 --> 00:03:42,810 We're going to say that there exists 105 00:03:44,080 --> 00:03:47,390 a third tuple V in R that has the following properties. 106 00:03:48,400 --> 00:03:49,820 V has the same A 107 00:03:50,110 --> 00:03:51,730 values as T and 108 00:03:51,940 --> 00:03:53,190 U. So V sub A equals 109 00:03:53,540 --> 00:03:57,100 T sub A, furthermore V 110 00:03:57,500 --> 00:03:59,450 has its B value, 111 00:03:59,890 --> 00:04:03,130 okay, drawn from T 112 00:04:03,690 --> 00:04:04,750 so it's equal there. 113 00:04:06,010 --> 00:04:07,770 And finally it has its 114 00:04:08,380 --> 00:04:10,150 rest, so those are 115 00:04:10,260 --> 00:04:11,310 all the attributes other than 116 00:04:11,550 --> 00:04:14,790 A and B equal to U rest. 117 00:04:15,690 --> 00:04:18,100 Okay, so that's a mouthful but let's look at that pictorially. 118 00:04:19,580 --> 00:04:20,930 So here's our relation R and 119 00:04:21,100 --> 00:04:22,050 we'll have the set of 120 00:04:22,130 --> 00:04:23,500 attributes A, the set of 121 00:04:23,670 --> 00:04:25,740 attributes B and the rest of the attributes. 122 00:04:26,330 --> 00:04:27,710 And now let's make some tuples. 123 00:04:28,690 --> 00:04:29,620 So let's say that this 124 00:04:29,740 --> 00:04:30,910 is tuple T and this 125 00:04:31,070 --> 00:04:32,390 is tuple U. And we 126 00:04:32,520 --> 00:04:33,600 said that T and U 127 00:04:33,930 --> 00:04:35,120 agree on their A values. 128 00:04:35,640 --> 00:04:36,670 So they have the same A values 129 00:04:37,540 --> 00:04:39,550 and then they don't have to have the same B values. 130 00:04:39,920 --> 00:04:41,020 So we'll call the first one B-1 131 00:04:41,430 --> 00:04:42,620 and the second one B-2 and 132 00:04:43,260 --> 00:04:44,950 then for the rest we'll call this R-1 and R-2. 133 00:04:45,110 --> 00:04:47,790 So what the multi -value dependency 134 00:04:48,360 --> 00:04:49,640 says is that we have a 135 00:04:49,690 --> 00:04:51,820 third tuple, V and 136 00:04:51,960 --> 00:04:53,120 V again has the same 137 00:04:53,660 --> 00:04:55,170 A and it has 138 00:04:55,610 --> 00:04:56,970 its B value from tuple 139 00:04:57,350 --> 00:04:58,290 T. So it has B-1, 140 00:04:59,430 --> 00:05:00,650 but it has its rest value 141 00:05:01,290 --> 00:05:03,020 from tuple U, so then 142 00:05:03,260 --> 00:05:04,670 it has R-2 here. 143 00:05:05,630 --> 00:05:06,650 So again what we're saying is 144 00:05:06,750 --> 00:05:08,010 that if we have these first 145 00:05:08,160 --> 00:05:09,090 two tuples T and U, 146 00:05:09,370 --> 00:05:10,690 then we also have tuple 147 00:05:11,110 --> 00:05:13,160 V. Now let me do something a little tricky. 148 00:05:13,580 --> 00:05:14,690 Let me swap the roles of 149 00:05:14,840 --> 00:05:15,970 T and U and show 150 00:05:16,250 --> 00:05:17,580 that we also with this definition, 151 00:05:18,340 --> 00:05:19,460 are guaranteed to have a 152 00:05:19,630 --> 00:05:20,830 fourth tuple and we'll call 153 00:05:21,080 --> 00:05:23,330 that fourth tuple W. By 154 00:05:23,520 --> 00:05:24,680 swapping the roles of 155 00:05:24,800 --> 00:05:27,580 T and U, W has again the same A value. 156 00:05:28,440 --> 00:05:29,780 Now it will take its B 157 00:05:29,800 --> 00:05:30,830 value from U and that 158 00:05:30,920 --> 00:05:32,230 will give us B2, and 159 00:05:32,460 --> 00:05:34,020 we'll take its rest 160 00:05:35,000 --> 00:05:37,710 value from T and that gives us R1. 161 00:05:37,920 --> 00:05:39,680 So what we can see 162 00:05:40,050 --> 00:05:41,110 here is that when we 163 00:05:41,380 --> 00:05:42,780 have the first two tuples 164 00:05:43,440 --> 00:05:45,660 that have this particular combination of 165 00:05:45,760 --> 00:05:47,310 B values and rest values, 166 00:05:47,440 --> 00:05:48,760 it tells us we 167 00:05:48,960 --> 00:05:51,130 must have the other combinations as well. 168 00:05:51,750 --> 00:05:53,150 We must have B1 with R2, 169 00:05:54,100 --> 00:05:54,950 and B2 with R1. 170 00:05:55,110 --> 00:05:56,580 What it's really saying 171 00:05:56,800 --> 00:05:58,270 is those B values and 172 00:05:58,370 --> 00:06:01,250 the rest values are independent of each other and we'll have all combinations. 173 00:06:02,290 --> 00:06:03,200 So that might get you thinking 174 00:06:03,600 --> 00:06:05,370 back to our colleges and hobbies. 175 00:06:06,650 --> 00:06:08,700 Incidentally, sometimes multi-value dependencies 176 00:06:09,600 --> 00:06:11,260 are called tuple generating dependencies. 177 00:06:12,370 --> 00:06:14,010 And that's because the definition is 178 00:06:14,390 --> 00:06:16,030 is about having additional tuples 179 00:06:16,680 --> 00:06:18,080 when you have some existing tuples, 180 00:06:18,170 --> 00:06:20,130 unlike functional dependencies which 181 00:06:20,360 --> 00:06:22,720 just talk about the relationships among existing tuples. 182 00:06:23,760 --> 00:06:24,860 So let's go back to our example. 183 00:06:25,170 --> 00:06:27,910 Now we have students applying to colleges and having hobbies. 184 00:06:28,600 --> 00:06:30,580 Those are independent facts about the student. 185 00:06:31,440 --> 00:06:32,960 We'll write our multi-value dependency 186 00:06:33,870 --> 00:06:35,400 as 'social security number multi 187 00:06:36,100 --> 00:06:37,340 determine C name' and 188 00:06:37,690 --> 00:06:38,590 now lets use some example 189 00:06:39,170 --> 00:06:41,590 data to see our definition and how it works here. 190 00:06:42,910 --> 00:06:44,320 Here's our apply relation with the 191 00:06:44,380 --> 00:06:45,610 social security number, the college 192 00:06:46,060 --> 00:06:47,170 name and the hobby. 193 00:06:48,490 --> 00:06:50,340 Let's suppose that we have 194 00:06:51,710 --> 00:06:52,750 a student, 123 who's applied 195 00:06:53,220 --> 00:06:55,760 to Stanford and plays the trumpet. 196 00:06:57,290 --> 00:06:58,870 Now, let's suppose that same 197 00:06:59,230 --> 00:07:00,910 student, 123, has applied 198 00:07:01,630 --> 00:07:03,280 to Berkeley and plays tennis. 199 00:07:04,640 --> 00:07:06,750 So what our multivalued dependency 200 00:07:07,320 --> 00:07:08,340 says, and let's make this 201 00:07:08,540 --> 00:07:09,620 tuple T and tuple U, 202 00:07:10,570 --> 00:07:11,710 is that there's a further tuple 203 00:07:12,170 --> 00:07:13,910 V. V takes the 204 00:07:14,110 --> 00:07:15,740 same social security number and 205 00:07:15,880 --> 00:07:18,000 it takes the first value 206 00:07:18,700 --> 00:07:19,920 for the college name and 207 00:07:20,060 --> 00:07:21,430 the second for the hobby. 208 00:07:22,390 --> 00:07:24,290 It says if we have 209 00:07:24,480 --> 00:07:25,770 a 123 playing trumpet at Stanford 210 00:07:26,230 --> 00:07:27,910 and tennis at Berkeley, then that 211 00:07:28,080 --> 00:07:29,880 same person will be playing tennis at Stanford. 212 00:07:31,000 --> 00:07:33,850 Furthermore, I show that the same definition will generate automatically. 213 00:07:34,720 --> 00:07:36,040 A fourth tuple with the 214 00:07:36,160 --> 00:07:38,540 other combination which would be Berkley and Trumpet. 215 00:07:39,690 --> 00:07:40,520 By the way one thing you 216 00:07:40,630 --> 00:07:41,640 might notice here is that 217 00:07:41,840 --> 00:07:43,470 we also have the multivalued 218 00:07:43,870 --> 00:07:46,860 dependency, social security number multi determines hobby. 219 00:07:47,430 --> 00:07:48,780 This is actually one of 220 00:07:48,840 --> 00:07:50,780 the rules for multivalued dependency saying 221 00:07:51,010 --> 00:07:52,650 that when you have A determines 222 00:07:53,250 --> 00:07:54,800 B, then you, A multidetermines 223 00:07:55,590 --> 00:07:56,940 B, then you also have A 224 00:07:57,140 --> 00:07:58,570 multi determines rest and we'll 225 00:07:58,730 --> 00:08:00,630 see some rules for multivalued dependencies later. 226 00:08:01,770 --> 00:08:02,650 Let's look quickly at a 227 00:08:03,040 --> 00:08:04,180 modification of our example where 228 00:08:04,340 --> 00:08:06,490 the real world assumptions about the data are different. 229 00:08:07,270 --> 00:08:09,960 So we still have exactly the same relation with the same attributes. 230 00:08:10,680 --> 00:08:11,690 But let's suppose that we don't 231 00:08:12,200 --> 00:08:14,220 want to reveal every hobby to every college. 232 00:08:14,660 --> 00:08:15,630 Maybe we'll decide that we don't 233 00:08:15,880 --> 00:08:16,850 want Stanford to know that 234 00:08:16,980 --> 00:08:18,170 we're a surfer or Berkeley 235 00:08:18,400 --> 00:08:19,900 to know that we're on the speech and debate team. 236 00:08:20,660 --> 00:08:22,050 So if that's the case, then 237 00:08:22,210 --> 00:08:24,560 what multivalued dependencies do we have in this relation? 238 00:08:25,410 --> 00:08:26,210 We actually have none. 239 00:08:27,130 --> 00:08:29,590 And we don't have any functional dependencies either by the way. 240 00:08:29,790 --> 00:08:31,320 And is this a good design? 241 00:08:31,990 --> 00:08:33,160 Well, actually I would argue yes. 242 00:08:33,640 --> 00:08:35,060 In this case, this design 243 00:08:35,540 --> 00:08:36,680 is a good one because 244 00:08:36,990 --> 00:08:37,860 we're not going to have that 245 00:08:38,020 --> 00:08:39,450 multiplicative effect of information. 246 00:08:40,220 --> 00:08:41,190 Every tuple that we have 247 00:08:41,770 --> 00:08:43,290 in the applied relation will 248 00:08:43,560 --> 00:08:45,180 be an independent piece of important information. 249 00:08:46,560 --> 00:08:47,340 Let's look at one more example 250 00:08:48,280 --> 00:08:50,470 before we go on to talk about properties of multivalued dependencies. 251 00:08:51,640 --> 00:08:52,800 I've extended the apply relation 252 00:08:53,240 --> 00:08:54,520 now to not only include colleges 253 00:08:55,050 --> 00:08:56,530 and hobbies but also the 254 00:08:56,620 --> 00:08:57,940 date of application to a 255 00:08:58,160 --> 00:09:00,390 college, and the major or majors that are being applied for. 256 00:09:01,050 --> 00:09:03,920 Let's continue to assume that hobbies are revealed to college selectively. 257 00:09:04,810 --> 00:09:05,820 We don't need to have same 258 00:09:06,450 --> 00:09:08,800 hobbies for each college that a student applies to. 259 00:09:10,440 --> 00:09:12,050 Secondly, lets assume that 260 00:09:12,340 --> 00:09:13,560 we restrict students to apply 261 00:09:13,730 --> 00:09:14,740 only once to each college, 262 00:09:15,140 --> 00:09:15,760 but what I what we mean 263 00:09:15,820 --> 00:09:17,340 by that is just on one day. 264 00:09:17,730 --> 00:09:19,100 A student can still apply 265 00:09:19,500 --> 00:09:21,150 to multiple majors at a 266 00:09:21,280 --> 00:09:23,050 single college and to different majors at different colleges. 267 00:09:24,220 --> 00:09:25,340 Let's also assume that majors 268 00:09:26,020 --> 00:09:28,170 are independent of hobbies, which seems to make sense. 269 00:09:29,770 --> 00:09:30,840 It takes some thinking to come 270 00:09:30,970 --> 00:09:32,380 up with the right functional and 271 00:09:32,540 --> 00:09:33,840 multivalued dependencies to capture these 272 00:09:34,050 --> 00:09:35,690 constraints, but here they are. 273 00:09:36,400 --> 00:09:37,500 The first one when we 274 00:09:37,640 --> 00:09:38,870 say that we reveal hobbies to 275 00:09:38,970 --> 00:09:40,180 college selectively is actually 276 00:09:40,710 --> 00:09:41,820 the absence of a multivalued 277 00:09:42,100 --> 00:09:44,570 dependency on hobbies and colleges. 278 00:09:45,630 --> 00:09:46,560 The second one says as 279 00:09:46,660 --> 00:09:47,710 we apply once to each 280 00:09:47,850 --> 00:09:49,270 college, or on one particular 281 00:09:49,800 --> 00:09:51,230 day to each college, so 282 00:09:51,640 --> 00:09:52,560 that would say that when 283 00:09:52,790 --> 00:09:53,990 we have a particular student 284 00:09:54,890 --> 00:09:57,140 and a particular college that always 285 00:09:57,560 --> 00:09:58,950 going to have the same date, 286 00:09:59,380 --> 00:10:00,520 so any two tuples for 287 00:10:00,830 --> 00:10:03,320 a student and college combination will be on the same date. 288 00:10:04,380 --> 00:10:06,200 The last dependency that 289 00:10:06,280 --> 00:10:07,760 we will have involves the independence 290 00:10:08,650 --> 00:10:09,840 of the majors that are 291 00:10:09,900 --> 00:10:11,220 being applied for and the 292 00:10:11,470 --> 00:10:13,110 hobbies that a student has, so 293 00:10:13,340 --> 00:10:14,380 we'll write that as the 294 00:10:14,780 --> 00:10:16,210 multivalue dependency social security 295 00:10:16,660 --> 00:10:18,220 number, plus college name, 296 00:10:18,920 --> 00:10:21,230 plus date, multidetermines major, 297 00:10:22,290 --> 00:10:23,490 and remember what that's saying 298 00:10:24,230 --> 00:10:25,770 is that major, for a 299 00:10:25,850 --> 00:10:27,460 given student, college, and 300 00:10:27,690 --> 00:10:29,510 date the majors that 301 00:10:29,680 --> 00:10:30,860 they apply for are independent 302 00:10:31,640 --> 00:10:32,710 of what we call the rest, 303 00:10:33,520 --> 00:10:34,830 which in this case is the hobbies. 304 00:10:36,170 --> 00:10:37,230 So, you might take some time to 305 00:10:37,310 --> 00:10:38,620 look at the formal definitions of 306 00:10:38,750 --> 00:10:40,930 functional dependencies, multivalue dependencies, and 307 00:10:41,030 --> 00:10:42,010 maybe write out some sample 308 00:10:42,200 --> 00:10:43,820 data to convince yourself that 309 00:10:44,050 --> 00:10:45,550 these are the dependencies that 310 00:10:45,690 --> 00:10:48,440 are capturing the assumptions that we make about the real world. 311 00:10:49,670 --> 00:10:51,070 Like with functional dependencies we 312 00:10:51,200 --> 00:10:52,570 have a notion of trivial dependency 313 00:10:53,180 --> 00:10:54,930 those that always hold we 314 00:10:55,070 --> 00:10:56,540 also have some rule for multi valued dependencies. 315 00:10:57,550 --> 00:10:58,590 The definition for a trivial 316 00:10:59,180 --> 00:11:01,010 multi valued dependency A multi 317 00:11:01,410 --> 00:11:03,060 determines B is in 318 00:11:03,230 --> 00:11:05,530 this case, that either B 319 00:11:05,930 --> 00:11:07,310 is a subset of A, 320 00:11:08,420 --> 00:11:09,700 or A union B 321 00:11:10,130 --> 00:11:12,450 are all attributes, a multi-value 322 00:11:12,900 --> 00:11:15,880 dependency is non-trivial if that's not the case. 323 00:11:16,880 --> 00:11:17,640 So let's take the look at 324 00:11:17,710 --> 00:11:20,040 why these multi-value dependencies are trivial. 325 00:11:20,940 --> 00:11:22,070 So let's start with the first 326 00:11:22,430 --> 00:11:24,040 case where we have our 327 00:11:24,600 --> 00:11:25,670 attributes A and the rest 328 00:11:26,390 --> 00:11:28,010 and then attributes B are a 329 00:11:28,050 --> 00:11:29,180 subset of A so lets 330 00:11:29,440 --> 00:11:30,780 say that these are attributes B. 331 00:11:31,580 --> 00:11:32,960 So what are definition of multi-valued 332 00:11:33,720 --> 00:11:34,710 dependencies says that when we 333 00:11:34,840 --> 00:11:36,940 have the same values for 334 00:11:37,410 --> 00:11:38,900 A in two tuples, so 335 00:11:39,210 --> 00:11:40,660 here A and A, then 336 00:11:40,830 --> 00:11:42,480 we have every combination of the 337 00:11:42,730 --> 00:11:43,560 B values and the rest, 338 00:11:44,200 --> 00:11:45,520 well obviously we do since 339 00:11:45,810 --> 00:11:47,340 the B's are subsets of the 340 00:11:47,500 --> 00:11:48,760 A's here, the B values 341 00:11:49,130 --> 00:11:50,030 are going to be the same as 342 00:11:50,140 --> 00:11:51,530 well and we clearly have every combination. 343 00:11:52,810 --> 00:11:54,930 For the other case of trivial multi-value dependencies. 344 00:11:55,640 --> 00:11:56,830 We have A and B together 345 00:11:57,260 --> 00:11:58,400 being all attributes of the 346 00:11:58,490 --> 00:11:59,810 relation, so in that 347 00:11:59,980 --> 00:12:01,820 case, there's no rest, so 348 00:12:02,040 --> 00:12:03,500 clearly we have every combination 349 00:12:04,700 --> 00:12:06,190 of values of A and 350 00:12:06,350 --> 00:12:08,870 B and rest, because there's no rest to combine with. 351 00:12:09,910 --> 00:12:11,480 Like with functional dependencies there are 352 00:12:11,610 --> 00:12:13,450 a whole bunch of rules that hold for multi-valued dependencies. 353 00:12:14,220 --> 00:12:15,320 We will just talk about three of 354 00:12:15,410 --> 00:12:16,340 them, and the first one is 355 00:12:16,520 --> 00:12:17,710 the most important and interesting, 356 00:12:18,530 --> 00:12:19,540 and that's the rule that says 357 00:12:20,090 --> 00:12:21,190 if we have a functional dependency 358 00:12:21,810 --> 00:12:23,520 from A to B then we 359 00:12:23,790 --> 00:12:25,550 also have a multi-valued dependency 360 00:12:26,170 --> 00:12:27,540 from A to B. And I'm 361 00:12:27,720 --> 00:12:28,920 gonna go ahead and prove that rule 362 00:12:29,220 --> 00:12:30,950 for you again because this is an important one. 363 00:12:32,000 --> 00:12:33,210 I'm going do this proof using a 364 00:12:33,270 --> 00:12:34,920 template for the relation similar to 365 00:12:35,050 --> 00:12:36,890 the what I did with rules for functional dependencies. 366 00:12:37,970 --> 00:12:38,720 So let's say we have our 367 00:12:38,890 --> 00:12:40,230 A attributes, our B attributes, 368 00:12:40,710 --> 00:12:42,350 and our rest, and what 369 00:12:42,520 --> 00:12:43,520 we need to prove, to prove 370 00:12:43,950 --> 00:12:45,560 the multi-value dependencies, is when 371 00:12:46,020 --> 00:12:46,780 there are tuples T and 372 00:12:46,890 --> 00:12:47,910 U with a certain form, 373 00:12:48,490 --> 00:12:51,080 there exists a tuple V of another form. 374 00:12:51,720 --> 00:12:53,730 So let's fill in some values first for the tuples. 375 00:12:54,430 --> 00:12:55,290 So Let's say that we 376 00:12:55,450 --> 00:12:56,590 have A and A here, 377 00:12:56,800 --> 00:12:57,800 that's what we need for the 378 00:12:58,570 --> 00:12:59,540 equality of the A values. 379 00:13:00,310 --> 00:13:02,140 Then we have B1 and R1, 380 00:13:02,470 --> 00:13:04,200 and we have B2 and 381 00:13:05,180 --> 00:13:06,220 R2, and in order 382 00:13:06,840 --> 00:13:08,390 to prove this multi-value dependency, 383 00:13:09,430 --> 00:13:10,390 I need to prove that there 384 00:13:10,540 --> 00:13:11,870 exists a tuple V that has 385 00:13:12,070 --> 00:13:13,830 the same A value that it 386 00:13:13,930 --> 00:13:15,540 has B1 from tuple T 387 00:13:16,040 --> 00:13:17,870 and R2 from Tuple U, 388 00:13:18,670 --> 00:13:19,800 and what I have in order 389 00:13:20,200 --> 00:13:21,530 to prove that is the fact 390 00:13:21,860 --> 00:13:23,300 that we have a functional dependency from 391 00:13:23,540 --> 00:13:24,970 A to B. Because we 392 00:13:25,260 --> 00:13:26,680 have the functional dependencies and because 393 00:13:27,000 --> 00:13:28,380 T and U have the same A value. 394 00:13:29,170 --> 00:13:30,330 What that tells us is 395 00:13:30,460 --> 00:13:32,840 that B1 equals B2 here. 396 00:13:33,820 --> 00:13:34,990 And so if B1 equals B2 397 00:13:35,830 --> 00:13:37,190 then we know that this value 398 00:13:37,520 --> 00:13:39,390 B1 here is equivalent 399 00:13:39,740 --> 00:13:41,080 to B2 and in order 400 00:13:41,490 --> 00:13:42,590 to prove the existence of this 401 00:13:43,020 --> 00:13:44,110 tuple well we have that tuple 402 00:13:44,350 --> 00:13:45,640 here already and we're done. 403 00:13:46,310 --> 00:13:47,450 So you might check that again, 404 00:13:47,690 --> 00:13:48,600 but what that says is 405 00:13:48,730 --> 00:13:49,990 using the knowledge of a 406 00:13:50,250 --> 00:13:51,420 functional dependency we can prove 407 00:13:52,110 --> 00:13:53,540 that we always have a corresponding 408 00:13:53,890 --> 00:13:56,200 multivalued dependency there are 409 00:13:56,270 --> 00:13:57,340 a couple more rules for 410 00:13:57,470 --> 00:13:58,760 multivalued dependencies that you can 411 00:13:58,990 --> 00:14:00,300 prove for yourself if you're so inclined. 412 00:14:01,110 --> 00:14:02,430 The first one is the intersection 413 00:14:02,750 --> 00:14:03,670 rule, it says that if 414 00:14:03,850 --> 00:14:05,340 we have A multi determines 415 00:14:06,410 --> 00:14:07,600 B and A multi determines 416 00:14:08,180 --> 00:14:09,680 C, then we have 417 00:14:10,390 --> 00:14:12,530 A multi determines B 418 00:14:13,080 --> 00:14:14,930 intersects C. The transitive 419 00:14:15,510 --> 00:14:17,670 rule is slightly different than from exact transitivity. 420 00:14:18,550 --> 00:14:19,530 What it says is if 421 00:14:19,780 --> 00:14:21,380 we have A multi determine B, 422 00:14:22,200 --> 00:14:23,630 and we have B multi determines 423 00:14:24,210 --> 00:14:25,670 C then we have 424 00:14:26,570 --> 00:14:28,380 A multi determined not 425 00:14:28,920 --> 00:14:30,350 C exactly but C minus 426 00:14:31,000 --> 00:14:31,960 B. And you might work 427 00:14:32,170 --> 00:14:33,370 some examples because it yourself 428 00:14:33,810 --> 00:14:35,000 why we don't have just 429 00:14:35,320 --> 00:14:36,690 A multi determines B and 430 00:14:36,910 --> 00:14:39,600 to subtract the attributes for B, although it's fairly complicated. 431 00:14:40,300 --> 00:14:41,460 So again these rules can 432 00:14:41,670 --> 00:14:42,490 be proven and there are many 433 00:14:42,910 --> 00:14:43,970 other rules of multivalued dependencies 434 00:14:44,720 --> 00:14:47,140 that you can read about in any of the readings provided on our website. 435 00:14:48,380 --> 00:14:49,530 By the way, regarding rules, let's 436 00:14:50,220 --> 00:14:51,090 come back to the fact 437 00:14:51,220 --> 00:14:53,470 that every functional dependency is a multivalued dependency. 438 00:14:54,060 --> 00:14:55,060 So we can use another Venn diagram. 439 00:14:55,690 --> 00:14:56,820 This is different than our previous one. 440 00:14:57,490 --> 00:14:58,890 We can list all of our 441 00:14:59,030 --> 00:15:00,600 multivalued dependencies here and the 442 00:15:01,070 --> 00:15:02,540 functional dependencies are a 443 00:15:02,820 --> 00:15:04,110 subset of those, so what 444 00:15:04,610 --> 00:15:05,770 that tells us is if we 445 00:15:06,210 --> 00:15:07,090 ever have a rule that applies 446 00:15:07,630 --> 00:15:09,980 for multivalued dependencies here, that 447 00:15:10,160 --> 00:15:11,370 will cover the entire Ven diagram 448 00:15:12,420 --> 00:15:14,950 and so that rule will apply for functional dependencies as well. 449 00:15:15,240 --> 00:15:16,740 So every rule for MVDs is 450 00:15:16,920 --> 00:15:18,520 also a rule for functional dependencies. 451 00:15:19,350 --> 00:15:20,840 On the other hand if we 452 00:15:21,060 --> 00:15:21,980 have a rule that applies 453 00:15:22,400 --> 00:15:24,280 for functional dependencies that rule 454 00:15:24,510 --> 00:15:26,030 does not necessarily have to 455 00:15:26,140 --> 00:15:28,210 apply all multivalued dependencies because 456 00:15:29,020 --> 00:15:31,150 it might be specialized just for this portion of the Venn diagram. 457 00:15:32,110 --> 00:15:34,780 So an example of such a rule is the splitting rule. 458 00:15:35,400 --> 00:15:36,650 The splitting rule is a 459 00:15:36,820 --> 00:15:38,030 rule that applies to functional dependencies, 460 00:15:38,570 --> 00:15:40,730 but does not always apply to multivalued dependencies. 461 00:15:41,390 --> 00:15:44,270 And again you could work an example to convince yourself of that fact. 462 00:15:45,490 --> 00:15:45,490 Woo. 463 00:15:45,580 --> 00:15:47,090 So after all that set up 464 00:15:47,370 --> 00:15:48,630 of multivalue dependencies, we're finally 465 00:15:48,990 --> 00:15:50,430 ready to talk about fourth normal form. 466 00:15:51,050 --> 00:15:52,090 The definition of fourth normal 467 00:15:52,400 --> 00:15:54,980 form looks very similar to the one for Boyce-Codd normal form. 468 00:15:55,410 --> 00:15:56,640 Says we take a relation and 469 00:15:56,740 --> 00:15:57,780 we take now a set of 470 00:15:57,850 --> 00:15:59,220 multivalued dependencies for that 471 00:15:59,580 --> 00:16:00,780 relation and the relation 472 00:16:01,040 --> 00:16:02,630 is in fourth normal form if 473 00:16:03,060 --> 00:16:04,400 every non-trivial 474 00:16:04,920 --> 00:16:06,800 multivalued dependency has on 475 00:16:07,040 --> 00:16:08,650 it's left hand side a key. 476 00:16:09,290 --> 00:16:10,590 Remember for functional dependencies it 477 00:16:10,700 --> 00:16:11,780 looks exactly the same except 478 00:16:12,060 --> 00:16:14,710 we have the functional dependency all here instead of multivalued dependencies. 479 00:16:16,010 --> 00:16:17,150 So, let's see exactly what fourth 480 00:16:17,440 --> 00:16:19,700 normal form telling us and why it's a good thing. 481 00:16:20,020 --> 00:16:21,370 So we have A, B, and 482 00:16:21,540 --> 00:16:23,420 rest as usual and let's 483 00:16:23,840 --> 00:16:26,500 suppose that we have a non trivial multivalued dependency. 484 00:16:27,250 --> 00:16:28,620 So that's telling us that 485 00:16:28,810 --> 00:16:30,140 if we have 2 tuples, T 486 00:16:30,450 --> 00:16:31,890 and U and we'll 487 00:16:32,110 --> 00:16:33,290 put in some values for B 488 00:16:33,580 --> 00:16:35,130 and the rest, then we're going 489 00:16:35,410 --> 00:16:37,560 to have the combination of those, as well. 490 00:16:37,920 --> 00:16:39,410 So, that's kind of the proliferation of 491 00:16:39,600 --> 00:16:40,630 tuples we get when we 492 00:16:40,850 --> 00:16:42,450 squish independent facts in the same relation. 493 00:16:43,470 --> 00:16:44,500 But, if the left 494 00:16:44,880 --> 00:16:46,960 side is a key, so if 495 00:16:47,320 --> 00:16:48,840 the A attributes are 496 00:16:48,990 --> 00:16:49,950 a key here then we won't have 497 00:16:50,340 --> 00:16:51,360 those 2 tuples and will 498 00:16:51,590 --> 00:16:53,120 never have to worry about the proliferation. 499 00:16:54,320 --> 00:16:55,550 Now, remember that I said fourth 500 00:16:55,940 --> 00:16:58,550 normal form implies Boyce-Codd Normal Form. 501 00:16:58,860 --> 00:16:59,670 Or if you prefer it in 502 00:16:59,850 --> 00:17:02,070 Venn diagram format, Fourth 503 00:17:02,380 --> 00:17:04,210 Normal Form is stronger than 504 00:17:04,440 --> 00:17:07,320 Boyce-Codd Normal Form and let's see why that's the case. 505 00:17:07,890 --> 00:17:09,100 If we have a fourth 506 00:17:09,300 --> 00:17:10,000 normal form and we want 507 00:17:10,270 --> 00:17:11,520 to show that we're in Boyce-Codd normal 508 00:17:11,970 --> 00:17:13,110 form, then we have to 509 00:17:13,210 --> 00:17:14,300 show that if we have 510 00:17:14,670 --> 00:17:16,970 a functional dependency then 511 00:17:17,410 --> 00:17:18,760 the left hand side A is a key. 512 00:17:19,190 --> 00:17:21,170 That would tell us we're in Boyce-Codd normal form. 513 00:17:21,750 --> 00:17:22,500 Well, if we have a functional 514 00:17:22,880 --> 00:17:24,010 dependency, we had a rule 515 00:17:24,220 --> 00:17:25,350 that tells us we also have 516 00:17:25,600 --> 00:17:27,510 the multivalued dependency and then 517 00:17:27,750 --> 00:17:30,220 since we're in fourth normal form, we get that A as a key. 518 00:17:30,830 --> 00:17:32,370 So again, fourth normal form 519 00:17:32,730 --> 00:17:34,460 implies Boyce-Codd normal form. 520 00:17:35,090 --> 00:17:35,870 Now let's take a look at 521 00:17:35,950 --> 00:17:38,340 the decomposition of algorithm into fourth normal form. 522 00:17:38,590 --> 00:17:39,980 It's extremely similar to the 523 00:17:40,420 --> 00:17:41,690 BCNF decomposition algorithm. 524 00:17:42,310 --> 00:17:43,230 The input is a relation. 525 00:17:43,850 --> 00:17:44,820 A set of functional dependencies 526 00:17:45,410 --> 00:17:46,760 and multi value dependencies and we 527 00:17:46,850 --> 00:17:47,790 need to separate them because 528 00:17:48,040 --> 00:17:49,860 we use the functional dependencies to find keys. 529 00:17:50,830 --> 00:17:52,040 The output is a decomposition 530 00:17:52,570 --> 00:17:53,840 of R into good relations, 531 00:17:54,830 --> 00:17:56,120 in this case fourth normal form, 532 00:17:56,700 --> 00:17:58,310 and it's a good decomposition in 533 00:17:58,450 --> 00:18:01,060 the sense that reassembling the relations gives you back the original. 534 00:18:02,150 --> 00:18:03,250 As with Boyce-Codd normal form 535 00:18:03,500 --> 00:18:04,930 we start by computing keys using 536 00:18:05,170 --> 00:18:06,810 the functional dependencies, and then 537 00:18:07,080 --> 00:18:08,510 we repeat the decomposition process 538 00:18:08,960 --> 00:18:11,070 until all of our relations are in fourth normal 539 00:18:11,440 --> 00:18:11,440 form. 540 00:18:12,380 --> 00:18:13,550 Just as with functional dependencies 541 00:18:14,450 --> 00:18:15,290 in BCNF, we pick a relation 542 00:18:15,830 --> 00:18:17,760 that has a violating dependency, in 543 00:18:17,840 --> 00:18:19,450 this case a multi-value dependency, and 544 00:18:19,750 --> 00:18:22,140 we split the relation based on that dependency. 545 00:18:22,800 --> 00:18:24,130 So we create one relation that 546 00:18:24,260 --> 00:18:25,170 has the attributes of the dependency 547 00:18:25,890 --> 00:18:27,160 and another relation that has 548 00:18:27,320 --> 00:18:29,620 the left-hand side of the dependency and the rest of the attributes. 549 00:18:30,680 --> 00:18:31,690 After that, we need to 550 00:18:31,760 --> 00:18:33,530 compute the functional dependencies for 551 00:18:33,810 --> 00:18:35,200 the decomposed relation and the 552 00:18:35,740 --> 00:18:38,620 multi-value dependencies for it, and then we can compute the keys. 553 00:18:39,440 --> 00:18:41,240 Now finding these multi-value 554 00:18:41,750 --> 00:18:44,710 dependencies is actually a fairly complex process. 555 00:18:45,480 --> 00:18:46,670 Usually it's very intuitive, 556 00:18:47,450 --> 00:18:48,220 but I'm going to refer you 557 00:18:48,380 --> 00:18:50,410 to the readings to read about the algorithm itself. 558 00:18:50,880 --> 00:18:51,770 And in fact, it can be 559 00:18:51,960 --> 00:18:53,020 so complicated in the general 560 00:18:53,410 --> 00:18:55,620 case that some of the readings don't even provide the algorithm. 561 00:18:56,040 --> 00:18:57,690 But again, in general, it's very intuitive. 562 00:18:59,260 --> 00:19:01,410 Our first example is going to be very fast to do. 563 00:19:01,930 --> 00:19:03,270 As you remember, this example has 564 00:19:03,850 --> 00:19:05,570 one multi-value dependency - social 565 00:19:05,870 --> 00:19:07,460 security number determines college name, 566 00:19:07,660 --> 00:19:09,040 multi determines college name - 567 00:19:09,580 --> 00:19:11,690 and it has no keys other than all of the attributes. 568 00:19:12,400 --> 00:19:13,470 So obviously, this is a 569 00:19:13,500 --> 00:19:15,060 violating multi value dependency, 570 00:19:16,030 --> 00:19:17,880 and so we decompose into two 571 00:19:18,220 --> 00:19:20,300 relations, we'll call them A1 and A2. 572 00:19:21,220 --> 00:19:22,620 The first one has the attributes 573 00:19:23,180 --> 00:19:24,860 of the multivalue dependency, the 574 00:19:24,940 --> 00:19:26,640 social security number and 575 00:19:26,790 --> 00:19:28,010 the college name, and the second 576 00:19:28,400 --> 00:19:29,430 one has the left hand 577 00:19:29,650 --> 00:19:31,640 side multivalued dependency as well 578 00:19:31,910 --> 00:19:34,440 as all the remaining attributes, which in this case is the hobby. 579 00:19:35,290 --> 00:19:36,880 These two decomposed relations actually 580 00:19:37,440 --> 00:19:39,260 have no FDs and no 581 00:19:39,510 --> 00:19:41,100 MVDs so in that 582 00:19:41,340 --> 00:19:42,540 case we're definitely in 4th 583 00:19:42,880 --> 00:19:44,380 normal form and we're done 584 00:19:44,590 --> 00:19:46,240 with the decomposition and I think 585 00:19:46,510 --> 00:19:47,410 we can agree that this looks like 586 00:19:47,610 --> 00:19:49,040 a good schema for the data at hand. 587 00:19:49,980 --> 00:19:51,600 Our second example is quite a bit more complicated. 588 00:19:52,670 --> 00:19:54,010 Remember in this example we 589 00:19:54,260 --> 00:19:55,300 have that the social 590 00:19:55,710 --> 00:19:57,090 security number and college name 591 00:19:57,770 --> 00:19:58,940 functionally determine date. 592 00:19:59,650 --> 00:20:00,890 That means we have each student 593 00:20:01,150 --> 00:20:03,100 applies to each college on a specific date. 594 00:20:03,420 --> 00:20:05,160 And secondly, we assume that 595 00:20:05,530 --> 00:20:07,610 majors that were being applied for were independent of hobbies. 596 00:20:08,450 --> 00:20:10,310 So we have social security number, 597 00:20:10,660 --> 00:20:12,540 college name and date 598 00:20:13,480 --> 00:20:14,760 multi determines the major. 599 00:20:15,750 --> 00:20:18,660 And incidentally that would mean it multi determines the hobby too. 600 00:20:19,870 --> 00:20:21,170 Once again, we have no keys 601 00:20:21,610 --> 00:20:23,640 for the relation, except for all attributes. 602 00:20:24,420 --> 00:20:26,020 So we have both a violating 603 00:20:26,890 --> 00:20:28,280 functional dependency in this case 604 00:20:28,390 --> 00:20:30,780 and we have a violating multivalue dependency. 605 00:20:31,660 --> 00:20:33,380 Let's use the multivalue dependency for 606 00:20:33,560 --> 00:20:34,990 our first decomposition step. 607 00:20:35,970 --> 00:20:38,180 So we'll create A1 and A2. 608 00:20:38,610 --> 00:20:41,580 A1 will contain all the attributes of our multivalued dependency. 609 00:20:43,280 --> 00:20:44,760 And then A2 will contain 610 00:20:45,500 --> 00:20:47,710 all the remaining attributes along with 611 00:20:47,770 --> 00:20:49,910 the left hand side of our multivalue dependency. 612 00:20:51,760 --> 00:20:54,240 And that turns out to be all of the attributes except the major. 613 00:20:55,280 --> 00:20:56,090 Now let's look at our 614 00:20:56,230 --> 00:20:57,650 decomposed relations and see 615 00:20:57,810 --> 00:20:58,620 what we have in terms of 616 00:20:58,690 --> 00:21:00,990 functional dependencies and multi-value dependencies for them. 617 00:21:02,080 --> 00:21:02,990 So after the decomposition, we don't 618 00:21:03,220 --> 00:21:05,490 have any more multivalued dependency but 619 00:21:05,650 --> 00:21:07,400 our functional dependency actually applies 620 00:21:07,710 --> 00:21:08,490 to both of the decomposed 621 00:21:09,130 --> 00:21:10,570 relations and we still 622 00:21:11,120 --> 00:21:12,890 don't have a key on the left hand side. 623 00:21:13,520 --> 00:21:16,350 So we need to decompose further based on the first functional dependency. 624 00:21:17,280 --> 00:21:18,750 So let's start by decomposing A1. 625 00:21:19,980 --> 00:21:21,860 We'll turn A1 into A3 626 00:21:22,110 --> 00:21:24,230 and A4, and A3 will have 627 00:21:24,670 --> 00:21:27,310 the functional dependency, all three attributes. 628 00:21:28,050 --> 00:21:29,410 And then A4 will have the 629 00:21:29,510 --> 00:21:30,550 left side of the functional 630 00:21:30,770 --> 00:21:32,220 dependency and the remaining 631 00:21:32,910 --> 00:21:34,490 attributes, which in this case is the major. 632 00:21:35,870 --> 00:21:37,020 So now we're finished with A1 633 00:21:37,220 --> 00:21:39,060 and we have a similar problem with A2. 634 00:21:39,320 --> 00:21:40,970 And so we decompose A2 635 00:21:41,270 --> 00:21:43,940 similarly, although we'll discover 636 00:21:43,990 --> 00:21:45,500 that A3 is the same relation in 637 00:21:45,790 --> 00:21:48,050 the decomposition of A2 as we got with A1. 638 00:21:48,180 --> 00:21:49,770 So we actually only add one 639 00:21:50,300 --> 00:21:51,540 more relation now, which is A5. 640 00:21:52,530 --> 00:21:53,950 That contains the social 641 00:21:54,330 --> 00:21:55,130 security number and the college 642 00:21:55,540 --> 00:21:56,470 name from the left side of 643 00:21:56,540 --> 00:21:57,950 the functional dependency and the 644 00:21:58,300 --> 00:21:59,700 hobby, which is the remaining attribute. 645 00:22:00,270 --> 00:22:01,320 And then we cross out A2. 646 00:22:04,890 --> 00:22:05,810 Now the only functional dependencies are multi-value dependencies we have 647 00:22:06,040 --> 00:22:08,170 left do have a key on the left-hand side. 648 00:22:08,400 --> 00:22:09,750 I'll let you verify that for yourself. 649 00:22:10,530 --> 00:22:11,700 And these three relations are our 650 00:22:11,890 --> 00:22:14,090 final decomposition into 4th normal form. 651 00:22:14,340 --> 00:22:15,380 And I think you will agree 652 00:22:15,590 --> 00:22:16,720 that this is a good design 653 00:22:17,200 --> 00:22:18,220 again for the data at hand. 654 00:22:19,570 --> 00:22:20,570 So let's wrap up this long 655 00:22:21,040 --> 00:22:23,710 unit on on dependencies and normal forms with a quick summary. 656 00:22:24,630 --> 00:22:26,750 If we have a relation RABC, a 657 00:22:27,330 --> 00:22:28,880 functional dependency from A 658 00:22:29,260 --> 00:22:30,470 to B tells us that 659 00:22:30,600 --> 00:22:31,870 when we have the same A values, 660 00:22:32,310 --> 00:22:33,410 we have the same B values, 661 00:22:34,600 --> 00:22:36,020 and the Boyce-Codd normal form tells 662 00:22:36,570 --> 00:22:38,220 us to factor that...those attributes 663 00:22:38,630 --> 00:22:40,240 into their own relation so that 664 00:22:40,350 --> 00:22:42,770 we don't repeat that relationship over and over. 665 00:22:43,770 --> 00:22:45,510 For multi-value dependencies, let's say 666 00:22:45,720 --> 00:22:47,830 we have the relation RABCD, 667 00:22:47,980 --> 00:22:49,320 and if we have 668 00:22:49,560 --> 00:22:51,300 the multi-value dependency A multi 669 00:22:51,640 --> 00:22:53,270 determines B, what that tells 670 00:22:53,780 --> 00:22:54,940 us is that we have every combination 671 00:22:55,780 --> 00:22:57,160 for a given A of B 672 00:22:57,410 --> 00:22:58,820 values and CD values 673 00:22:59,310 --> 00:23:01,060 - we called those rest earlier - 674 00:23:01,440 --> 00:23:03,040 and when we have that multiplicative effect 675 00:23:03,390 --> 00:23:05,100 of combinations, again, we take 676 00:23:05,430 --> 00:23:07,010 the A and B attributes and 677 00:23:07,070 --> 00:23:08,130 we put them in a separate 678 00:23:08,300 --> 00:23:09,140 relation so that we can 679 00:23:09,330 --> 00:23:10,560 separate out those facts from 680 00:23:10,620 --> 00:23:13,510 the independent fact of A and its CD values. 681 00:23:15,070 --> 00:23:16,050 Finally, in the design process, 682 00:23:17,000 --> 00:23:18,640 multi-value dependencies are something 683 00:23:19,010 --> 00:23:21,050 we add to functional dependencies, only they're stronger. 684 00:23:22,240 --> 00:23:23,610 So fourth normal form is 685 00:23:23,880 --> 00:23:26,650 a stronger property than Boyce-Codd normal form. 686 00:23:27,900 --> 00:23:29,110 Now usually this design process 687 00:23:29,500 --> 00:23:30,770 works very well and is 688 00:23:30,920 --> 00:23:33,740 very intuitive for many schemas, I hope for the examples that I gave here. 689 00:23:34,190 --> 00:23:35,050 But there are actually a few 690 00:23:35,300 --> 00:23:36,850 shortcomings to using Boyce-Codd 691 00:23:37,350 --> 00:23:38,520 Normal Form or Fourth Normal Form 692 00:23:38,760 --> 00:23:40,420 and we'll cover those in the next video.