1 00:00:00,570 --> 00:00:01,910 The last several videos talked about 2 00:00:02,260 --> 00:00:03,620 Boyce-Codd Normal form and Fourth 3 00:00:03,970 --> 00:00:06,300 Normal form as describing good relations. 4 00:00:07,330 --> 00:00:09,030 As a remainder Boyce-Codd normal form 5 00:00:09,330 --> 00:00:11,010 is based on functional dependencies and 6 00:00:11,270 --> 00:00:12,430 it is says whenever we have 7 00:00:12,620 --> 00:00:13,980 functional dependency on a relation 8 00:00:14,560 --> 00:00:15,860 the left hand side needs to 9 00:00:15,970 --> 00:00:17,420 be a key or contain a key. 10 00:00:18,220 --> 00:00:19,620 Fourth normal form uses multi 11 00:00:20,150 --> 00:00:21,600 value dependencies and says 12 00:00:21,870 --> 00:00:23,240 whenever we have a 13 00:00:23,280 --> 00:00:24,810 non-trivial multi value dependency, 14 00:00:25,750 --> 00:00:27,190 again the left-hand side is a key. 15 00:00:27,780 --> 00:00:29,450 And as a reminder, functional dependencies 16 00:00:30,180 --> 00:00:31,400 are also multi value dependencies, 17 00:00:32,300 --> 00:00:35,100 so Fourth normal form implies Boyce-Codd normal form. 18 00:00:36,640 --> 00:00:37,610 What we can see in this video is a few 19 00:00:37,910 --> 00:00:39,430 examples where Boyce-Codd Normal 20 00:00:39,790 --> 00:00:41,310 form or Fourth normal are actually 21 00:00:41,580 --> 00:00:45,320 not necessarily desirable for the specific relations we are designing. 22 00:00:46,700 --> 00:00:47,890 As usual we'll be using our 23 00:00:48,180 --> 00:00:50,260 college application information database, although 24 00:00:50,490 --> 00:00:51,980 each of our examples will be a bit different. 25 00:00:52,800 --> 00:00:55,390 For our first example let's take a look at the relation. 26 00:00:55,990 --> 00:00:56,990 And let's suppose that we 27 00:00:57,110 --> 00:00:58,540 have the student's Social Security number, 28 00:00:58,920 --> 00:00:59,780 the college that they are applying 29 00:01:00,230 --> 00:01:00,940 to, the date of the their application 30 00:01:01,590 --> 00:01:03,030 and the major they are applying for. 31 00:01:03,760 --> 00:01:05,010 And further more, let's say that 32 00:01:05,390 --> 00:01:06,740 each student can apply to each 33 00:01:06,900 --> 00:01:08,620 college only once and for one major. 34 00:01:09,180 --> 00:01:10,600 So that would be on one date for major. 35 00:01:11,830 --> 00:01:13,150 Furthermore, let's suppose that all 36 00:01:13,430 --> 00:01:14,620 of the colleges in our database 37 00:01:15,060 --> 00:01:17,020 have non-overlapping application dates. 38 00:01:17,580 --> 00:01:19,630 And this is obviously contrived for 39 00:01:19,840 --> 00:01:21,700 the specific example that we want to demonstrate. 40 00:01:22,870 --> 00:01:24,710 So under these assumptions, 41 00:01:24,860 --> 00:01:26,760 the functional dependencies for our 42 00:01:26,920 --> 00:01:28,400 relation will be that 43 00:01:28,640 --> 00:01:30,080 the social security number and 44 00:01:30,240 --> 00:01:33,300 the college name determine the date and the major. 45 00:01:34,090 --> 00:01:35,920 And so that's based on our first assumption. 46 00:01:37,050 --> 00:01:38,080 And we have one more functional dependency, 47 00:01:38,970 --> 00:01:40,920 which is that the date determines the 48 00:01:41,030 --> 00:01:42,420 college name and that's based 49 00:01:42,810 --> 00:01:44,000 on our second assumption. 50 00:01:44,800 --> 00:01:45,970 So if we have two tuples 51 00:01:46,350 --> 00:01:47,300 with the same date, they have to 52 00:01:47,410 --> 00:01:48,770 be the same college because colleges 53 00:01:49,220 --> 00:01:50,330 have non-overlapping dates. 54 00:01:51,310 --> 00:01:52,840 So using these functional dependencies 55 00:01:53,620 --> 00:01:54,580 we can determine that the key 56 00:01:55,050 --> 00:01:56,500 for the relation is Social Security 57 00:01:56,980 --> 00:01:58,070 number and college name together, 58 00:01:58,830 --> 00:02:00,620 since they determine the other two attributes. 59 00:02:01,600 --> 00:02:03,640 So is this relation in Boyce-Codd normal form? 60 00:02:03,900 --> 00:02:05,390 The answer is no, and the 61 00:02:05,450 --> 00:02:06,550 reason is the second functional 62 00:02:06,960 --> 00:02:09,300 dependency which does not have key on its left hand side. 63 00:02:10,140 --> 00:02:11,130 So let's follow the decomposition 64 00:02:11,880 --> 00:02:13,530 process and create two relations 65 00:02:14,320 --> 00:02:15,160 from our big relation. 66 00:02:16,070 --> 00:02:17,490 The first one will have the 67 00:02:18,070 --> 00:02:20,040 attributes from the violating functional dependency. 68 00:02:21,130 --> 00:02:22,020 And the second one will have 69 00:02:22,170 --> 00:02:25,210 all the remaining attributes plus the left hand side of the functional dependency. 70 00:02:26,130 --> 00:02:29,110 So that would be the social security number, the date, and the major. 71 00:02:30,320 --> 00:02:31,710 So did we think this is a good design? 72 00:02:33,270 --> 00:02:34,370 It certainly is in 73 00:02:34,440 --> 00:02:35,400 Boyce-Codd normal form but I would 74 00:02:35,610 --> 00:02:37,190 argue it is not really a good design. 75 00:02:38,280 --> 00:02:40,470 We do correctly factor out 76 00:02:40,750 --> 00:02:43,650 the information connecting the application dates and the colleges. 77 00:02:44,330 --> 00:02:45,650 On the other hand, we've now 78 00:02:46,060 --> 00:02:47,890 separated a student's application, 79 00:02:48,880 --> 00:02:49,820 the date, and the major of 80 00:02:49,890 --> 00:02:51,190 their application from the college 81 00:02:51,660 --> 00:02:53,050 they're applying to, so even 82 00:02:53,280 --> 00:02:54,570 to check this first functional 83 00:02:54,980 --> 00:02:56,230 dependency here, would require 84 00:02:56,800 --> 00:02:58,640 us to join these two relations together. 85 00:02:59,800 --> 00:03:00,880 Intuitively we might just prefer 86 00:03:01,330 --> 00:03:03,110 to keep that information together, even 87 00:03:03,400 --> 00:03:06,040 if we are capturing some information multiple times. 88 00:03:06,710 --> 00:03:09,550 So, I'd say not necessarily good design here. 89 00:03:10,110 --> 00:03:11,800 And there is another normal form, 90 00:03:12,140 --> 00:03:13,780 it's called third normal form 91 00:03:14,030 --> 00:03:14,970 and I'm not going to define 92 00:03:15,490 --> 00:03:16,730 it here, but third normal 93 00:03:17,040 --> 00:03:18,970 form actually allows this relation 94 00:03:19,460 --> 00:03:20,550 here to stay as is 95 00:03:20,730 --> 00:03:21,990 without being decomposed and you 96 00:03:22,160 --> 00:03:23,310 can read about third normal form 97 00:03:23,560 --> 00:03:24,890 in any of the readings mentioned on the website. 98 00:03:26,260 --> 00:03:27,440 Just to put third normal form 99 00:03:27,700 --> 00:03:30,340 in context, we'll go back to our Venn diagram. 100 00:03:30,860 --> 00:03:32,370 Remember that fourth normal form 101 00:03:32,690 --> 00:03:34,920 is stronger than Boyce-Codd Normal Form. 102 00:03:35,120 --> 00:03:36,250 So everything that's in fourth 103 00:03:36,500 --> 00:03:37,540 normal form is also in Boyce-Codd 104 00:03:38,000 --> 00:03:39,450 Normal Form, and third normal 105 00:03:39,770 --> 00:03:42,510 form is even weaker than Boyce- Codd Normal Form. 106 00:03:43,090 --> 00:03:44,570 So if we have a 107 00:03:44,810 --> 00:03:45,950 relation that's in third normal 108 00:03:46,380 --> 00:03:48,270 form it's not necessarily in 109 00:03:48,360 --> 00:03:49,380 Boyce-Codd Normal Form and here's an 110 00:03:49,460 --> 00:03:50,550 example of one that's not, 111 00:03:51,160 --> 00:03:52,120 but if we have something in 112 00:03:52,230 --> 00:03:54,130 Boyce-Codd Normal Form it is in third normal form. 113 00:03:54,340 --> 00:03:57,210 And again, you can read about third normal form on the readings from the website. 114 00:03:58,330 --> 00:04:00,440 For our second example lets go back to our student relation. 115 00:04:00,750 --> 00:04:01,400 And now we're going to have a 116 00:04:01,440 --> 00:04:02,830 student's Social Security number, the 117 00:04:03,020 --> 00:04:03,950 high schools that they've attended, 118 00:04:04,380 --> 00:04:06,640 their GPA and their priority for admission. 119 00:04:07,600 --> 00:04:08,870 And let's assume that students might 120 00:04:09,240 --> 00:04:10,540 attend multiple high schools, and 121 00:04:10,830 --> 00:04:12,770 that the priority is determined from the GPA. 122 00:04:13,770 --> 00:04:15,110 So in that case our functional 123 00:04:15,480 --> 00:04:17,410 dependencies in this case are 124 00:04:17,620 --> 00:04:19,100 social security number determines GPA. 125 00:04:19,560 --> 00:04:21,460 We're going to assume that students have one GPA. 126 00:04:22,240 --> 00:04:23,830 We'll have GPA determines priority 127 00:04:25,360 --> 00:04:26,200 and by the way we also 128 00:04:26,540 --> 00:04:27,660 have using the transitivity rule, 129 00:04:28,220 --> 00:04:29,900 social security number determines priority. 130 00:04:31,400 --> 00:04:32,430 The key for this 131 00:04:32,660 --> 00:04:33,770 relation is the combination 132 00:04:34,740 --> 00:04:37,260 of social security number and high school name. 133 00:04:37,480 --> 00:04:38,420 Because students could go to 134 00:04:38,480 --> 00:04:39,860 multiple high schools, but once 135 00:04:40,190 --> 00:04:41,930 we have the social security number, we 136 00:04:42,080 --> 00:04:43,830 do functionally determine through these 137 00:04:44,080 --> 00:04:45,900 functional dependencies, the remaining 138 00:04:46,610 --> 00:04:47,870 attributes, the priority and the GPA. 139 00:04:49,170 --> 00:04:50,280 So is this relation in BCNF? 140 00:04:51,030 --> 00:04:51,910 No, it's clearly not. 141 00:04:52,740 --> 00:04:53,930 And in fact, all three of 142 00:04:54,020 --> 00:04:55,940 our functional dependencies violate BCNF. 143 00:04:56,760 --> 00:04:57,830 So let's go ahead and do the decomposition. 144 00:04:58,850 --> 00:05:00,620 So we'll decompose into S1 and S2. 145 00:05:00,960 --> 00:05:03,320 Let's start with the social security number determines priority. 146 00:05:04,320 --> 00:05:07,060 So then we'll have social security number and priority in S1. 147 00:05:07,270 --> 00:05:09,260 And in the other one, we'll have 148 00:05:09,530 --> 00:05:11,100 all the remaining attributes and the 149 00:05:11,180 --> 00:05:12,320 left hand side - high 150 00:05:12,480 --> 00:05:14,480 school name and that would be the GPA. 151 00:05:15,890 --> 00:05:17,160 Then the second one still 152 00:05:17,410 --> 00:05:18,950 needs to be decomposed further, so 153 00:05:19,200 --> 00:05:21,620 that one will become S3 and S4. 154 00:05:22,750 --> 00:05:23,820 S3 will have the social 155 00:05:24,180 --> 00:05:25,350 security number and the GPA 156 00:05:25,710 --> 00:05:27,950 based on the first 157 00:05:28,150 --> 00:05:29,490 functional dependency and then 158 00:05:29,690 --> 00:05:30,780 S4 will have the social 159 00:05:31,100 --> 00:05:32,740 security number and the high school name. 160 00:05:33,010 --> 00:05:33,590 And then we take away S2. 161 00:05:33,850 --> 00:05:34,230 Now, do we think this is a good design? 162 00:05:34,280 --> 00:05:34,510 Again, I would say not necessarily. 163 00:05:34,560 --> 00:05:35,810 It doesn't quite feel right; for one thing we've lost the GPA to priority functional dependency. 164 00:05:49,720 --> 00:05:50,760 We can join the two relations 165 00:05:51,270 --> 00:05:52,660 S1 and S3 to check that 166 00:05:52,840 --> 00:05:54,130 dependency, but it would 167 00:05:54,500 --> 00:05:55,690 be nice if that dependency were 168 00:05:55,800 --> 00:05:56,960 together in a relation and 169 00:05:57,140 --> 00:05:58,330 in fact, in this case 170 00:05:58,860 --> 00:06:00,220 we can have a Boyce-Codd normal 171 00:06:00,680 --> 00:06:02,740 form relation that does maintain that dependency. 172 00:06:03,610 --> 00:06:04,850 So if we had made S1 173 00:06:05,140 --> 00:06:07,770 have GPA and priority 174 00:06:08,100 --> 00:06:09,680 together, then we wouldn't 175 00:06:09,870 --> 00:06:11,020 have needed S3 and we're actually 176 00:06:11,530 --> 00:06:12,770 still in Boyce-Codd normal form. 177 00:06:13,270 --> 00:06:14,140 So that would would have 178 00:06:14,300 --> 00:06:15,370 happened if we had started 179 00:06:15,900 --> 00:06:17,540 with the functional dependency which is 180 00:06:17,690 --> 00:06:19,050 also implied, that was 181 00:06:19,310 --> 00:06:21,500 SSN to GPA and priority. 182 00:06:22,290 --> 00:06:24,410 Actually way back in the decomposition algorithm. 183 00:06:24,870 --> 00:06:26,770 I mentioned that sometimes when people 184 00:06:27,250 --> 00:06:28,420 use that algorithm they try 185 00:06:28,600 --> 00:06:30,140 to start with the biggest functional 186 00:06:30,470 --> 00:06:32,970 dependencies they can to get a better end design. 187 00:06:33,810 --> 00:06:34,710 And this is a case where we 188 00:06:34,850 --> 00:06:36,490 would want to do that if 189 00:06:36,630 --> 00:06:38,920 we prefer to end up with a design with just these two relations. 190 00:06:39,380 --> 00:06:40,710 So overall with the first two relations that illustrated 191 00:06:40,840 --> 00:06:44,710 when you have Boyce-Codd Normal form 192 00:06:44,910 --> 00:06:46,400 and Fourth Normal form it's possible 193 00:06:46,990 --> 00:06:48,870 that after the decomposition there's no 194 00:06:49,230 --> 00:06:50,210 guarantee that all of our 195 00:06:50,370 --> 00:06:52,770 original dependencies can be checked on the decomposed relations. 196 00:06:53,520 --> 00:06:56,350 They may require joins of those relations in order to check them. 197 00:06:57,120 --> 00:06:58,810 Now, lets look at a different type of example. 198 00:06:59,890 --> 00:07:00,750 Let's suppose that we have 199 00:07:00,960 --> 00:07:02,260 a relation that contains the scores 200 00:07:02,860 --> 00:07:03,870 of students who are applying to 201 00:07:04,130 --> 00:07:05,170 college, so we'll have the 202 00:07:05,230 --> 00:07:06,970 student's Social Security numbers for and their name. 203 00:07:07,700 --> 00:07:08,990 And then SAT scores and 204 00:07:09,500 --> 00:07:10,250 ACT scores and they might 205 00:07:10,630 --> 00:07:12,040 have many SAT's and ACT's. 206 00:07:13,500 --> 00:07:15,200 So the only functional dependency we'd 207 00:07:15,330 --> 00:07:16,840 have in that case is 208 00:07:16,920 --> 00:07:18,150 the one that says social security 209 00:07:18,660 --> 00:07:20,120 number determines the student 210 00:07:20,570 --> 00:07:21,940 name, and as far 211 00:07:22,140 --> 00:07:23,720 as keys go there's no key. 212 00:07:24,550 --> 00:07:24,890 Now in this example we do have a multi-valued dependency. 213 00:07:26,000 --> 00:07:27,710 If a student has several 214 00:07:28,220 --> 00:07:29,780 SAT and ACT scores we 215 00:07:29,850 --> 00:07:31,430 can assume they're independent of each other. 216 00:07:32,160 --> 00:07:33,260 And so that's captured in the 217 00:07:33,380 --> 00:07:34,690 multi-valued dependency that says 218 00:07:35,550 --> 00:07:36,760 for a given student, identified 219 00:07:37,480 --> 00:07:38,840 by their social security number and 220 00:07:38,950 --> 00:07:41,060 name, if they 221 00:07:41,180 --> 00:07:42,420 have a set of SAT scores, 222 00:07:43,350 --> 00:07:46,210 those SAT scores are independent of the rest of the attributes. 223 00:07:46,600 --> 00:07:47,750 You remember that from the 224 00:07:47,910 --> 00:07:49,270 MVD video, which in 225 00:07:49,390 --> 00:07:51,230 this case would be ACT. 226 00:07:51,820 --> 00:07:53,130 In that case, we'll have for each student every 227 00:07:53,500 --> 00:07:55,690 combination of their SAT and ACT scores. 228 00:07:56,640 --> 00:07:58,520 So is this relations in fourth normal form? 229 00:07:58,900 --> 00:08:00,340 Well, clearly it's not; we have 230 00:08:00,710 --> 00:08:02,520 a violating multi-value dependency here 231 00:08:02,670 --> 00:08:03,500 since it doesn't have a key 232 00:08:03,600 --> 00:08:06,010 on the left hand side and we also have a violating functional dependency. 233 00:08:07,050 --> 00:08:08,100 So let's quickly do the decomposition. 234 00:08:09,080 --> 00:08:10,110 We'll start with the multi-value dependency 235 00:08:11,080 --> 00:08:12,140 so that would give us in 236 00:08:12,400 --> 00:08:13,320 the first relation the attributes of 237 00:08:13,350 --> 00:08:16,070 the multi-value dependency and 238 00:08:16,210 --> 00:08:17,670 then it will give us the attributes 239 00:08:18,590 --> 00:08:20,750 that are remaining plus the ones on the left hand side. 240 00:08:21,070 --> 00:08:23,410 So social security number, student name, and ACT. 241 00:08:23,640 --> 00:08:26,050 So now we're done with the multivalue dependency. 242 00:08:26,800 --> 00:08:27,960 Actually in this case because it's now 243 00:08:28,260 --> 00:08:30,110 a trivial multivalue dependency when 244 00:08:30,270 --> 00:08:32,670 it covers all attributes, and 245 00:08:32,900 --> 00:08:33,670 now let's take a look at 246 00:08:33,730 --> 00:08:35,140 the functional dependency that's still 247 00:08:35,470 --> 00:08:36,520 a violation because the social 248 00:08:36,830 --> 00:08:38,180 security number is not a 249 00:08:38,360 --> 00:08:40,460 key on the left hand side for either of these relations. 250 00:08:41,460 --> 00:08:43,550 So, we'll split S1 into 251 00:08:44,180 --> 00:08:45,130 S3 and S4. 252 00:08:45,720 --> 00:08:46,810 S1 will continue the Social 253 00:08:46,990 --> 00:08:48,580 Security number and student name, 254 00:08:49,490 --> 00:08:50,950 and S4 will contain the Social 255 00:08:51,270 --> 00:08:52,560 Security number and the SAT. 256 00:08:53,330 --> 00:08:55,770 And actually similarly need to split S2. 257 00:08:55,990 --> 00:08:57,960 And that will give S5, 258 00:08:58,890 --> 00:09:00,300 which contains the Social Security 259 00:09:00,820 --> 00:09:03,040 number and the ACT score. 260 00:09:03,930 --> 00:09:05,870 And so now regard of S1 and S2. 261 00:09:06,210 --> 00:09:07,780 An just in case you're getting 262 00:09:07,960 --> 00:09:09,740 confused at this point this is our final schema. And 263 00:09:10,790 --> 00:09:13,590 this schema is in fourth normal form. 264 00:09:14,170 --> 00:09:17,020 And that's good, we've separated out all the relevant facts. 265 00:09:17,790 --> 00:09:19,700 So actually it feels like a pretty good schema. 266 00:09:20,770 --> 00:09:22,070 But let's suppose now that all 267 00:09:22,350 --> 00:09:23,970 of the the queries that we're going to run on these relations. 268 00:09:24,620 --> 00:09:26,680 Take a social security number and 269 00:09:26,900 --> 00:09:27,930 what they return is the student's 270 00:09:28,350 --> 00:09:29,800 name and some composite score 271 00:09:30,080 --> 00:09:31,560 that's based on all of their 272 00:09:31,830 --> 00:09:32,900 SAT and ACT scores and 273 00:09:33,080 --> 00:09:35,050 again let's suppose every single query does that. 274 00:09:35,780 --> 00:09:37,120 If that's the case, then every 275 00:09:37,490 --> 00:09:38,730 query is going to have to-recombine 276 00:09:40,130 --> 00:09:42,690 all of these three relations in order to compute it's results. 277 00:09:43,580 --> 00:09:44,590 So, we might have been better 278 00:09:44,860 --> 00:09:45,600 off, even if it included 279 00:09:46,190 --> 00:09:47,480 some redundancy and some 280 00:09:47,660 --> 00:09:49,320 additional storage with our 281 00:09:50,060 --> 00:09:52,430 original design because each query again is going to access all of the data in that one relation. 282 00:09:52,440 --> 00:09:52,470 So, there is something called a de-normalize relation. 283 00:10:02,100 --> 00:10:03,220 And when queries tend to 284 00:10:03,510 --> 00:10:04,840 access all attributes, when 285 00:10:05,130 --> 00:10:06,060 every query would be reassembling 286 00:10:06,780 --> 00:10:08,250 the normalized relations, it actually 287 00:10:08,560 --> 00:10:11,570 might be preferable to use one that is quote "de-normalized". 288 00:10:13,080 --> 00:10:14,070 And here's our final example. 289 00:10:14,590 --> 00:10:16,010 This time let's talk about college information. 290 00:10:16,640 --> 00:10:18,300 So, we have a college and the state it's in. 291 00:10:19,220 --> 00:10:19,870 Let's suppose we have some other information 292 00:10:20,400 --> 00:10:21,160 and we decide to put it in 293 00:10:21,270 --> 00:10:22,580 separate relations We have 294 00:10:22,810 --> 00:10:23,880 the size of the college, the 295 00:10:23,940 --> 00:10:24,900 number of students in the 296 00:10:24,970 --> 00:10:26,080 college, and we might 297 00:10:26,320 --> 00:10:27,890 have the average SAT score for 298 00:10:28,040 --> 00:10:29,620 each college, the average GPA for 299 00:10:29,720 --> 00:10:31,190 each college, and maybe some 300 00:10:31,330 --> 00:10:33,990 additional relations, each containing one fact about the college. 301 00:10:34,800 --> 00:10:36,090 So all of these relations in 302 00:10:36,240 --> 00:10:37,590 Boyce-Codd Normal form and First 303 00:10:37,690 --> 00:10:39,130 Normal form, yes they are, 304 00:10:39,450 --> 00:10:40,540 actually for each relation we 305 00:10:40,700 --> 00:10:42,130 have a functional dependency from the 306 00:10:42,190 --> 00:10:44,080 college name to the other attribute but the left side will be a key so were in a good shape. 307 00:10:50,310 --> 00:10:51,100 Is this a good design? once again I say not necessarily, we 308 00:10:51,260 --> 00:10:53,840 may not want to decompose so much. 309 00:10:54,180 --> 00:10:55,080 I sort of feel like this 310 00:10:55,270 --> 00:10:57,150 relation is too decomposed, because 311 00:10:57,500 --> 00:10:58,610 we can capture all of the 312 00:10:58,700 --> 00:11:00,080 information in one relation 313 00:11:00,590 --> 00:11:03,300 or a couple of relations still being in Boyce-Codd normal form. 314 00:11:03,790 --> 00:11:04,840 So one of the problems I 315 00:11:05,070 --> 00:11:05,980 actually like to give in my database 316 00:11:06,480 --> 00:11:08,080 class is to create 317 00:11:08,440 --> 00:11:10,530 a theory of composition that's 318 00:11:10,750 --> 00:11:11,660 sort of a compliment to the theory 319 00:11:12,000 --> 00:11:13,580 of decomposition, figuring out when 320 00:11:13,720 --> 00:11:15,650 you can take multiple relations and 321 00:11:15,890 --> 00:11:17,120 combine them while still 322 00:11:17,460 --> 00:11:18,990 staying in a desirable normal form. 323 00:11:19,260 --> 00:11:20,130 So you might give that a thought. 324 00:11:21,020 --> 00:11:22,370 So in conclusion, when designing a 325 00:11:22,430 --> 00:11:23,690 database schema, there are often 326 00:11:23,860 --> 00:11:25,000 many different designs that are 327 00:11:25,100 --> 00:11:25,970 possible, some of the designs 328 00:11:26,520 --> 00:11:27,630 are much better than others, and 329 00:11:27,690 --> 00:11:28,500 we have to have a some way 330 00:11:28,700 --> 00:11:31,270 of choosing the design we want to use for our application. 331 00:11:32,130 --> 00:11:33,200 We do have a very nice 332 00:11:33,540 --> 00:11:34,840 theory for relational database design 333 00:11:35,290 --> 00:11:37,310 that we've gone through in the past several videos. 334 00:11:38,220 --> 00:11:40,980 We define normal forms that tell us when we have good relations. 335 00:11:42,050 --> 00:11:43,080 We have a process for designing 336 00:11:43,600 --> 00:11:45,090 by decomposition, and it's usually 337 00:11:45,490 --> 00:11:47,130 quite intuitive and works well. 338 00:11:47,970 --> 00:11:49,700 There are however, some shortcomings to 339 00:11:49,770 --> 00:11:50,850 the normal forms that we've defined, 340 00:11:51,400 --> 00:11:52,550 and in this video we saw 341 00:11:52,760 --> 00:11:54,430 some specific shortcomings involving and 342 00:11:54,760 --> 00:11:56,640 forcing dependencies in decomposed relations, 343 00:11:57,700 --> 00:11:58,790 about considering what the query 344 00:11:59,110 --> 00:12:00,240 workload might look like when 345 00:12:00,420 --> 00:12:02,100 one does one's database design and 346 00:12:02,310 --> 00:12:05,220 finally, the possibility of over decomposing one's relations unnecessarily.