1
00:00:00,570 --> 00:00:01,910
The last several videos talked about

2
00:00:02,260 --> 00:00:03,620
Boyce-Codd Normal form and Fourth

3
00:00:03,970 --> 00:00:06,300
Normal form as describing good relations.

4
00:00:07,330 --> 00:00:09,030
As a remainder Boyce-Codd normal form

5
00:00:09,330 --> 00:00:11,010
is based on functional dependencies and

6
00:00:11,270 --> 00:00:12,430
it is says whenever we have

7
00:00:12,620 --> 00:00:13,980
functional dependency on a relation

8
00:00:14,560 --> 00:00:15,860
the left hand side needs to

9
00:00:15,970 --> 00:00:17,420
be a key or contain a key.

10
00:00:18,220 --> 00:00:19,620
Fourth normal form uses multi

11
00:00:20,150 --> 00:00:21,600
value dependencies and says

12
00:00:21,870 --> 00:00:23,240
whenever we have a

13
00:00:23,280 --> 00:00:24,810
non-trivial multi value dependency,

14
00:00:25,750 --> 00:00:27,190
again the left-hand side is a key.

15
00:00:27,780 --> 00:00:29,450
And as a reminder, functional dependencies

16
00:00:30,180 --> 00:00:31,400
are also multi value dependencies,

17
00:00:32,300 --> 00:00:35,100
so Fourth normal form implies Boyce-Codd normal form.

18
00:00:36,640 --> 00:00:37,610
What we can see in this video is a few

19
00:00:37,910 --> 00:00:39,430
examples where Boyce-Codd Normal

20
00:00:39,790 --> 00:00:41,310
form or Fourth normal are actually

21
00:00:41,580 --> 00:00:45,320
not necessarily desirable for the specific relations we are designing.

22
00:00:46,700 --> 00:00:47,890
As usual we'll be using our

23
00:00:48,180 --> 00:00:50,260
college application information database, although

24
00:00:50,490 --> 00:00:51,980
each of our examples will be a bit different.

25
00:00:52,800 --> 00:00:55,390
For our first example let's take a look at the relation.

26
00:00:55,990 --> 00:00:56,990
And let's suppose that we

27
00:00:57,110 --> 00:00:58,540
have the student's Social Security number,

28
00:00:58,920 --> 00:00:59,780
the college that they are applying

29
00:01:00,230 --> 00:01:00,940
to, the date of the their application

30
00:01:01,590 --> 00:01:03,030
and the major they are applying for.

31
00:01:03,760 --> 00:01:05,010
And further more, let's say that

32
00:01:05,390 --> 00:01:06,740
each student can apply to each

33
00:01:06,900 --> 00:01:08,620
college only once and for one major.

34
00:01:09,180 --> 00:01:10,600
So that would be on one date for major.

35
00:01:11,830 --> 00:01:13,150
Furthermore, let's suppose that all

36
00:01:13,430 --> 00:01:14,620
of the colleges in our database

37
00:01:15,060 --> 00:01:17,020
have non-overlapping application dates.

38
00:01:17,580 --> 00:01:19,630
And this is obviously contrived for

39
00:01:19,840 --> 00:01:21,700
the specific example that we want to demonstrate.

40
00:01:22,870 --> 00:01:24,710
So under these assumptions,

41
00:01:24,860 --> 00:01:26,760
the functional dependencies for our

42
00:01:26,920 --> 00:01:28,400
relation will be that

43
00:01:28,640 --> 00:01:30,080
the social security number and

44
00:01:30,240 --> 00:01:33,300
the college name determine the date and the major.

45
00:01:34,090 --> 00:01:35,920
And so that's based on our first assumption.

46
00:01:37,050 --> 00:01:38,080
And we have one more functional dependency,

47
00:01:38,970 --> 00:01:40,920
which is that the date determines the

48
00:01:41,030 --> 00:01:42,420
college name and that's based

49
00:01:42,810 --> 00:01:44,000
on our second assumption.

50
00:01:44,800 --> 00:01:45,970
So if we have two tuples

51
00:01:46,350 --> 00:01:47,300
with the same date, they have to

52
00:01:47,410 --> 00:01:48,770
be the same college because colleges

53
00:01:49,220 --> 00:01:50,330
have non-overlapping dates.

54
00:01:51,310 --> 00:01:52,840
So using these functional dependencies

55
00:01:53,620 --> 00:01:54,580
we can determine that the key

56
00:01:55,050 --> 00:01:56,500
for the relation is Social Security

57
00:01:56,980 --> 00:01:58,070
number and college name together,

58
00:01:58,830 --> 00:02:00,620
since they determine the other two attributes.

59
00:02:01,600 --> 00:02:03,640
So is this relation in Boyce-Codd normal form?

60
00:02:03,900 --> 00:02:05,390
The answer is no, and the

61
00:02:05,450 --> 00:02:06,550
reason is the second functional

62
00:02:06,960 --> 00:02:09,300
dependency which does not have key on its left hand side.

63
00:02:10,140 --> 00:02:11,130
So let's follow the decomposition

64
00:02:11,880 --> 00:02:13,530
process and create two relations

65
00:02:14,320 --> 00:02:15,160
from our big relation.

66
00:02:16,070 --> 00:02:17,490
The first one will have the

67
00:02:18,070 --> 00:02:20,040
attributes from the violating functional dependency.

68
00:02:21,130 --> 00:02:22,020
And the second one will have

69
00:02:22,170 --> 00:02:25,210
all the remaining attributes plus the left hand side of the functional dependency.

70
00:02:26,130 --> 00:02:29,110
So that would be the social security number, the date, and the major.

71
00:02:30,320 --> 00:02:31,710
So did we think this is a good design?

72
00:02:33,270 --> 00:02:34,370
It certainly is in

73
00:02:34,440 --> 00:02:35,400
Boyce-Codd normal form but I would

74
00:02:35,610 --> 00:02:37,190
argue it is not really a good design.

75
00:02:38,280 --> 00:02:40,470
We do correctly factor out

76
00:02:40,750 --> 00:02:43,650
the information connecting the application dates and the colleges.

77
00:02:44,330 --> 00:02:45,650
On the other hand, we've now

78
00:02:46,060 --> 00:02:47,890
separated a student's application,

79
00:02:48,880 --> 00:02:49,820
the date, and the major of

80
00:02:49,890 --> 00:02:51,190
their application from the college

81
00:02:51,660 --> 00:02:53,050
they're applying to, so even

82
00:02:53,280 --> 00:02:54,570
to check this first functional

83
00:02:54,980 --> 00:02:56,230
dependency here, would require

84
00:02:56,800 --> 00:02:58,640
us to join these two relations together.

85
00:02:59,800 --> 00:03:00,880
Intuitively we might just prefer

86
00:03:01,330 --> 00:03:03,110
to keep that information together, even

87
00:03:03,400 --> 00:03:06,040
if we are capturing some information multiple times.

88
00:03:06,710 --> 00:03:09,550
So, I'd say not necessarily good design here.

89
00:03:10,110 --> 00:03:11,800
And there is another normal form,

90
00:03:12,140 --> 00:03:13,780
it's called third normal form

91
00:03:14,030 --> 00:03:14,970
and I'm not going to define

92
00:03:15,490 --> 00:03:16,730
it here, but third normal

93
00:03:17,040 --> 00:03:18,970
form actually allows this relation

94
00:03:19,460 --> 00:03:20,550
here to stay as is

95
00:03:20,730 --> 00:03:21,990
without being decomposed and you

96
00:03:22,160 --> 00:03:23,310
can read about third normal form

97
00:03:23,560 --> 00:03:24,890
in any of the readings mentioned on the website.

98
00:03:26,260 --> 00:03:27,440
Just to put third normal form

99
00:03:27,700 --> 00:03:30,340
in context, we'll go back to our Venn diagram.

100
00:03:30,860 --> 00:03:32,370
Remember that fourth normal form

101
00:03:32,690 --> 00:03:34,920
is stronger than Boyce-Codd Normal Form.

102
00:03:35,120 --> 00:03:36,250
So everything that's in fourth

103
00:03:36,500 --> 00:03:37,540
normal form is also in Boyce-Codd

104
00:03:38,000 --> 00:03:39,450
Normal Form, and third normal

105
00:03:39,770 --> 00:03:42,510
form is even weaker than Boyce- Codd Normal Form.

106
00:03:43,090 --> 00:03:44,570
So if we have a

107
00:03:44,810 --> 00:03:45,950
relation that's in third normal

108
00:03:46,380 --> 00:03:48,270
form it's not necessarily in

109
00:03:48,360 --> 00:03:49,380
Boyce-Codd Normal Form and here's an

110
00:03:49,460 --> 00:03:50,550
example of one that's not,

111
00:03:51,160 --> 00:03:52,120
but if we have something in

112
00:03:52,230 --> 00:03:54,130
Boyce-Codd Normal Form it is in third normal form.

113
00:03:54,340 --> 00:03:57,210
And again, you can read about third normal form on the readings from the website.

114
00:03:58,330 --> 00:04:00,440
For our second example lets go back to our student relation.

115
00:04:00,750 --> 00:04:01,400
And now we're going to have a

116
00:04:01,440 --> 00:04:02,830
student's Social Security number, the

117
00:04:03,020 --> 00:04:03,950
high schools that they've attended,

118
00:04:04,380 --> 00:04:06,640
their GPA and their priority for admission.

119
00:04:07,600 --> 00:04:08,870
And let's assume that students might

120
00:04:09,240 --> 00:04:10,540
attend multiple high schools, and

121
00:04:10,830 --> 00:04:12,770
that the priority is determined from the GPA.

122
00:04:13,770 --> 00:04:15,110
So in that case our functional

123
00:04:15,480 --> 00:04:17,410
dependencies in this case are

124
00:04:17,620 --> 00:04:19,100
social security number determines GPA.

125
00:04:19,560 --> 00:04:21,460
We're going to assume that students have one GPA.

126
00:04:22,240 --> 00:04:23,830
We'll have GPA determines priority

127
00:04:25,360 --> 00:04:26,200
and by the way we also

128
00:04:26,540 --> 00:04:27,660
have using the transitivity rule,

129
00:04:28,220 --> 00:04:29,900
social security number determines priority.

130
00:04:31,400 --> 00:04:32,430
The key for this

131
00:04:32,660 --> 00:04:33,770
relation is the combination

132
00:04:34,740 --> 00:04:37,260
of social security number and high school name.

133
00:04:37,480 --> 00:04:38,420
Because students could go to

134
00:04:38,480 --> 00:04:39,860
multiple high schools, but once

135
00:04:40,190 --> 00:04:41,930
we have the social security number, we

136
00:04:42,080 --> 00:04:43,830
do functionally determine through these

137
00:04:44,080 --> 00:04:45,900
functional dependencies, the remaining

138
00:04:46,610 --> 00:04:47,870
attributes, the priority and the GPA.

139
00:04:49,170 --> 00:04:50,280
So is this relation in BCNF?

140
00:04:51,030 --> 00:04:51,910
No, it's clearly not.

141
00:04:52,740 --> 00:04:53,930
And in fact, all three of

142
00:04:54,020 --> 00:04:55,940
our functional dependencies violate BCNF.

143
00:04:56,760 --> 00:04:57,830
So let's go ahead and do the decomposition.

144
00:04:58,850 --> 00:05:00,620
So we'll decompose into S1 and S2.

145
00:05:00,960 --> 00:05:03,320
Let's start with the social security number determines priority.

146
00:05:04,320 --> 00:05:07,060
So then we'll have social security number and priority in S1.

147
00:05:07,270 --> 00:05:09,260
And in the other one, we'll have

148
00:05:09,530 --> 00:05:11,100
all the remaining attributes and the

149
00:05:11,180 --> 00:05:12,320
left hand side - high

150
00:05:12,480 --> 00:05:14,480
school name and that would be the GPA.

151
00:05:15,890 --> 00:05:17,160
Then the second one still

152
00:05:17,410 --> 00:05:18,950
needs to be decomposed further, so

153
00:05:19,200 --> 00:05:21,620
that one will become S3 and S4.

154
00:05:22,750 --> 00:05:23,820
S3 will have the social

155
00:05:24,180 --> 00:05:25,350
security number and the GPA

156
00:05:25,710 --> 00:05:27,950
based on the first

157
00:05:28,150 --> 00:05:29,490
functional dependency and then

158
00:05:29,690 --> 00:05:30,780
S4 will have the social

159
00:05:31,100 --> 00:05:32,740
security number and the high school name.

160
00:05:33,010 --> 00:05:33,590
And then we take away S2.

161
00:05:33,850 --> 00:05:34,230
Now, do we think this is a good design?

162
00:05:34,280 --> 00:05:34,510
Again, I would say not necessarily.

163
00:05:34,560 --> 00:05:35,810
It doesn't quite feel right; for one thing we've lost the GPA to priority functional dependency.

164
00:05:49,720 --> 00:05:50,760
We can join the two relations

165
00:05:51,270 --> 00:05:52,660
S1 and S3 to check that

166
00:05:52,840 --> 00:05:54,130
dependency, but it would

167
00:05:54,500 --> 00:05:55,690
be nice if that dependency were

168
00:05:55,800 --> 00:05:56,960
together in a relation and

169
00:05:57,140 --> 00:05:58,330
in fact, in this case

170
00:05:58,860 --> 00:06:00,220
we can have a Boyce-Codd normal

171
00:06:00,680 --> 00:06:02,740
form relation that does maintain that dependency.

172
00:06:03,610 --> 00:06:04,850
So if we had made S1

173
00:06:05,140 --> 00:06:07,770
have GPA and priority

174
00:06:08,100 --> 00:06:09,680
together, then we wouldn't

175
00:06:09,870 --> 00:06:11,020
have needed S3 and we're actually

176
00:06:11,530 --> 00:06:12,770
still in Boyce-Codd normal form.

177
00:06:13,270 --> 00:06:14,140
So that would would have

178
00:06:14,300 --> 00:06:15,370
happened if we had started

179
00:06:15,900 --> 00:06:17,540
with the functional dependency which is

180
00:06:17,690 --> 00:06:19,050
also implied, that was

181
00:06:19,310 --> 00:06:21,500
SSN to GPA and priority.

182
00:06:22,290 --> 00:06:24,410
Actually way back in the decomposition algorithm.

183
00:06:24,870 --> 00:06:26,770
I mentioned that sometimes when people

184
00:06:27,250 --> 00:06:28,420
use that algorithm they try

185
00:06:28,600 --> 00:06:30,140
to start with the biggest functional

186
00:06:30,470 --> 00:06:32,970
dependencies they can to get a better end design.

187
00:06:33,810 --> 00:06:34,710
And this is a case where we

188
00:06:34,850 --> 00:06:36,490
would want to do that if

189
00:06:36,630 --> 00:06:38,920
we prefer to end up with a design with just these two relations.

190
00:06:39,380 --> 00:06:40,710
So overall with the first two relations that illustrated

191
00:06:40,840 --> 00:06:44,710
when you have Boyce-Codd Normal form

192
00:06:44,910 --> 00:06:46,400
and Fourth Normal form it's possible

193
00:06:46,990 --> 00:06:48,870
that after the decomposition there's no

194
00:06:49,230 --> 00:06:50,210
guarantee that all of our

195
00:06:50,370 --> 00:06:52,770
original dependencies can be checked on the decomposed relations.

196
00:06:53,520 --> 00:06:56,350
They may require joins of those relations in order to check them.

197
00:06:57,120 --> 00:06:58,810
Now, lets look at a different type of example.

198
00:06:59,890 --> 00:07:00,750
Let's suppose that we have

199
00:07:00,960 --> 00:07:02,260
a relation that contains the scores

200
00:07:02,860 --> 00:07:03,870
of students who are applying to

201
00:07:04,130 --> 00:07:05,170
college, so we'll have the

202
00:07:05,230 --> 00:07:06,970
student's Social Security numbers for and their name.

203
00:07:07,700 --> 00:07:08,990
And then SAT scores and

204
00:07:09,500 --> 00:07:10,250
ACT scores and they might

205
00:07:10,630 --> 00:07:12,040
have many SAT's and ACT's.

206
00:07:13,500 --> 00:07:15,200
So the only functional dependency we'd

207
00:07:15,330 --> 00:07:16,840
have in that case is

208
00:07:16,920 --> 00:07:18,150
the one that says social security

209
00:07:18,660 --> 00:07:20,120
number determines the student

210
00:07:20,570 --> 00:07:21,940
name, and as far

211
00:07:22,140 --> 00:07:23,720
as keys go there's no key.

212
00:07:24,550 --> 00:07:24,890
Now in this example we do have a multi-valued dependency.

213
00:07:26,000 --> 00:07:27,710
If a student has several

214
00:07:28,220 --> 00:07:29,780
SAT and ACT scores we

215
00:07:29,850 --> 00:07:31,430
can assume they're independent of each other.

216
00:07:32,160 --> 00:07:33,260
And so that's captured in the

217
00:07:33,380 --> 00:07:34,690
multi-valued dependency that says

218
00:07:35,550 --> 00:07:36,760
for a given student, identified

219
00:07:37,480 --> 00:07:38,840
by their social security number and

220
00:07:38,950 --> 00:07:41,060
name, if they

221
00:07:41,180 --> 00:07:42,420
have a set of SAT scores,

222
00:07:43,350 --> 00:07:46,210
those SAT scores are independent of the rest of the attributes.

223
00:07:46,600 --> 00:07:47,750
You remember that from the

224
00:07:47,910 --> 00:07:49,270
MVD video, which in

225
00:07:49,390 --> 00:07:51,230
this case would be ACT.

226
00:07:51,820 --> 00:07:53,130
In that case, we'll have for each student every

227
00:07:53,500 --> 00:07:55,690
combination of their SAT and ACT scores.

228
00:07:56,640 --> 00:07:58,520
So is this relations in fourth normal form?

229
00:07:58,900 --> 00:08:00,340
Well, clearly it's not;  we have

230
00:08:00,710 --> 00:08:02,520
a violating multi-value dependency here

231
00:08:02,670 --> 00:08:03,500
since it doesn't have a key

232
00:08:03,600 --> 00:08:06,010
on the left hand side and we also have a violating functional dependency.

233
00:08:07,050 --> 00:08:08,100
So let's quickly do the decomposition.

234
00:08:09,080 --> 00:08:10,110
We'll start with the multi-value dependency

235
00:08:11,080 --> 00:08:12,140
so that would give us in

236
00:08:12,400 --> 00:08:13,320
the first relation the attributes of

237
00:08:13,350 --> 00:08:16,070
the multi-value dependency and

238
00:08:16,210 --> 00:08:17,670
then it will give us the attributes

239
00:08:18,590 --> 00:08:20,750
that are remaining plus the ones on the left hand side.

240
00:08:21,070 --> 00:08:23,410
So social security number, student name, and ACT.

241
00:08:23,640 --> 00:08:26,050
So now we're done with the multivalue dependency.

242
00:08:26,800 --> 00:08:27,960
Actually in this case because it's now

243
00:08:28,260 --> 00:08:30,110
a trivial multivalue dependency when

244
00:08:30,270 --> 00:08:32,670
it covers all attributes, and

245
00:08:32,900 --> 00:08:33,670
now let's take a look at

246
00:08:33,730 --> 00:08:35,140
the functional dependency that's still

247
00:08:35,470 --> 00:08:36,520
a violation because the social

248
00:08:36,830 --> 00:08:38,180
security number is not a

249
00:08:38,360 --> 00:08:40,460
key on the left hand side for either of these relations.

250
00:08:41,460 --> 00:08:43,550
So, we'll split S1 into

251
00:08:44,180 --> 00:08:45,130
S3 and S4.

252
00:08:45,720 --> 00:08:46,810
S1 will continue the Social

253
00:08:46,990 --> 00:08:48,580
Security number and student name,

254
00:08:49,490 --> 00:08:50,950
and S4 will contain the Social

255
00:08:51,270 --> 00:08:52,560
Security number and the SAT.

256
00:08:53,330 --> 00:08:55,770
And actually similarly need to split S2.

257
00:08:55,990 --> 00:08:57,960
And that will give S5,

258
00:08:58,890 --> 00:09:00,300
which contains the Social Security

259
00:09:00,820 --> 00:09:03,040
number and the ACT score.

260
00:09:03,930 --> 00:09:05,870
And so now regard of S1 and S2.

261
00:09:06,210 --> 00:09:07,780
An just in case you're getting

262
00:09:07,960 --> 00:09:09,740
confused at this point this is our final schema. And

263
00:09:10,790 --> 00:09:13,590
this schema is in fourth normal form.

264
00:09:14,170 --> 00:09:17,020
And that's good, we've separated out all the relevant facts.

265
00:09:17,790 --> 00:09:19,700
So actually it feels like a pretty good schema.

266
00:09:20,770 --> 00:09:22,070
But let's suppose now that all

267
00:09:22,350 --> 00:09:23,970
of the the queries that we're going to run on these relations.

268
00:09:24,620 --> 00:09:26,680
Take a social security number and

269
00:09:26,900 --> 00:09:27,930
what they return is the student's

270
00:09:28,350 --> 00:09:29,800
name and some composite score

271
00:09:30,080 --> 00:09:31,560
that's based on all of their

272
00:09:31,830 --> 00:09:32,900
SAT and ACT scores and

273
00:09:33,080 --> 00:09:35,050
again let's suppose every single query does that.

274
00:09:35,780 --> 00:09:37,120
If that's the case, then every

275
00:09:37,490 --> 00:09:38,730
query is going to have to-recombine

276
00:09:40,130 --> 00:09:42,690
all of these three relations in order to compute it's results.

277
00:09:43,580 --> 00:09:44,590
So, we might have been better

278
00:09:44,860 --> 00:09:45,600
off, even if it included

279
00:09:46,190 --> 00:09:47,480
some redundancy and some

280
00:09:47,660 --> 00:09:49,320
additional storage with our

281
00:09:50,060 --> 00:09:52,430
original design because each query again is going to access all of the data in that one relation.

282
00:09:52,440 --> 00:09:52,470
So, there is something called a de-normalize relation.

283
00:10:02,100 --> 00:10:03,220
And when queries tend to

284
00:10:03,510 --> 00:10:04,840
access all attributes, when

285
00:10:05,130 --> 00:10:06,060
every query would be reassembling

286
00:10:06,780 --> 00:10:08,250
the normalized relations, it actually

287
00:10:08,560 --> 00:10:11,570
might be preferable to use one that is quote "de-normalized".

288
00:10:13,080 --> 00:10:14,070
And here's our final example.

289
00:10:14,590 --> 00:10:16,010
This time let's talk about college information.

290
00:10:16,640 --> 00:10:18,300
So, we have a college and the state it's in.

291
00:10:19,220 --> 00:10:19,870
Let's suppose we have some other information

292
00:10:20,400 --> 00:10:21,160
and we decide to put it in

293
00:10:21,270 --> 00:10:22,580
separate relations We have

294
00:10:22,810 --> 00:10:23,880
the size of the college, the

295
00:10:23,940 --> 00:10:24,900
number of students in the

296
00:10:24,970 --> 00:10:26,080
college, and we might

297
00:10:26,320 --> 00:10:27,890
have the average SAT score for

298
00:10:28,040 --> 00:10:29,620
each college, the average GPA for

299
00:10:29,720 --> 00:10:31,190
each college, and maybe some

300
00:10:31,330 --> 00:10:33,990
additional relations, each containing one fact about the college.

301
00:10:34,800 --> 00:10:36,090
So all of these relations in

302
00:10:36,240 --> 00:10:37,590
Boyce-Codd Normal form and First

303
00:10:37,690 --> 00:10:39,130
Normal form, yes they are,

304
00:10:39,450 --> 00:10:40,540
actually for each relation we

305
00:10:40,700 --> 00:10:42,130
have a functional dependency from the

306
00:10:42,190 --> 00:10:44,080
college name to the other attribute but the left side will be a key so were in a good shape.

307
00:10:50,310 --> 00:10:51,100
Is this a good design? once again I say not necessarily, we

308
00:10:51,260 --> 00:10:53,840
may not want to decompose so much.

309
00:10:54,180 --> 00:10:55,080
I sort of feel like this

310
00:10:55,270 --> 00:10:57,150
relation is too decomposed, because

311
00:10:57,500 --> 00:10:58,610
we can capture all of the

312
00:10:58,700 --> 00:11:00,080
information in one relation

313
00:11:00,590 --> 00:11:03,300
or a couple of relations still being in Boyce-Codd normal form.

314
00:11:03,790 --> 00:11:04,840
So one of the problems I

315
00:11:05,070 --> 00:11:05,980
actually like to give in my database

316
00:11:06,480 --> 00:11:08,080
class is to create

317
00:11:08,440 --> 00:11:10,530
a theory of composition that's

318
00:11:10,750 --> 00:11:11,660
sort of a compliment to the theory

319
00:11:12,000 --> 00:11:13,580
of decomposition, figuring out when

320
00:11:13,720 --> 00:11:15,650
you can take multiple relations and

321
00:11:15,890 --> 00:11:17,120
combine them while still

322
00:11:17,460 --> 00:11:18,990
staying in a desirable normal form.

323
00:11:19,260 --> 00:11:20,130
So you might give that a thought.

324
00:11:21,020 --> 00:11:22,370
So in conclusion, when designing a

325
00:11:22,430 --> 00:11:23,690
database schema, there are often

326
00:11:23,860 --> 00:11:25,000
many different designs that are

327
00:11:25,100 --> 00:11:25,970
possible, some of the designs

328
00:11:26,520 --> 00:11:27,630
are much better than others, and

329
00:11:27,690 --> 00:11:28,500
we have to have a some way

330
00:11:28,700 --> 00:11:31,270
of choosing the design we want to use for our application.

331
00:11:32,130 --> 00:11:33,200
We do have a very nice

332
00:11:33,540 --> 00:11:34,840
theory for relational database design

333
00:11:35,290 --> 00:11:37,310
that we've gone through in the past several videos.

334
00:11:38,220 --> 00:11:40,980
We define normal forms that tell us when we have good relations.

335
00:11:42,050 --> 00:11:43,080
We have a process for designing

336
00:11:43,600 --> 00:11:45,090
by decomposition, and it's usually

337
00:11:45,490 --> 00:11:47,130
quite intuitive and works well.

338
00:11:47,970 --> 00:11:49,700
There are however, some shortcomings to

339
00:11:49,770 --> 00:11:50,850
the normal forms that we've defined,

340
00:11:51,400 --> 00:11:52,550
and in this video we saw

341
00:11:52,760 --> 00:11:54,430
some specific shortcomings involving and

342
00:11:54,760 --> 00:11:56,640
forcing dependencies in decomposed relations,

343
00:11:57,700 --> 00:11:58,790
about considering what the query

344
00:11:59,110 --> 00:12:00,240
workload might look like when

345
00:12:00,420 --> 00:12:02,100
one does one's database design and

346
00:12:02,310 --> 00:12:05,220
finally, the possibility of over decomposing one's relations unnecessarily.