In this video we're going to learn about translating designs in the UML modelling inter-relational schemas. As a reminder if the application developer is using a high level design model then they will create their design usually using a graphical language that's sent to a translator and the translator will then produce a schema using the data model of the underline database management system, which frequently is relational. So, in our case we're looking at high level design in the UML data modeling subset which is then translated to relations and installed in a relational database system. It's this translation process that's the topic of this video. In the data model of subset of UML, we have five concepts that we learned in the previous video and we'll see now that designs using these concepts can be translated to relations automatically with one provision which is that every regular class has a key. And we'll see what we mean by a regular class in a moment. So, what we're going to do is go through each one of these concepts one at a time and show show how the concepts are translated to relations. For our examples we'll use primarily the same examples that we used in the previous video where we introduced UML. For classes will have a student class and a college class. Our students have a student ID which is the primary key, a student name and a GPA and our colleges will have a college name which we'll assume is unique, so that's the primary key, the state and the enrollment. The translation from classes to relations is absolutely direct we take the student class and we make a relation called student, we put the three attributes down for that relation and then the primary key from the student class becomes the key attribute in that relation. And similarly our college relation has the college name as a key, the state, and the enrollment. So all we're doing to turn classes into relation is basically turning them side ways. So now we're done with classes and let's move on to associations. We'll see that each association is going to be translated to its own relation that grabs a key from each side of the association so let's go to our example, the same one we used earlier, that has students applying to colleges. So, here's our diagram, with students applying to colleges and we'll already have generated the two relations for our classes, the student and the college relation and for the association applied between students and colleges we create a relation, we call it applied and that relation has attributes the keys from both sides so that would be the student ID from the student class and the college name from the college class. And if you think about it it makes a lot of sense. Applied is capturing relationships between students and colleges so we'll capture each of those relationships with one tuple in the applied relation identifying the student with the key and the college with the key. Now one question you might have is what is the key for a relation that we generate from an association, it turns out that depends on the concept of multiplicity that we learned about in the previous video so let me setup a generic example of two classes with an association and then and we'll talk about multiplicity and keys. Ok, so here we have both classes. We'll call them C1 and C2. And each one has a key attribute, we called it K1 and K2 over here and then one other attribute 01 and 02 and then we have an association A between the two classes. So in terms of their relations that will be generated, we'll have for C1, K1-01 with K1 as a key for C2 we'll have K2-02 with K2 as key and then for our association A, we'll grab the key from each side, K1 and K2 and the question we have now is, what is the key for relation A? Well, as a default the key can be both attributes together, but in certain cases we can be more specific, and again that's going to depend on the multiplicity on our association. Let's suppose that our multiplicity is 0..1 on the left hand side and a star on the right hand side and let's look at what that represents in terms of the type of relationship we're capturing. And, I'll make sure get it right this time, so what this is saying here is that if we have objects of C1 on the left hand side, each one can be related to many objects of C2 on the right hand side, but each object of C2 can be only be related to at most one object of C1, and some of them might not be related to any. So remember now, it's these edges that are being captured by the tuples in our relation for the association, and we can see that each element on the right hand side can only be involved in at most one tuple of A, so that tells us that K2 is actually a key for A. So when we have zero dot dot one or one dot dot one, for that matter. On the left side, or on one side of an association, then the key attribute from the other side is a key for the association. So lets test out that rule on our student and college association and see if it makes sense. So I've drawn the association here and we're interested in the relation that's going to be generated for the association itself which will be the applied relation with the student ID on one side from one side and the college name from the other, so that's the relation we're going to generate for the association and the question is what's key for that relation. Well, let's add the constraint that every student applies to exactly one college. So, that would be a one dot dot one on the right and then a star on the left if a college can have any number of applicants. So, our rule of thumb, it's actually a rule, said that if we have one dot dot one or zero dot one dot dot one on one side then the other side would be the key. So that would tell us that if we have this one dot dot one on the right that student ID would be a key for applied and indeed that makes complete sense if each student can only apply to one college, then, they will indeed be a in the applied relation. Now there's actually a related concept here where we might need a relation for associations at all and again, that depends on multiplicity and again let's start with a generic example. so here's our generic example and from this our standard translation would give us three relations, one for C1, one for C2 and then one capturing the association A which would have a key from each side. Now, what we're going to discover, is that in come cases it's actually possible to fold relations A into either C1 or C2 will end up with just two relation instead of three and will be capturing the same information. So let's suppose we have a 1..1 and on our left hand side so again we're going to have the situation where from the left side to the right we have a fan out so each one on the right is related to exactly one on the left. If that's the case, then instead of having a separate relation for the associations basically for capturing these edges here, we can simply take the related element up from the left, the key from the left and add it to the element for the right. Let me show how that's done. So what we'll do is we'll change this C2 to have K2 and O go to as before, but also to have the key from the left hand side of class, or relation so the key from C-1, and then we don't need A any longer. And we can see why that works, because every element in C2 is related to exactly one element in C-1, so we just add the key for that single element that is related to that relation. Now, what if the left hand side were zero dot dot one instead of one dot dot one? In that case, it would be possible for there to be some elements in C2 that aren't related to any elements in C1, and that would still be okay with this design as long as null values would be allowed in the third attribute of the relation for C-2. And finally, what's the key to this relation? Well, we knew before that the key for C2 was K2, just by definition of it being the primary key for the class, and that's still the case in the expanded C2. K2 will still be a key because we'll only have one instance of each object and the one K one that it would be related to through the association. So what we saw, to summarize, is that when we have zero dot dot one or one dot dot one on one aside and then we have an association and instead of making a relation for the association we can add the key from the side with the zero dot dot one or one dot dot one, to the relation that we generated for the class on the right hand side. So, let's take a look again with students and colleges just to confirm this intuition. So let's again suppose that a student applies to exactly one college and our rule of thumb says that we have a one dot dot one on one side, then we can take the key from that side and add it to the relation for the other side so we would be adding to the student relation, the college name attribute and we'd be getting rid of applied entirely. Let's just think about if that makes sense if every student is applying to exactly one college, it makes perfectly good sense to just add that college name to the student relation and further more if we had students just applying to either zero or one college that would still be an acceptable design provided that no values are allowed for the college name attribute Ok, we've finished with classes and with associations, now let's talk about associations classes. Association classes are pretty straight forward. We're not going to generate new relations more then we're just going to add attributes to relations that we're generating anyway, so let me use our usual example with the students and colleges and we'll add some information to their application. Ok, so, here's our UML diagram. We still have students applying to colleges and then we've added an association class, which as you remember is attached to an association and it gives extra attributes for that association. So, here we're saying when a student applies to a college, we'll have a date of that application and a decision of that application. The first scheme that I've shown down here is what we would get if we use the techniques we've already learned without the association class so the student class generates the student relation, college generates college and then the applied association generates a relation that has a key from each side. And by the way, right now let's say that there's no multiplicities or other information that's going to give us special keys so in that case the applied relation doesn't have any keys other than the two attributes together. And further more, because we have no special multiplicities, we won't be folding that relation into others as we showed in the previous example. So, now what do we do with our association class? Well, as you can probably guess, it's quite simple. All we do is extend our applied relation to include the attributes in the association class. So, we just add date and decision right here and then we're all set. Now there's actually a few things I want to mention at this point. First of all, I discussed in the video on UMO modeling. The fact that UMO assumes that when you have a relationship or an association between two classes, there's at most one instance of that association between any two elements of the class. And that is, can be seen quite well in this relational scheme up because we're assuming we have a most one relationship between any student and any college and then we have associated with that one relationship, the date and the decision. A second thing I wanted to talk about at this point in time is the fact that we for an automatic translation require that we have a key for every regular class. When I mentioned that at the beginning of the video, I wasn't able to motivate it very well but this example explains it pretty well. When we want automatic translation, the translation for an association requires a key from each side, and each side is going to be a regular class in that case. So, we needed to have a student ID or some key for the student class and we needed a key for the college class in order to capture the association. On the other hand, here the application info is not a regular class, this is an association class and we're just defining that as not regular and we don't need to have a key for this one because we're just adding their attributes to the relation for the association itself and the last thing that I wanted to mention is that the rules we saw for determining keys for the relation associated with associations and for folding in the class and relations for associations also work when we have an association class and it just goes naturally. You just bring the attributes with the association class along with the attributes that you have for the association. You might want to give that a try on an example just to see how it works but again it's pretty straight forward. Now let's take a look at how self associations are translated to relations. It actually follows exactly the same rules but it's worth while looking at to see exactly what happens. So here's our first example of association which was sibling relationships between students. So in this case, we generate the relation for the class as usual, student ID, name and GPA and then we generate a relation for our association just following the same that we already had when we had an association between two different classes so we need to grab one key from each side, well it's the same side that we're grabbing from so the one thing we need to do is just change the attribute names so that we have two instances of the key attribute from the class, but they have different names. So you can see what we're doing here. We're just saying that we have sibling relationship. We take the student ID from the two students who are siblings, and that pair becomes a tuple in our sibling relation and because so we have stars on both sides so we're not making any assumptions about multiplicities. We don't have any zero dot dot ones or one dot dot ones. Then we don't have any key for the sibling relation except the two attributes together. Now, the other example we had in self-association, was colleges being branches of other colleges. So here's our college relation that we generate with the usual attributes and then we have our, that's the key there. We have our association which we're calling branch which takes a home campus and a satellite campus and since we've labeled those two sides we can actually use those labels as the attributes for the relation we generate. So we'll call it branch and we'll have home and satellite. So that's saying that the satellite, the campus on the, the college on the right hand side is a satellite of the one on the left hand side. Now these values here will be college names that will be the domain that we're drawing from but we'll calling home and satellite to distinguish their roles in the association. Now what about keys? We said before that every campus, every satellite campus has exactly one home campus so we write it that way. And that a home campus could have, I think we said something like zero to ten satellite campuses but in terms of generating our relations the only really important thing is this one to one and if you remember what our rules said, it said that if we have one one to one on one side then the other side is a key in the association relation. So what that would tell us is that satellite is a key here and that does make sense, so we'll only have each satellites home campus listed once and, but a home might have many satellites so the left hand would not be a key. Ok, we're getting there. We finished the first three and now it's time for sub classes. This is a pretty big one. First of all, it turns out that there's well atleast three and three commonly use translations from a sub class relationship into relations so let me setup a generic sub classing setup and then we'll talk about the three different translation and when we might use which one. So, here's our generic set up, I had to draw it a little funny to fit it in the space. S is our super class and then we have two sub classes S1 and S2 and just as a reminder what this says is that we have object of type S and they have a key attribute call K and another attribute call A and then we have sub classes of those, one called S1 which will have attributes K and A and will also have an attribute B and the difference to the sub class as two that will have attributes K and A inherited from its parent and then an additional attribute C. Now there are three different ways, as I said, atleast three that we could translate this UML diagram to relations. In the first way, we have a relation for each of our classes including the sub-classes and the sub-class relations will contain their own attributes, plus the key of the super class. In the second case, we still have one relation for each of our three classes, but in this case the sub-class relation contains all of the attributes from the super class and then the third case we just use one mega relation for the whole thing that contains all of the super class and sub-class attributes. So let's take a look at what the three different translations would produce for this particular setup. For the first one we'll have our relation for S that's going to have K and A, then we'll have one relation for each of the two sub-classes, S1, that will contain the key for the super class, so it will be K and that will still be a key here for the sub-class and B. So, to find all components of an object of S1 will get it's A competent from the super class, following that key, and then, we'll get the B component. from this relation and then finally say S2 will be similar, it will have the key for the super class and attribute C, again what this will require is some assembly to get all attributes of the sub classes. Now, let's look at the second translation. We still have the super class, and our sub classes now are going to contain all of the attributes that are relevant so they'll have the key attribute from the super class. They'll also have attribute A and attribute B and then S2 will similarly have K and then A and then C. So in this case what we can see is that when we have an object that's in a sub class all of it's information will be in it's sub class relation and the super class will only be used for objects that are in the super class but not in any of the sub classes. And the third translation says, let's just make a mega relation that contains everything. So, it will have the key attribute K and A and B and C. And in that case, we can see that we may have some null values, so if we have an object that's only in sub class as one for example, it would then have a null attribute C. If we add an object that was only in the super class and not in any of the sub classes it would have a null for both B and C. Now as you may have noticed, what has already been on the slide the entire time is that the best translation may depend on the properties of the sub class relationship and if you remember, we had the properties overlapping versus disjoint, and complete versus incomplete. So overlapping meant that we might have objects that were in more than one sub class, disjoint meant that's not possible, complete said that every object that's in the super class and the super class is also in at least one of the sub classes, where incomplete said that there would be some objects, or they could be they're in the super class and not in any sub classes. Sort of, alluded to how some of those properties affect the different translations. But, let's look at it very specifically. Let's suppose, for example, that we had a heavily overlapping sub class relationship. In other words, we have many objects that are in multiple subclasses; if that's the case then we might prefer design three because design three captures all of the different attributes in one place so if many objects have all those attributes we might like to just have them together rather than needing to assemble them from the different pieces. On the other hand, let's say that we have have a disjoint and furthermore, let's say it's complete. So, we have every element being in exactly one subclass and there are no elements elements that are in just the super class. So in that case, we might like to use design two because design two puts each object in individual subclass relation and furthermore, since it's complete, we could actually, in this particular case, get rid of the first relation. We wouldn't need that, and so the best design would be number 2, modified to only have the subclass relations. So, now, let's revisit our gigantic example from the UML modeling video and see how that would be translated to relations. So, just to remind you what's going on in this big diagram we have students, and students have a student ID and a name and then some of our students might be foreign students and they have a country. Some of our students may be domestic students, and that they would have a state and a social security number. We'll assume that every student is either foreign or domestic, and then some of our students are AP students and interestingly, they have no attributes but, those are the ones who have an association called took with AP courses AP courses have a course number and a title and then, when a student takes an AP course, there's an association class that says the year they took it and the grade they got. So, let's translate this to relations; the whole diagram. And we're going to use the first translation from our three schemes. So, that's the one where the sub-class relations contain the super class keys and then all the specialized attributes. So, let's start by generating the relation for the student class and that's straightforward, that's just the student ID and the student name, with the student ID being the key. And, then we're going to generate one relation for each of our three sub-classes. So that would be the foreign students which will take the key then from the student relations, so their student ID and the country that they're from. They'll be the domestic students and there we'll have again, the student ID, inherited from the super class and the state they're from and their social security number. And, then we'll also have the AP students and those have none of their own attributes but, we can get the key, and we do, from the super class. So, this is just going to be a list of the student ID's who are AP students. Now, let's keep going. So, let's take a look at AP students taking courses. So, we'll have, from the AP class, a straight forward translation to the course number and the title of the course and then, finally, we're going to have a relation for the fact that a AP student took classes. We had some multiplicities on that the first time but, neither of them were a zero one or one one so they're not going to be relevant in changing our design. So, took is going to have the key from both sides. Well, we don't see any key in here do we? But, subclasses always inherit the key from their parents, so the key here is implicitly the student ID and, that's what we'll be using here. Student ID, the key from the right hand side, the course number, that they've taken and then we'll have the two attributes from the association class, the year and the grade. So this actually looks like a pretty good design. Let me just make a few comments. The first one has to do with the AP student relation. So if every AP student does have to, does have to take at least one course. So, if we have, for example, a one dot dot something over there, then, we could actually eliminate it's relation because every student ID that appears in AP student, will also appear in the Took relation, so, this one will be redundant in that case. So, again, that could be eliminated based on the multiplicity, though, I wouldn't really expect an automatic translator to necessarily figure that one out. Another possibility I should mention, is that we could have a primary key specified here for domestic students for social security numbers since we would expect that to be unique, and that would translate to a key here, but let me be clear, this would be a separate key. It wouldn't be social security number and student ID together, but these would be two different keys and in SQL you can actually distinguish between those two, but we can't do it the way we've written the relations here. And, finally let me mention again that requirement that we have keys for regular classes if we want to do translation automatically and you might have noticed that for example, we certainly didn't have a key here for a foreign student with country being the only attribute, but subclasses are also not considered regular classes. So subclasses and association classes don't need to have keys and we can still have an automatic translation. Logically subclasses are inheriting their keys from their super-class and that one does need to have a key in order to have automatic translation. So that was a big one. But you'll be glad to know that composition and aggregation is going to be quite quick. So here is our example where we have our college class as usual, but colleges contain departments, and this solid diamond here is the composition operator that says that we have objects from the right hand side class that are components of the left hand side objects. So let's look at the translation to relations. We'll translate the college side as usual with a key attribute - college name and the other attributes. And then although this is an association, we're not going to have a separate class for it, we're going to have the right hand side class have both the values of the object in that class and the association captured together. So we'll have a relation called department, and it will have the department name and the building and then it will also include, the key of the object it belongs to. So in this case, that would be a college name. Now if you think carefully about it, this translation actually makes a lot of sense and is consistent with what we've already done. We said when we introduced the notion of composition that we have Effectively have by definition a one dot dot, one on the left hand side of the composition. So if we treated this as a regular association, used our regular translation and then used our rule that allowed us to get rid of association relations, we'd actually end up at exactly the same design. By the way this is a last example where we have a class that's not one of those regular classes that require a key. We don't have to have a key for the right-hand side of a composition in order to have an automatic translation. Aggregation, by the way, was the case where we have the MC diamond and that's implicitly a zero dot dot one instead of one dot dot one. We had an example with apartment buildings, I'm not going to bother to give that one again, but in that case where we have an aggregation should have in an association. All we need is the same design but the ability for that key we're grabbing from the left hand side to be null, and, then, everything works out fine. So, to conclude, the data modeling portion of UML, is a popular high level language for data base designs. It's graphical and it can be translated automatically to lead to relations, as long as every regular class has a key, and that typically is the case.