Now let's turn to the subject of querying XML. First of all, let me say right up front that querying XML is not nearly as mature as querying relational databases, and there are a couple of reasons for that. First, it's just much, much newer. Second, it's not quite as clean: there's no underlying algebra for XML that's similar to the relational algebra for querying relational databases.
Let's talk about the sequence of development of query languages for XML up until the present time. The first language to be developed was XPath. XPath consists of path expressions and conditions, and that's what we'll be covering in this video once we finish the introductory material. The next thing to be developed was XSLT. XSLT has XPath as a component, but it also has transformations, and that's what the T stands for, and it also has constructs for output formatting. As I've mentioned before, XSLT is often used to translate XML into HTML for rendering. And finally, the latest and most expressive language is XQuery. It also has XPath as a component, plus what I would call a full-featured query language, so it's the most similar to SQL, as we'll be seeing. The order that we're going to cover them in is first XPath, then XQuery, and finally XSLT.
There are a couple of other languages, XLink and XPointer. Those languages are for specifying, as the names suggest, links and pointers. They also use the XPath language as a component. We won't be covering those in this video. We'll be covering XPath, XQuery, and XSLT in moderate detail. We're not going to cover every single construct of the languages, but we will cover enough to write a wide variety of queries using them.
To understand how XPath works, it's good to think of the XML as a tree. So I'd like you to bear with me for a moment while I write a little bit of a tree that would be the tree encoding of the bookstore data that we've been working with.
So we would write as our root the bookstore element, and then we'll have sub-elements containing the books in our bookstore. We might have another book. We might have over here a magazine. Within the books we had, as you might remember, some attributes and some sub-elements. We had, for example, the ISBN number, which I'll write as an attribute here. We had a price, and we also had of course the title of the book, and we had the authors over here. I'm obviously not going to be filling in the sub-element structure everywhere; we're just going to look at one book as an example. At the ISBN number we are now at a leaf of the tree, so we could have a string value here to denote the leaf, maybe 100 for the price, and for the title "A First Course in Database Systems". Then our authors had further sub-elements. We had maybe two author sub-elements here, and, I'm abbreviating a bit, below each one a first name and a last name, so that might have been Jeff Ullman, and so on. I think you get the idea of how we render our XML as a tree.
The reason we're doing that is so that we can think of the expressions we have in XPath as navigations down the tree. Specifically, what XPath consists of is path expressions that describe navigation down, and sometimes across and up, a tree. And then we also have conditions that we evaluate to pick out the components of the XML that we're interested in.
So let me just go through a few of the basic constructs that we have in XPath. Let me just erase a few of these things here that got in my way. Okay. I'm going to use this little box, put the construct in it, and then explain how it works. The first construct is simply a slash, and the slash is for designating the root element. So we'll put the slash at the beginning of an XPath query to say we want to start at the root. A slash is also used as a separator.
So we're going to write paths that navigate down the tree, and we're going to put a '/' between the elements of the path. All of this will become much clearer in the demo, so I'll try to go fairly quickly now so we can move to the demo itself.
The next construct is simply writing the name of an element. I put 'x' here, but we might for example write 'book'. When we write 'book' in an XPath expression, we're saying that we want to navigate, say we're up here at the bookstore, down to the book sub-element as part of our path expression. We can also write the special element symbol '*', and '*' matches anything. So if we write '/*' then we'll match any sub-element of our current element. When we execute XPath, there's a notion, as we're writing the path expressions, of being at a particular place. So we might have navigated from bookstore to book, and then we would navigate, say, further down to title, or if we put a '*' then we navigate to any sub-element.
If we want to match an attribute, we write '@' and then the attribute name. So for example, if we're at the book and we want to match down to the ISBN number, we'll write '@ISBN' in our query, our path expression.
We saw the single slash for navigating one step down. There's also a double slash construct, '//'. The double slash matches any descendant of our current element. So, for example, if we're here at the book and we write double slash, we'll match the title, the authors, each author, the first name and the last name, every descendant, and actually we'll also match ourselves. So this symbol here means any descendant, including the element where we currently are.
So now I've given a flavor of how we write path expressions. Again, we'll see lots of them in our demo. What about conditions? If we want to evaluate a condition at the current point in the path, we put it in square brackets and we write the condition here.
So, for example, if we wanted our price to be less than 50, that would be a condition we could put in square brackets if we were at this point in the navigation (actually, the price had better be an attribute). Now we shouldn't confuse putting a condition in square brackets with putting a number in square brackets. If we put a number N in square brackets, for example if I write three, that is not a condition; rather, it matches the Nth sub-element of the current element. For example, if we were here at authors and we put author[2], then we would match the second author sub-element of the authors.
There are many, many other constructs. This just gives the basic flavor of the constructs for creating path expressions and evaluating conditions.
XPath also has lots of built-in functions. I'll just mention two of them as somewhat random examples. There's a function that you can use in XPath called contains. If you write contains and then two expressions, each of which has a string value, it returns true if the first string contains the second string; this is actually a predicate. As a second example, there's a function called name. If we write name in a path, it returns the tag of the current element in the path. We'll see the use of functions in our demo.
The last concept that I want to talk about is what's known as navigation axes, and there are 13 axes in XPath. An axis is sort of a keyword that allows us to navigate around the XML tree. So, for example, one axis is called parent. You might have noticed that when we talked about the basic constructs, most of them were about going down a tree. If you want to navigate up the tree, then you can use the parent axis, which tells you to go up to the parent. There's an axis called following-sibling. As for the colon colon written after the axis name, you'll see how that works when we get to the demo. The following-sibling axis matches all of the following siblings of the current element.
So if we have a tree and we're sitting at this point in the tree, then the following-sibling axis will match all of the siblings that come after the current one in the tree. There's an axis called descendant. Descendant, as you might guess, matches all the descendants of the current element. Now it's not quite the same as slash slash, because, as a reminder, slash slash also matches the current element as well as the descendants. Actually, as it happens, there is a navigation axis called descendant-or-self that's equivalent to slash slash. And by the way, there's also one called self that will match the current element. That may not seem useful, but we'll see uses for it, for example in conjunction with the name function that we talked about earlier, which would give us the tag of the current element.
Just a few details to wrap up. XPath queries technically operate on and return a sequence of elements. That's their formal semantics. There is a specification for how XML documents and XML streams map to sequences of elements, and you'll see that it's quite natural. When we run an XPath query, sometimes the result can be expressed as XML, but not always. But as we'll see, again, that's fairly natural as well.
So this video has given an introduction to XPath. We've shown how to think of XML data as a tree, and then XPath as expressions that navigate around the tree and also evaluate conditions. We've seen a few of the constructs for path expressions and conditions. We've seen a couple of built-in functions, and I've introduced the concept of navigation axes. But the real way to learn and understand XPath is to run some queries. So I urge you to watch the next video, which is a demo of XPath queries over our bookstore data, and then try some queries yourself.
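If you'd like to try the constructs from this video before the demo, here is a minimal sketch using Python's standard xml.etree.ElementTree module. A few things here are assumptions and not the course's actual data or tooling: the tag and attribute names, the placeholder ISBN strings, and the second book and the magazine title are all made up for illustration. ElementTree also supports only a limited subset of XPath, so paths are written relative to the root element, and the price condition, contains, and name are imitated in plain Python.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature of the bookstore data. Tag names, attribute names,
# the ISBN strings, the second book, and the magazine are assumptions.
BOOKSTORE_XML = """
<Bookstore>
  <Book ISBN="ISBN-EXAMPLE-1" Price="100">
    <Title>A First Course in Database Systems</Title>
    <Authors>
      <Author><First_Name>Jeff</First_Name><Last_Name>Ullman</Last_Name></Author>
      <Author><First_Name>Jennifer</First_Name><Last_Name>Widom</Last_Name></Author>
    </Authors>
  </Book>
  <Book ISBN="ISBN-EXAMPLE-2" Price="40">
    <Title>A Hypothetical Paperback</Title>
    <Authors>
      <Author><First_Name>Ann</First_Name><Last_Name>Example</Last_Name></Author>
    </Authors>
  </Book>
  <Magazine>
    <Title>Hypothetical Weekly</Title>
  </Magazine>
</Bookstore>
"""

root = ET.fromstring(BOOKSTORE_XML)  # root is the Bookstore element

# Element names separated by '/': plays the role of /Bookstore/Book/Title.
book_titles = [t.text for t in root.findall("Book/Title")]

# '*' matches any element: titles under every child of the bookstore.
all_child_titles = [t.text for t in root.findall("*/Title")]

# '//' reaches descendants at any depth: every Last_Name in the document.
last_names = [n.text for n in root.findall(".//Last_Name")]

# '@' names an attribute: Book[@ISBN] keeps books that have an ISBN attribute.
isbns = [b.get("ISBN") for b in root.findall("Book[@ISBN]")]

# A number in brackets is positional: the second Author under each Authors.
second_authors = [a.findtext("Last_Name")
                  for a in root.findall("Book/Authors/Author[2]")]

# ElementTree has no '<' predicate, so the condition on price is applied in Python.
cheap_titles = [b.findtext("Title") for b in root.findall("Book")
                if float(b.get("Price")) < 50]

# Stand-in for contains(): a plain substring test on the title text.
database_titles = [t for t in book_titles if "Database" in t]

# '..' steps up to the parent; .tag plays the role of name().
title_parents = [p.tag for p in root.findall(".//Title/..")]

print(book_titles, second_authors, cheap_titles, title_parents)
```

A full XPath processor would let you write the condition, the functions, and the axes directly inside the path expression; the Python filtering above only approximates them.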