This video introduces JSON. Let's start by talking about its pronunciation. Some people call it "Jason," and some call it "J-sahn." I did a little bit of investigation and discovered that the original developer of JSON calls it JSON, so I'll do that too.

Like XML, JSON can be thought of as a data model: an alternative to the relational data model that is more appropriate for semi-structured data. In this video I'll introduce the basics of JSON, and I'll compare JSON to the relational data model and to XML. But it's not crucial to have watched those videos to get something out of this one. Now, among the three models - the relational model, XML, and JSON - JSON is by a large margin the newest, and it does show: there aren't as many tools for JSON as we have for XML, and certainly not as many as we have for relational.

JSON stands for JavaScript Object Notation, although it's evolved to become pretty much independent of JavaScript at this point. The little snippet of JSON in the corner right now is mostly for decoration. We'll talk about the details in just a minute.

Now, JSON was designed originally for what's called serializing data objects: that is, taking the objects that are in a program and sort of writing them down in a serial fashion, typically in files. One thing about JSON is that it is human-readable, similar to the way XML is human-readable, and it is often used for data interchange: for writing out, say, the objects in one program so that they can be exchanged with another program and read into that one. Also, more generally, because JSON is not as rigid as the relational model, it's useful for representing and storing data that doesn't have rigid structure, which we've been calling semi-structured data. As I mentioned, JSON is no longer closely tied to JavaScript. Many different programming languages have parsers for reading JSON data into a program and for writing out JSON data as well.
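To make the idea of serializing objects a bit more concrete, here is a minimal sketch in Python, assuming only the standard-library json module. The book object, its labels, and the helper names serialize and deserialize are made up for illustration; they are not taken from the example on the screen.

```python
import json

# Hypothetical in-memory object; labels and values are illustrative only.
book = {
    "title": "Example Title",
    "price": 85,
    "authors": ["First Author", "Second Author"],
}

def serialize(obj):
    """Write a Python object down as human-readable JSON text."""
    return json.dumps(obj, indent=2)

def deserialize(text):
    """Read JSON text back into Python objects."""
    return json.loads(text)

text = serialize(book)        # a plain string, ready to write to a file or send
restored = deserialize(text)  # what a receiving program would reconstruct
```

The string in text is what would be exchanged between programs, and the receiving side gets back an equivalent object without either side needing to know about JavaScript.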
Now, let's talk about the basic constructs in JSON, and as we will see, these constructs are recursively defined. We'll use the example JSON data shown on the screen, and that data is also available in a file for download from the website.

The basic atomic values in JSON are fairly typical. We have numbers, and we have strings. We also have Boolean values, that's true and false, although there are none of those in this example, and null values.

There are two types of composite values in JSON: objects and arrays. Objects are enclosed in curly braces, and they consist of sets of label-value pairs. For example, we have an object here that has a first name and a last name. We have a bigger object here that has ISBN, price, edition, and so on. When we do our JSON demo, we'll go into these constructs in more detail. At this point, we're just introducing them. The second type of composite value in JSON is arrays. Arrays are enclosed in square brackets, with commas between the array elements. Actually, we have commas in the objects as well. And arrays are lists of values. For example, we can see here that authors is a list of author objects.

Now, I mentioned that the constructs are recursive. Specifically, the values inside arrays can be anything: they can be other arrays, or objects, or base values. And the values making up the label-value pairs in objects can also be any composite value or a base value. And I did want to mention, by the way, that sometimes the label in a label-value pair is called a "property".

So, just like XML, JSON has some basic structural requirements in its format, but it doesn't have a lot of requirements in terms of uniformity. We have a couple of examples of heterogeneity in here. For example, this book has an edition and the other one doesn't; this book has a remark and the other one doesn't. But we'll see many more examples of heterogeneity when we do the demo and look into JSON data in more detail.
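The constructs just described can be sketched in a few lines. This is an illustrative sketch in Python using the standard-library json module. The data below is invented and only loosely follows the labels mentioned above (ISBN, price, edition, remark, authors); in_print and cover_image are hypothetical labels added to show a Boolean and a null, and kind is a hypothetical helper.

```python
import json

# Invented sample data: two book objects with deliberately different labels.
doc = """
{
  "books": [
    {
      "ISBN": "ISBN-0-00-000000-1",
      "price": 85,
      "edition": 3,
      "authors": [
        {"first_name": "Ann", "last_name": "Author"},
        {"first_name": "Bob", "last_name": "Writer"}
      ]
    },
    {
      "ISBN": "ISBN-0-00-000000-2",
      "price": 100,
      "remark": "This one has a remark but no edition",
      "in_print": true,
      "cover_image": null,
      "authors": [{"first_name": "Cy", "last_name": "Scribe"}]
    }
  ]
}
"""

data = json.loads(doc)

def kind(value):
    """Name the JSON construct that a parsed Python value came from."""
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        return "array"
    if isinstance(value, bool):   # test bool before number: bool is a subclass of int
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if value is None:
        return "null"
    raise TypeError(f"not a JSON value: {value!r}")

first, second = data["books"]
```

Here kind(data) is an object, kind(data["books"]) is an array, and the values nested inside are again objects, arrays, or base values. The two book objects sit in the same array even though only the first has an edition and only the second has a remark.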
Now let's compare JSON and the relational model. We will see that many of the comparisons are fairly similar to when we compared XML to the relational model.

Let's start with the basic structures underlying the data model. The relational model is based on tables. We set up the structure of a table, a set of columns, and then the data becomes rows in those tables. JSON is based instead on sets of label-value pairs, and on arrays, and as we saw, they can be nested.

One of the big differences between the two models, of course, is the schema. The relational model has a schema fixed in advance: you set it up before you have any data loaded, and then all data needs to conform to that schema. JSON, on the other hand, typically does not require a schema in advance. In fact, the schema and the data are kind of mixed together, just like in XML, and this is often referred to as self-describing data, where the schema elements are within the data itself. And this is, of course, typically more flexible than the relational model. But there are advantages to having schemas as well, definitely.

As far as queries go, one of the nice features of the relational model is that there are simple, expressive languages for querying the database. In terms of JSON, although a few new things have been proposed, at this point there's nothing widely used for querying JSON data. Typically JSON data is read into a program and manipulated programmatically. Now let me interject that this video is being made in February 2012. So it is possible that some JSON query languages will emerge and become widely used; there is just nothing widely used at this point. There are some proposals. There's a JSONPath language, JSON Query, and a language called Jaql. It may be that, just like XML, the query languages are going to follow the prevalent use of the data format or the data model. But that has not happened yet, as of February 2012.

How about ordering? One aspect of the relational model is that it's an unordered model.
It's based on sets, and if we want to see relational data in sorted order, then we put that inside a query. In JSON, we have arrays as one of the basic data structures, and arrays are ordered. Of course, there's also the fact that, like XML, JSON data is usually written in files, and files themselves are naturally ordered. But the ordering of the data in files usually isn't relevant. Sometimes it is, but typically not.

Finally, in terms of implementation: for the relational model, there are systems that implement the relational model natively. They're generally quite efficient and powerful systems. For JSON, we haven't yet seen standalone database systems that use JSON as their data model. Instead, JSON is more typically coupled with programming languages.

One thing I should add, however: JSON is used in NoSQL systems. We do have videos about NoSQL systems; you may or may not have watched those yet. There are a couple of different ways that JSON is used in those systems. One of them is just as a format for reading data into the systems and writing data out from the systems. The other way is that some of the NoSQL systems are what are called "Document Management Systems," where the documents themselves may contain JSON data, and then the systems will have special features for manipulating the JSON in the documents that are stored by the system.

Now let's compare JSON and XML. This is actually a hotly debated comparison right now. There is significant overlap in the usage of JSON and XML. Both of them are very good for putting semi-structured data into a file format and using it for data interchange. And because there's so much overlap in what they're used for, it's not surprising that there's significant debate. I'm not going to take sides. I'm just going to try to give you a comparison.

Let's start by looking at the verbosity of expressing data in the two languages. So it is the case that XML is, in general, a little more verbose than JSON.
So the same data expressed in the two formats will tend to have more characters in XML than in JSON, and you can see that in our examples, because our big JSON example was actually pretty much the same data that we used when we showed XML. The reason for XML being a bit more verbose largely has to do with closing tags, and some other features. But I'll let you judge for yourself whether the somewhat longer expression of XML is a problem.

Second is complexity, and here, too, most people would say that XML is a bit more complex than JSON. I'm not sure I entirely agree with that comparison. If you look at the subset of XML that people really use, you've got attributes, sub-elements, and text, and that's more or less it. If you look at JSON, you've got your basic values, and you've got your objects and your arrays. I think the issue is that XML has a lot of extra stuff that goes along with it. So if you read the entire XML specification, it will take you a long time. With JSON, you can grasp the entire specification a little bit more quickly.

Now let's turn to validity. By validity, I mean the ability to specify constraints or restrictions or a schema on the structure of data in one of these models, and have it enforced by tools or by a system. Specifically, in XML we have the notion of document type definitions, or DTDs. We also have XML Schema, which gives us XSDs, XML Schema Definitions. These are schema-like things that we can specify, and we can have our data checked to make sure it conforms to the schema, and these are, I would say, fairly widely used at this point for XML. For JSON, there's something called JSON Schema. Similar to XML Schema, it's a way to specify the structure, and then we can check that JSON conforms to that, and we will see some of that in our demo. The current status, as of February 2012, is that this is not widely used at this point. But again, it could really just be evolution.
If we look back at XML as it was originally proposed, we probably didn't see a whole lot of use of DTDs, and certainly not of XSDs, until later on. So we'll just have to see whether JSON evolves in a similar way.

Now, the programming interface is where JSON really shines. The programming interface for XML can be fairly clunky. The XML model, the attributes and sub-elements and so on, doesn't typically match the model of data inside a programming language. In fact, there's something called the impedance mismatch. The impedance mismatch has been discussed in database systems for decades, actually, because one of the original criticisms of relational database systems is that the data structures used in the database, specifically tables, didn't match directly with the data structures in programming languages. So there had to be some manipulation at the interface between programming languages and the database system, and that's the mismatch. That same impedance mismatch is pretty much present in XML, whereas with JSON there is really a more direct mapping between the structures of many programming languages and the structures of JSON.

Finally, let's talk about querying. I've already touched on this a bit, but JSON does not have any mature, widely used query languages at this point. For XML we do have XPath, we have XQuery, and we have XSLT. Maybe not all of them are widely used, but there's no question that XPath at least and XSLT are used quite a bit. As far as JSON goes, there is a proposal called JSONPath. It actually looks quite a lot like XPath; maybe it'll catch on. There's something called JSON Query. It doesn't look so much like XML Query, I mean, XQuery. And finally, there has been a proposal called the Jaql language. But again, as of February 2012, all of these are still very early, so we just don't know what's going to catch on.

So now let's talk about the validity of JSON data. JSON data that's syntactically valid simply needs to adhere to the basic structural requirements.
As a reminder, that would be that we have sets of label-value pairs, we have arrays of values, and our values are from predefined types. And again, these values are defined recursively. So we start with a JSON file and we send it to a parser. The parser may determine that the file has syntactic errors, or, if the file is syntactically correct, then it can be parsed into objects in a programming language.

Now, if we're interested in semantically valid JSON, that is, JSON that conforms to some constraints or a schema, then in addition to checking the basic structural requirements, we check whether the JSON conforms to the specified schema. If we use a language like JSON Schema, for example, we put a specification in as a separate file, and in fact JSON Schema is expressed in JSON itself, as we'll see in our demo. We send it to a validator, and that validator might find that there are some syntactic errors, or it may find that there are some semantic errors, so the data could be correct syntactically but not conform to the schema. If it's both syntactically and semantically correct, then it can move on to the parser, where it will be parsed, again, into objects in a programming language.

So to summarize, JSON stands for JavaScript Object Notation. It's a standard for taking data objects and serializing them into a format that's human-readable. It's also very useful for exchanging data between programs, and for representing and storing semi-structured data in a flexible fashion.

In the next video we'll go live with a demonstration of JSON. We'll use a couple of JSON editors, and we'll take a look at the structure of JSON data when it's syntactically correct. We'll demonstrate how it's very flexible when our data might be irregular, and we'll also demonstrate schema checking using an example of JSON Schema.
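The two levels of validity described above, syntactic first and then semantic, can be sketched roughly as follows in Python. This is only an illustration under stated assumptions: BOOK_SCHEMA is a toy, hand-rolled stand-in for a schema, not the JSON Schema language, and the validate function and the sample strings are hypothetical names and data invented for this sketch. Only the standard-library json module is used.

```python
import json

# Toy "schema": required labels and the Python type each value must have.
# This is a hypothetical stand-in, NOT the JSON Schema language.
BOOK_SCHEMA = {"ISBN": str, "price": (int, float), "authors": list}

def validate(text, schema):
    """Return ("syntax error", msg), ("schema error", msg), or ("ok", parsed)."""
    # Step 1: syntactic check. The parser rejects text that breaks
    # JSON's basic structural requirements.
    try:
        obj = json.loads(text)
    except json.JSONDecodeError as err:
        return ("syntax error", str(err))

    # Step 2: semantic check against the toy schema.
    if not isinstance(obj, dict):
        return ("schema error", "top-level value must be an object")
    for label, expected in schema.items():
        if label not in obj:
            return ("schema error", f"missing label {label!r}")
        value = obj[label]
        # Reject Booleans explicitly: in Python, True would otherwise pass
        # as a number, and this toy schema has no Boolean-valued labels.
        if isinstance(value, bool) or not isinstance(value, expected):
            return ("schema error", f"wrong type for label {label!r}")
    return ("ok", obj)

good = '{"ISBN": "ISBN-0-00-000000-1", "price": 85, "authors": []}'
bad_syntax = '{"ISBN": "ISBN-0-00-000000-1", "price": 85,'
bad_schema = '{"ISBN": "ISBN-0-00-000000-1", "price": "eighty-five", "authors": []}'
```

With these inputs, validate(good, BOOK_SCHEMA) reports "ok" along with the parsed object, bad_syntax is rejected by the parser before the schema is ever consulted, and bad_schema parses fine but fails the schema check because its price is a string. A real validator for JSON Schema would take the schema as a separate JSON document rather than a Python dictionary.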