This video introduces JSON.
Let's start by talking about its pronunciation.
Some people call it Jason, and some call it J-sahn.
I did a little bit of
investigation and discovered that the
original developer of JSON calls
it Jason, so I'll do that too.
Like XML, JSON can be thought of as a data model,
an alternative to the relational data
model that is more
appropriate for semi-structured data.
In this video I'll introduce the
basics of JSON and I'll
actually compare JSON to the
relational data model and I'll compare it to XML.
But it's not crucial to have
watched those videos to get something out of this one.
Now among the three models
- the relational model, XML, and
JSON - JSON is by
a large margin the newest,
and it does show: there aren't
as many tools for JSON
as we have for XML, and
certainly not as many as we have for relational.
JSON stands for JavaScript Object Notation,
although it's evolved to become pretty
much independent of JavaScript at this point.
The little snippet of JSON in the corner right now is mostly for decoration.
We'll talk about the details in just a minute.
Now JSON was designed
originally for what's called
serializing data objects.
That is taking the objects that
are in a program and sort
of writing them down in a
serial fashion, typically in files.
One thing about JSON
is that it is human readable,
similar to the way XML
is human readable, and it is
often used for data interchange.
So, for writing out, say,
the objects in a program so that
they can be exchanged with another
program and read into that one.
Also, just more generally, because
JSON is not as rigid
as the relational model, it's generally
useful for representing and for
storing data that doesn't
have rigid structure, which we've been calling semi-structured data.
As I mentioned, JSON is
no longer closely tied to JavaScript.
Many different programming languages do
have parsers for reading JSON
data into the program and
for writing out JSON data as well.
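As a rough sketch of that round trip, here is what writing an object out and reading it back might look like, assuming Python and its standard json module. The book object and its labels are made up for illustration; they are not the data shown on the screen.

```python
import json

# Hypothetical in-program object; the labels and values are made up.
book = {
    "title": "Book One",
    "price": 85,
    "authors": [
        {"first_name": "Ann", "last_name": "Lee"},
        {"first_name": "Bo", "last_name": "Kim"},
    ],
}

# Serialize: write the object down as JSON text (it could equally go to a file).
text = json.dumps(book, indent=2)

# Deserialize: another program would read the text back into its own objects.
restored = json.loads(text)

print(text)
print(restored == book)
```

The text in the middle is what would be exchanged between the two programs.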
Now, let's talk about the basic
constructs in JSON, and as
we will see, these constructs are recursively defined.
We'll use the example JSON
data shown on the screen
and that data is also available
in a file for download from the website.
The basic atomic values in JSON are fairly typical.
We have numbers, we have strings.
We also have Boolean values,
although there are none of those
in this example, that's true and false, and null values.
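To make the base values concrete, here is a small sketch, again assuming Python's standard json module, of how each kind of atomic value comes out after parsing. The values themselves are made up, and the type names are specific to Python; other languages will map them differently.

```python
import json

# One array holding each kind of base value: numbers, a string,
# the two Boolean values, and null.
values = json.loads('[85, 2.5, "Book One", true, false, null]')

# Names of the Python types the standard parser produced.
types = [type(v).__name__ for v in values]

for value, name in zip(values, types):
    print(repr(value), name)
```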
There are two types of composite
values in JSON: objects and arrays.
Objects are enclosed in curly
braces and they consist
of sets of label-value pairs.
For example, we have an
object here that has a first name and a last name.
We have a more -
bigger, let's say, object here
that has ISBN, price, edition, and so on.
When we do our JSON demo,
we'll go into these constructs in more detail.
At this point, we're just introducing them.
The second type of composite
value in JSON is arrays,
and arrays are enclosed in square
brackets with commas between the array elements.
Actually, we have commas in the objects
as well. Arrays are lists of values.
For example, we can see
here that authors is a
list of author objects.
Now I mentioned that the constructs
are recursive. Specifically, the values
inside arrays can be anything:
they can be other arrays or objects,
or base values. And the values
making up the label-value
pairs in objects can also
be any composite value or a base value.
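One way to see the recursion is to write a function that follows it. This is only a sketch, assuming Python's standard json module, where objects parse to dicts and arrays to lists; the function name count_values and the sample document are made up.

```python
import json

def count_values(value):
    """Count every value in a parsed JSON document, recursing into composites."""
    if isinstance(value, dict):   # object: a set of label-value pairs
        return 1 + sum(count_values(v) for v in value.values())
    if isinstance(value, list):   # array: a list of values
        return 1 + sum(count_values(v) for v in value)
    return 1                      # base value: number, string, true/false, null

doc = json.loads(
    '{"authors": [{"first_name": "Ann", "last_name": "Lee"}],'
    ' "price": 85, "remark": null}'
)
print(count_values(doc))
```

The function has exactly one case per construct: objects, arrays, and base values.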
And I did want to
mention, by the way, that sometimes
this word "label" here, for
label-value pairs, is called a "property".
So, just like XML, JSON
has some basic structural requirements in
its format but it doesn't
have a lot of requirements in terms of uniformity.
We have a couple of examples
of heterogeneity in here. For
example, this book has an
edition and the other one
doesn't; this book has a remark and the other one doesn't.
But we'll see many more examples
of heterogeneity when we do
the demo and look into JSON data in more detail.
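Here is a minimal sketch of what that heterogeneity means for a program reading the data, assuming Python's standard json module. The two books are made up; one has an edition and the other has a remark, as in the description above.

```python
import json

text = """
[
  {"title": "Book One", "price": 85, "edition": 3},
  {"title": "Book Two", "price": 75, "remark": "Buy this one"}
]
"""
books = json.loads(text)

for book in books:
    # .get returns None when a label is absent, instead of raising KeyError.
    print(book["title"], book.get("edition"), book.get("remark"))

# The set of labels differs from one object to the next.
labels = [sorted(book) for book in books]
print(labels)
```

Both objects are accepted by the parser; it is up to the program to cope with the labels that are missing.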
Now let's compare JSON and the relational model.
We will see that many of
the comparisons are fairly similar
to when we compared XML to the relational model.
Let's start with the basic structures underlying the data model.
So, the relational model is based on tables.
We set up the structure of a
table, a set of columns, and
then the data becomes rows in those tables.
JSON is based instead on
sets, the sets of label-value
pairs, and arrays, and as we saw, they can be nested.
One of the big differences between
the two models, of course, is the schema.
So the relational model has a
schema fixed in advance:
you set it up before you
have any data loaded, and then
all data needs to conform to that schema.
JSON, on the other
hand, typically does not require a schema in advance.
In fact, the schema and the
data are kind of mixed together,
just like in XML, and
this is often referred to as
self-describing data, where the
schema elements are within the data itself.
And this is of course typically
more flexible than the relational model.
But there are advantages to having schemas
as well, definitely.
As far as queries go, one
of the nice features of the
relational model is that there
are simple, expressive languages for querying the database.
In terms of JSON, although a
few new things have been proposed,
at this point there's nothing widely
used for querying JSON data.
Typically JSON data is
read into a program and it's manipulated programmatically.
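As a sketch of what manipulating the data programmatically can look like in place of a query language, here are two hand-written "queries", assuming Python and its standard json module. The data and labels are made up for illustration.

```python
import json

books = json.loads("""[
  {"title": "Book One", "price": 85, "authors": [{"last_name": "Lee"}]},
  {"title": "Book Two", "price": 40,
   "authors": [{"last_name": "Kim"}, {"last_name": "Lee"}]},
  {"title": "Book Three", "price": 60, "authors": [{"last_name": "Park"}]}
]""")

# Roughly "select title from books where price > 50", written by hand.
expensive = [b["title"] for b in books if b["price"] > 50]

# A condition on nested data: titles with an author whose last name is Lee.
by_lee = [b["title"] for b in books
          if any(a["last_name"] == "Lee" for a in b["authors"])]

print(expensive)
print(by_lee)
```

The filtering logic lives in the host language rather than in a declarative query.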
Now let me interject that this
video is being made in February 2012.
So it is possible
that some JSON query languages
will emerge and become
widely used; there is just
nothing widely used at this point.
There are some proposals.
There's a JSON Path language,
JSON Query, and a language called Jaql.
It may be that, just like
XML, the query languages are
gonna follow the prevalent use
of the data format or the data model.
But that has not happened yet, as of February 2012.
How about ordering?
One aspect of the relational model is that it's an unordered model.
It's based on sets and
if we want to see relational
data in sorted order then we put that inside a query.
In JSON, we have arrays as
one of the basic data structures, and arrays are ordered.
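A quick sketch of that ordering, assuming Python's standard json module. Note that the second half reflects how Python compares dictionaries, which is an assumption about one language's parser rather than something stated here about JSON itself.

```python
import json

# Arrays are ordered: the same elements in a different order
# are a different value.
arrays_equal = json.loads("[1, 2, 3]") == json.loads("[3, 2, 1]")

# In Python, parsed objects become dicts, and dict comparison
# ignores the order in which the labels were written.
a = json.loads('{"first_name": "Ann", "last_name": "Lee"}')
b = json.loads('{"last_name": "Lee", "first_name": "Ann"}')
objects_equal = a == b

print(arrays_equal, objects_equal)
```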
Of course, there's also the fact that, like
XML, JSON data is
usually written in files,
and files themselves are naturally ordered,
but the ordering of the data
in files usually isn't relevant.
Sometimes it is, but
typically not. Finally, in
terms of implementation: for the
relational model, there are
systems that implement the relational model natively.
They're generally quite
efficient and powerful systems.
For JSON, we haven't yet
seen standalone database systems
that use JSON as their data
model. Instead, JSON is
more typically coupled with programming languages.
One thing I should add, however:
JSON is used in NoSQL systems.
We do have videos about NoSQL
systems; you may or may not have watched those yet.
There's a couple of different ways that JSON is used in those systems.
One of them is just as
a format for reading data
into the systems and writing data out from the systems.
The other way that it is
used is that some of the
NoSQL systems are what are
called "document management systems", where
the documents themselves may contain
JSON data, and then the systems
will have special features for manipulating
the JSON in the documents that are stored by the system.
Now let's compare JSON and XML.
This is actually a hotly debated comparison right now.
There is significant overlap in
the usage of JSON and XML.
Both of them are very
good for putting semi-structured data
into a file format
and using it for data interchange.
And so because there's so
much overlap in what they're used
for, it's not surprising that there's significant debate.
I'm not gonna take sides.
I'm just going to try to give you a comparison.
Let's start by looking at the
verbosity of expressing data in the two languages.
So it is the case
that XML is, in general,
a little more verbose than JSON.
So the same data expressed in
the two formats will tend to
have more characters in XML than in JSON,
and you can see that
in our examples, because our big
JSON example was actually pretty
much the same data that we used when we showed XML.
And the reason for
XML being a bit more
verbose largely has to
do actually with closing tags,
and some other features.
But I'll let you judge
for yourself whether the somewhat
longer expression of XML is a problem.
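To help with that judgment, here is one small hand-written comparison, assuming Python's standard json and xml.etree.ElementTree modules. The record is made up, and a single example like this is only an illustration of the closing-tag overhead, not a measurement.

```python
import json
import xml.etree.ElementTree as ET

# The same made-up record written compactly in each format.
xml_text = ("<book><title>Book One</title><price>85</price>"
            "<edition>3</edition></book>")
json_text = '{"title":"Book One","price":85,"edition":3}'

# Both strings parse, and both carry the same title.
root = ET.fromstring(xml_text)
data = json.loads(json_text)
same_title = root.findtext("title") == data["title"]

# Character counts; every XML label appears twice because of closing tags.
xml_length = len(xml_text)
json_length = len(json_text)
print(xml_length, json_length, same_title)
```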
Second is complexity, and here,
too, most people would say
that XML is a bit more complex than JSON.
I'm not sure I entirely agree with that comparison.
If you look at the subset
of XML that people really
use, you've got attributes,
sub elements and text, and
that's more or less it.
If you look at JSON, you've got
your basic values and you've got your objects and your arrays.
I think the issue is that
XML has a lot of
extra stuff that goes along with it.
So if you read the entire XML specification,
it will take you a long time.
With JSON, you can grasp the
entire specification a little bit more quickly.
Now let's turn to validity.
And by validity, I mean the
ability to specify constraints or
restrictions or a schema on
the structure of data
in one of these models, and
have it enforced by tools or by a system.
Specifically, in XML we
have the notion of document type
definitions, or DTDs; we also
have XML Schema, which
gives us XSDs, XML Schema Definitions.
And these are schema like
things that we can specify, and
we can have our data checked to
make sure it conforms to the
schema, and these are, I would say,
fairly widely used at this point for XML.
For JSON, there's something called JSON Schema.
And, you know, similar to
XML Schema, it's a way
to specify the structure, and then
we can check that JSON conforms
to that, and we will see some of that in our demo.
The current status, as of February
2012, is that this is
not widely used at this point.
But again, it could really just be evolution.
If we look back
at XML, as it was originally
proposed, probably we didn't
see a whole lot of use
of DTDs, and in fact not
of XSDs, for sure, until later on.
So we'll just have to see whether JSON evolves in a similar way.
Now the programming interface is where JSON really shines.
The programming interface for XML can be fairly clunky.
The XML model, the attributes
and sub-elements and so on,
don't typically match the model
of data inside a programming language.
In fact, that's something called the impedance mismatch.
The impedance mismatch
has been discussed in database
systems actually for decades,
because one of the original
criticisms of relational database
systems is that the data
structures used in the database,
specifically tables, didn't match
directly with the data structures in programming languages.
So there had to be some manipulation
at the interface between programming languages and the database system, and that's the mismatch.
So that same impedance mismatch
is pretty much present
in XML, whereas JSON is
really a more direct mapping
between many programming languages and the structures of JSON.
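Here is a small sketch of that difference, assuming Python with its standard json and xml.etree.ElementTree modules. ElementTree is just one of several XML interfaces, and the record is made up.

```python
import json
import xml.etree.ElementTree as ET

xml_text = '<book edition="3"><title>Book One</title><price>85</price></book>'
json_text = '{"edition": 3, "title": "Book One", "price": 85}'

# XML: navigate elements and attributes, then convert the text by hand.
root = ET.fromstring(xml_text)
xml_price = int(root.find("price").text)   # sub-element text is a string
xml_edition = int(root.get("edition"))     # attribute values are strings too

# JSON: the parsed result is already a dictionary with native numbers.
book = json.loads(json_text)
json_price = book["price"]
json_edition = book["edition"]

print(xml_price == json_price, xml_edition == json_edition)
```

On the JSON side, the parsed value is already the kind of data structure the language works with.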
Finally, let's talk about querying.
I've already touched on this
a bit, but JSON does not
have any mature, widely
used query languages at this point.
For XML we do have
XPath, we have XQuery,
we have XSLT.
Maybe not all of
them are widely used, but there's
no question that XPath at least and
XSLT are used quite a bit.
As far as JSON goes, there
is a proposal called JSON Path.
It looks actually quite a lot
like XPath; maybe it'll catch on.
There's something called JSON Query.
Doesn't look so much like
XML Query, I mean, XQuery.
And finally, there has been a
proposal for a language called Jaql, but
again, as of February 2012
all of these are still very
early, so we just don't know what's going to catch on.
So now let's talk about the validity of JSON data.
So JSON data that's
syntactically valid simply needs
to adhere to the basic structural requirements.
As a reminder, that would be
that we have sets of label
value pairs, we have arrays
of values and our values
are from predefined types.
And again, these values here are defined recursively.
So we start with a JSON
file and we send
it to a parser. The parser
may determine that the file
has syntactic errors, or, if
the file is syntactically correct, then
it can be parsed into objects in a programming language.
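The two outcomes can be sketched like this, assuming Python's standard json module as the parser. The helper name parse_or_report and the sample inputs are made up.

```python
import json

def parse_or_report(text):
    """Return (objects, None) on success or (None, message) on a syntax error."""
    try:
        return json.loads(text), None
    except json.JSONDecodeError as err:
        message = (f"syntax error at line {err.lineno}, "
                   f"column {err.colno}: {err.msg}")
        return None, message

# Syntactically correct: we get objects back.
good, good_err = parse_or_report('{"title": "Book One", "authors": []}')

# Syntactically incorrect: the array is never closed properly.
bad, bad_err = parse_or_report('{"title": "Book One", "authors": [}')

print(good, good_err)
print(bad, bad_err)
```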
Now if we're interested in semantically
valid JSON, that is,
JSON that conforms to
some constraints or a schema,
then in addition to checking the
basic structural requirements, we check
whether the JSON conforms to the specified schema.
If we use a language like JSON
Schema, for example, we put
a specification in as a
separate file (and in
fact JSON Schema is expressed in
JSON itself, as we'll see
in our demo), and we send it,
along with the data, to a validator. That
validator might find that there
are some syntactic errors, or
it may find that there are
some semantic errors, so the
data could be correct syntactically
but not conform to the schema.
If it's both syntactically and semantically
correct, then it can move
on to the parser, where
it will be parsed, again, into
objects in a programming language.
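The two-stage check can be sketched as follows, assuming Python's standard json module. To be clear about the assumptions: BOOK_SCHEMA is a made-up, hand-rolled format and the validate function is hypothetical. This is not the JSON Schema language, which is richer and is itself written in JSON; the sketch only illustrates syntactic errors versus semantic errors.

```python
import json

# Hypothetical hand-rolled "schema": label -> (expected Python type, required?).
# This is NOT JSON Schema; it only illustrates the idea of a semantic check.
BOOK_SCHEMA = {
    "title": (str, True),
    "price": (int, True),
    "edition": (int, False),
}

def validate(text, schema):
    """Return a list of problems; an empty list means both checks passed."""
    try:
        data = json.loads(text)                  # stage 1: syntactic check
    except json.JSONDecodeError as err:
        return [f"syntactic: {err.msg}"]
    if not isinstance(data, dict):
        return ["semantic: expected an object"]
    problems = []
    for label, (expected, required) in schema.items():   # stage 2: semantic check
        if label not in data:
            if required:
                problems.append(f"semantic: missing {label}")
        elif not isinstance(data[label], expected):
            # Simplification: Python's bool counts as int in this test.
            problems.append(f"semantic: {label} has the wrong type")
    return problems

ok = validate('{"title": "Book One", "price": 85}', BOOK_SCHEMA)
wrong_type = validate('{"title": "Book One", "price": "cheap"}', BOOK_SCHEMA)
broken = validate('{"title": "Book One", "price": }', BOOK_SCHEMA)
print(ok, wrong_type, broken)
```

The second input is the interesting case: it parses fine, so it is syntactically correct, but it does not conform to the constraints.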
So to summarize, JSON stands for JavaScript Object Notation.
It's a standard for taking data
objects and serializing them into a format that's human readable.
It's also very useful for
exchanging data between programs,
and for representing and storing
semi-structured data in a flexible fashion.
In the next video we'll go
live with a demonstration of JSON.
We'll use a couple of JSON
editors, we'll take a
look at the structure of JSON
data, when it's syntactically correct.
We'll demonstrate how it's very
flexible when our data might
be irregular, and we'll also
demonstrate schema checking using
an example of JSON Schema.