Prediction, decision, control: lots of techniques, and many different ways of stringing them together. Deciding which techniques to use sounds very complicated, and thinking about how to implement a system like a global traffic management system with lots of self-driving cars certainly is extremely complex. But there is actually a machine that does many more complex things every day, and there are about seven billion of them on this planet already. Obviously, I'm talking about the human brain. Not only does it do all the tasks that we have discussed in this course, it does them with a fairly uniform architecture.

This uniform, plastic architecture of the brain looks something like this. There are a whole bunch of neurons, about a hundred billion of them in the brain. The topmost layer of the brain, a few millimeters thick, is called the neocortex, which is believed to be responsible for much of our conscious thought. It is organized in bunches of neurons arranged in columns, a vertical arrangement of neurons, and these neurons have connections vertically as well as horizontally across columns. That is one important element of the structure.

Each neuron, in turn, looks something like this. Most of the white matter in one's brain consists of the connections between neurons; the neuron bodies themselves, the gray matter, make up only a small fraction. The connections between neurons look something like this: these are called dendrites, and this is the axon. A dendrite connects to other neurons via synapses. As we have learned over the past decade, synapses form very rapidly between dendrites and axons that are close to each other; they can form in minutes and can also decay very rapidly over time. That is another important feature of the brain.

An abstract model of the neuron is what is used by hierarchical temporal memory, which is what we're going to talk about: a model of the brain propounded by Jeff Hawkins, who described it in a recent talk at the International Symposium on Computer Architecture, funnily enough just this June. The model he uses is an abstract model of the neuron that looks a lot like a neural element in a neural network. However, it has some very important differences, and we'll explain this structure in the next few minutes as we go along.

The first important feature of hierarchical temporal memory is that it relies on sparse representations, in particular sparse distributed representations. These are closely related to the sparse distributed memory that we discussed way back during the Locke lecture. Remember the properties of very long bit sequences of zeros and ones: for patterns a thousand bits in length, we learned that there is a very low chance that two random patterns differ in fewer than 450 places, so most patterns chosen at random are far apart in this sense. Now consider a special type of pattern. This time we'll take 2,000 bits, because that's the example Jeff Hawkins uses in his lecture. We have a pattern of 2,000 bits, but it is forced to have only 40 ones, so only two percent of the bits are ones, and we force that in a particular way which we'll describe shortly. If we do this, let's see what happens.
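To make this concrete, here is a minimal sketch in Python (my own illustration; the names and structure are not from the lecture) of what such a pattern looks like: a 2,000-bit pattern with exactly 40 ones, represented simply by the set of positions that are on, together with the overlap between two such patterns drawn independently at random.

    import random

    N_BITS = 2000   # total length of the pattern
    N_ONES = 40     # number of ones, i.e. two percent sparsity

    def random_sparse_pattern(n_bits=N_BITS, n_ones=N_ONES):
        """Return a random sparse pattern as the set of positions that are 1."""
        return set(random.sample(range(n_bits), n_ones))

    a = random_sparse_pattern()
    b = random_sparse_pattern()

    # Two independently chosen sparse patterns share almost none of their ones;
    # the expected overlap is 40 * 40 / 2000 = 0.8 positions.
    print("overlap:", len(a & b))

Running this a few times typically shows an overlap of zero or one positions out of 40, which is the property the rest of the argument builds on.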
There's a very low chance of a random sparse pattern, that is, another random pattern with only 40 ones, matching a significant number of these 40 ones. Just imagine: if twenty of these ones have to match, then each of those bits has only about a one-in-fifty chance of landing on one of the first pattern's 40 ones, so the overall chance is a small fraction multiplied by itself twenty times. Even if we drop all but ten random positions out of these 40 (say we have a pattern with 40 ones but decide to retain only ten of them at random) and take another sparse pattern that also started with 40 ones, the chance that the ten we kept still line up with ones of the other pattern is again very small. Again, it's a small fraction multiplied by itself many times, and you can work that out. Please try to work it out, because I might even ask a question on this at some point; there's a worked sketch at the end of this section. It's fairly simple arithmetic, not even as complex as what we did for the zero-one sequences in sparse memory.

Now, this particular feature is exploited by hierarchical temporal memory in the following way. Consider the bunch of neurons in a sheet of neurons in the brain, or in a model of the brain, and say these neurons are being activated by light intensity in a particular pattern. Obviously, a light-intensity pattern won't have just two percent of its pixels bright; many more pixels will be bright in that kind of image. What this sparse representation does is, in effect, choose the brightest: not necessarily the 40 brightest overall, but if a set of neurons is bright in an area, they will be inhibited by neurons which are bright nearby, so only one of the bright neurons in that bunch will end up firing. By forcing neurons to turn off because their neighborhood is also equally bright, we get a sparse representation of an otherwise fairly dense image (a sketch of this inhibition step also appears at the end of this section).

It's not clear why this is important right now, but it will become clearer in a few minutes. What happens is that a similar scene (say this was a particular pattern, and you view a similar scene) will give a very similar sparse pattern, even after sub-sampling. So even after we get rid of 30 of the 40 ones, we'll still get a similar sparse pattern from two different instances of the same scene, or even from similar scenes. However, if you see completely different scenes, the chance that the sparse patterns we get will match in a large number of positions is very small. And that's the very important part of these kinds of representations: the key is the number of bits that we start with, the fact that we choose only a small number of them as ones, and that we randomly drop some of them each time. All these things put together make it very unlikely that dissimilar scenes will produce the same sparse patterns, while similar scenes will very likely give the same sparse pattern, or sparse patterns that match each other.
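For anyone who wants to work out the numbers, here is a small Python sketch (again my own, not from the lecture) that computes the exact chance that two random 40-of-2,000 patterns share at least a given number of ones, using the hypergeometric distribution, and then tries the sub-sampling experiment of keeping only ten of the 40 ones.

    import random
    from math import comb

    N, W = 2000, 40                      # pattern length and number of ones

    def p_overlap_at_least(k, n=N, w=W):
        """Exact probability (a hypergeometric tail) that two random
        w-of-n sparse patterns share at least k of their ones."""
        total = comb(n, w)
        return sum(comb(w, j) * comb(n - w, w - j)
                   for j in range(k, w + 1)) / total

    print(p_overlap_at_least(20))        # vanishingly small
    print(p_overlap_at_least(10))        # still tiny

    # Sub-sampling: keep only 10 of the 40 ones and compare against an
    # unrelated random sparse pattern; the retained bits almost never
    # land on ones of the other pattern.
    a = set(random.sample(range(N), W))
    b = set(random.sample(range(N), W))
    a_sub = set(random.sample(sorted(a), 10))
    print(len(a_sub & b))                # usually 0, occasionally 1

The point is not the exact values but how quickly the tail probability collapses as the required overlap grows.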
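The local-inhibition step described above can be read as a k-winners-take-all rule: within each small neighborhood of the sheet, only the brightest unit fires, which turns a dense brightness pattern into a sparse one. The sketch below is my own illustration under that assumption, not Hawkins's actual algorithm; the one-dimensional sheet, the 2,000 units, and the neighborhood size of 50 are arbitrary choices made so that roughly 40 units stay active.

    import random

    N_UNITS = 2000        # units in a one-dimensional 'sheet', for simplicity
    NEIGHBORHOOD = 50     # inhibition radius: one winner per block of 50 units

    def sparsify(brightness):
        """Local inhibition: in each neighborhood only the brightest unit
        stays on, so a dense input becomes a sparse set of active units."""
        active = set()
        for start in range(0, len(brightness), NEIGHBORHOOD):
            block = range(start, min(start + NEIGHBORHOOD, len(brightness)))
            active.add(max(block, key=lambda i: brightness[i]))
        return active

    # A fairly dense 'image': every unit receives some brightness.
    scene = [random.random() for _ in range(N_UNITS)]
    code = sparsify(scene)
    print(len(code), "active units out of", N_UNITS)   # 2000 / 50 = 40 winners

    # A slightly perturbed version of the same scene gives a largely
    # overlapping sparse code, while an unrelated scene overlaps hardly at all.
    similar = [x + 0.01 * random.random() for x in scene]
    different = [random.random() for _ in range(N_UNITS)]
    print("overlap with similar scene:  ", len(code & sparsify(similar)))
    print("overlap with different scene:", len(code & sparsify(different)))

This is only meant to show why local inhibition plus sub-sampling keeps similar inputs close and unrelated inputs far apart; the actual mechanism in hierarchical temporal memory is more elaborate.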