Prediction, decision, control: lots of techniques, many different ways of stringing them together, and deciding which techniques to use sounds very complicated. And thinking about how to implement a system like a global traffic management system with lots of self-driving cars certainly is extremely complex.

But there's actually a machine which is doing many more complex things every day, and there are about seven billion of them on this planet already. Obviously, I'm talking about the human brain. Not only does it do all the tasks that we have discussed in this course, it does them with a fairly uniform architecture.

This plastic, fairly uniform architecture of the brain looks something like this. There are a whole bunch of neurons, about a hundred billion neurons in the brain. If you look at the topmost layer of the brain, a few millimeters thick, it's called the neocortex, which is believed to be responsible for much of our conscious thought. The way it's organized is in bunches of neurons arranged in columns, so there's a vertical arrangement of neurons, and these neurons have connections vertically as well as horizontally across columns. That's very important; that's one element of the structure.

Each neuron, on the other hand, looks something like this. Most of the white matter in one's brain consists of the connections between neurons; the neuron bodies themselves make up only a small fraction of that, the gray matter in the brain. The connections between neurons look something like this: these are called dendrites, and this is the axon. A dendrite connects to other neurons via synapses. Synapses, we have learned in the past decade, form very rapidly between dendrites and axons which are close to each other; they can form in minutes and can also decay over time quite rapidly. So that's another important feature of the brain.

Hierarchical temporal memory, which is what we're going to talk about, is a model of the brain propounded by Jeff Hawkins, and it uses an abstract model of the neuron. He talks about this in a recent talk at the International Symposium on Computer Architecture, funnily enough just this June.
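To make the column-and-synapse picture a bit more concrete, here is a minimal Python sketch of such a structure; the class names and the permanence update rule are assumptions of this illustration, not the actual HTM model. It just captures the two ideas above: cells grouped into columns, and synapses whose strength can grow quickly and decay over time.

```python
# Toy sketch (assumed details, not the real HTM code): cells arranged in
# columns, with synapses whose "permanence" can strengthen quickly and decay.

from dataclasses import dataclass, field

@dataclass
class Synapse:
    target_cell: int          # index of the cell this synapse connects to
    permanence: float = 0.2   # connection strength in [0, 1] (assumed scale)

    def reinforce(self, amount: float = 0.1) -> None:
        # Synapses strengthen rapidly when the connected cells are active
        # together (rough analogue of "forming in minutes").
        self.permanence = min(1.0, self.permanence + amount)

    def decay(self, amount: float = 0.01) -> None:
        # Unused synapses weaken over time and may effectively disappear.
        self.permanence = max(0.0, self.permanence - amount)

@dataclass
class Cell:
    synapses: list = field(default_factory=list)   # vertical and lateral connections

@dataclass
class Column:
    cells: list = field(default_factory=list)      # the vertical arrangement of cells

# A tiny "sheet" of cortex: columns side by side, each with a few cells.
sheet = [Column(cells=[Cell() for _ in range(4)]) for _ in range(10)]
```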
So the model that he uses is an abstract model of the neuron which looks a lot like a neural element in a neural network. However, it has some very important differences, and we'll explain this structure in the next few minutes as we go along.

The first important feature of hierarchical temporal memory is that it relies on sparse representations, in particular sparse distributed representations. These are closely related to the sparse distributed memory that we discussed way back during the Locke lecture. Remember the properties of very long bit sequences of zeros and ones: in particular, if we have patterns a thousand bits in length, we learned that there is a very low chance that two random patterns differ in fewer than 450 places. So most of them are far apart in this sense; most patterns chosen at random are far apart.

Now, consider special types of patterns. This time we'll take 2,000 bits, because that's the example that Jeff Hawkins uses in his lecture. So we have a pattern of 2,000 bits, but it's forced to have only 40 ones: only two percent of the bits are ones, and we force that in a particular way, which we'll describe shortly.

If we do this, let's see what happens. There's a very low chance of a random sparse pattern, that is, another random pattern with only 40 ones, matching a significant number of these 40 ones. Just imagine: if twenty of these ones have to match, the chance is something like one in 2,000 multiplied by itself twenty times, an astronomically small number.

Even if we drop all but ten random positions out of these 40, the same thing holds. Say we have a pattern which has 40 ones, but we decide to retain only ten of them at random, and think about another sparse pattern of 40 ones to start with. The chance that, after dropping all but ten from that pattern and all but ten from our first pattern, any of these ten match is again very small. Again, it's a one-in-2,000 kind of probability multiplied by itself many times, and you can work that out. Please try to work it out, because I might even ask a question on this at some point. It's fairly simple arithmetic, not even as complex as what we did for the zero-one sequences in sparse memory.

Now, this particular feature is exploited by hierarchical temporal memory in the following way.
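As a rough sketch of this arithmetic (an illustration using standard hypergeometric counting rather than the quick one-in-2,000 figure quoted above, with the 2,000-bit and 40-one parameters taken from the example), the following Python snippet computes how likely two independently chosen sparse patterns are to overlap heavily, and how likely an unrelated pattern is to cover a ten-bit subsample. The exact constants differ from the back-of-the-envelope figure, but the conclusion is the same: the probabilities are astronomically small.

```python
# Back-of-the-envelope check of the sparse-pattern argument:
# 2,000-bit patterns with exactly 40 ones; how likely is a large overlap
# between two such patterns chosen independently at random?
from math import comb

N, W = 2000, 40          # pattern length and number of ones (from the lecture)

def prob_overlap_at_least(k, n=N, w=W):
    """P(two random w-of-n sparse patterns share >= k one-positions).
    Exact hypergeometric sum, no simulation needed."""
    total = comb(n, w)
    return sum(comb(w, j) * comb(n - w, w - j) for j in range(k, w + 1)) / total

def prob_cover_at_least(k, kept=10, n=N, w=W):
    """P(a random w-of-n pattern covers >= k of a fixed set of `kept` positions),
    i.e. the subsampling case where only 10 of the 40 ones are retained."""
    total = comb(n, w)
    return sum(comb(kept, j) * comb(n - kept, w - j) for j in range(k, kept + 1)) / total

print(prob_overlap_at_least(20))   # ~1e-23: astronomically unlikely
print(prob_overlap_at_least(10))   # still vanishingly small, on the order of 1e-9
print(prob_cover_at_least(10))     # all 10 retained bits matched by chance: ~3e-18
```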
Consider a bunch of neurons in a sheet of neurons in the brain, say, or in a model of the brain, and say these neurons are being activated by light intensity in a particular pattern. Obviously, a light-intensity pattern won't have just two percent of it bright; many more pixels will be bright in that kind of an image. What the sparse representation does is choose roughly the 40 brightest, although not necessarily the 40 brightest in a strict sense: if a set of neurons is bright in one area, they will be inhibited by neurons which are bright nearby, so only one of the neurons out of a bunch which are all bright will end up firing. By forcing neurons to turn off because their neighborhood is also equally bright, we get a sparse representation for an otherwise fairly dense image.

Now, it's not clear why this is important right now, but it'll become clearer in a few minutes. What happens is this: say a particular scene produced a particular sparse pattern. If you view a similar scene, it will give a very similar sparse pattern, even after subsampling. So even after we get rid of 30 of the 40 ones, we'll still get a similar sparse pattern from two different instances of the same scene, or even from similar scenes. However, if you see completely different scenes, the chance that the sparse patterns we get match in a large number of positions is very small.

That's the very important part of these kinds of representations. The key is the number of bits that we start with, the fact that we choose only a small number of them as ones, and that we randomly drop some of them each time. All these things put together make it very unlikely that dissimilar scenes will produce the same sparse patterns, but similar scenes will very likely give the same sparse patterns, or sparse patterns that match each other.
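Here is a small Python sketch of this local-inhibition idea, with assumed details (the neighborhood size, the one-winner-per-neighborhood rule, and the noise model are choices made for the illustration, not the actual HTM mechanism). Within each neighborhood only the brightest unit fires, which turns a dense brightness pattern into a two-percent-sparse one; a slightly perturbed version of the same scene then produces a heavily overlapping sparse code, while an unrelated scene almost never does.

```python
# Illustrative sketch (assumed details, not the HTM algorithm itself):
# local inhibition turns a dense brightness pattern into a sparse one,
# and similar scenes end up with heavily overlapping sparse patterns.
import numpy as np

rng = np.random.default_rng(0)

def sparsify(image, block=50):
    """Within each block of `block` units, only the brightest unit fires."""
    out = np.zeros_like(image, dtype=bool)
    for start in range(0, len(image), block):
        chunk = slice(start, start + block)
        out[start + np.argmax(image[chunk])] = True
    return out

scene = rng.random(2000)                       # a dense "brightness" pattern
similar = scene + 0.02 * rng.random(2000)      # the same scene, slightly perturbed
different = rng.random(2000)                   # an unrelated scene

a, b, c = sparsify(scene), sparsify(similar), sparsify(different)
print(a.sum())         # 40 active units out of 2,000 (2% sparsity)
print((a & b).sum())   # large overlap: similar scenes give similar sparse codes
print((a & c).sum())   # tiny overlap (typically 0 or 1): different scenes rarely collide
```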