1
00:00:00,022 --> 00:00:04,692
[MUSIC]

2
00:00:04,692 --> 00:00:08,668
To wrap up this module, we're going to
talk about a way to perform even more

3
00:00:08,668 --> 00:00:12,843
efficient nearest neighbor search using
localitive sense of hashing using

4
00:00:12,843 --> 00:00:15,340
something that are called multiple tables.

5
00:00:15,340 --> 00:00:17,730
So these are multiple hash tables.

6
00:00:17,730 --> 00:00:21,300
And this is going to be advanced material.

7
00:00:21,300 --> 00:00:23,390
I actually think you
could watch the video and

8
00:00:23,390 --> 00:00:28,320
get out the high level concept of what's
going on and the importance of it.

9
00:00:28,320 --> 00:00:33,000
But it's going to include some content
that's quite advanced in terms of

10
00:00:33,000 --> 00:00:36,320
looking at probabilities
of different events and

11
00:00:36,320 --> 00:00:39,540
assessing the relative
probabilities of different setups.

12
00:00:39,540 --> 00:00:43,790
And so we're putting this
as an optional video, but

13
00:00:43,790 --> 00:00:47,670
I would encourage you to at least look at
it and think through these different ideas

14
00:00:47,670 --> 00:00:50,910
even if you don't follow all the details
mathematically of what's going on here.

15
00:00:52,610 --> 00:00:56,880
Okay, so let's look at a specific
example of locality sense of hashing,

16
00:00:56,880 --> 00:00:59,780
where I'm just going to
throw down two lines.

17
00:00:59,780 --> 00:01:05,310
And for simplicity I'm going to assume
an algorithm where we search the bin

18
00:01:05,310 --> 00:01:10,510
that the query point falls into,
as well as just one bit off.

19
00:01:10,510 --> 00:01:16,200
So in the case of throwing down two lines,
we would search, for example,

20
00:01:16,200 --> 00:01:21,390
let's say that this is our query point,
here.

21
00:01:23,280 --> 00:01:29,110
And it has been index 0,0 then we
would also search bins 0,1 and 1,0.

22
00:01:29,110 --> 00:01:35,130
So these are cases where we've taken,
switching colors again.

23
00:01:37,413 --> 00:01:41,036
We've taken one bit and flipped it.

24
00:01:44,302 --> 00:01:46,520
And searched those two neighboring bins.

25
00:01:46,520 --> 00:01:52,220
And the other thing we're going to assume
is that, we're just going to call delta

26
00:01:52,220 --> 00:01:59,170
the probability of putting a line between
two points that are theta angles apart.

27
00:01:59,170 --> 00:02:00,789
So if had two points.

28
00:02:02,750 --> 00:02:09,386
Living in this space and
they were theta radians apart,

29
00:02:09,386 --> 00:02:14,469
then we're going to say
that the probability

30
00:02:14,469 --> 00:02:20,553
that a randomly placed
line divides these points.

31
00:02:20,553 --> 00:02:22,412
So this happens

32
00:02:25,351 --> 00:02:29,858
With probability delta.

33
00:02:29,858 --> 00:02:32,229
And so we can think through what delta is.

34
00:02:32,229 --> 00:02:37,906
So if this line is just
uniformly placed between,

35
00:02:37,906 --> 00:02:40,885
let's say, zero and pi.

36
00:02:40,885 --> 00:02:44,734
So once we get to the other side we
start wrapping around if you think about

37
00:02:44,734 --> 00:02:45,740
visualizing it.

38
00:02:45,740 --> 00:02:50,790
But you can say that the probability
of falling within some little region

39
00:02:50,790 --> 00:02:53,910
theta is just theta divided by pi.

40
00:02:55,660 --> 00:02:58,150
So that would be one example
of this probability delta but

41
00:02:58,150 --> 00:03:05,830
let's just generically say that we have
probability delta falling in this range.

42
00:03:05,830 --> 00:03:10,260
Okay, then what we're saying
here is that in this case

43
00:03:10,260 --> 00:03:11,922
we're going to search three bins.

44
00:03:11,922 --> 00:03:14,030
We're going to search our query bin and

45
00:03:14,030 --> 00:03:16,640
the two neighboring bins where
we flipped just one bit.

46
00:03:18,000 --> 00:03:22,299
And the chance that we're not
going to find a nearest neighbor,

47
00:03:22,299 --> 00:03:26,170
within these three bins is, delta squared.

48
00:03:26,170 --> 00:03:29,570
Because it's the chance that two lines,

49
00:03:29,570 --> 00:03:33,940
not just one line, divide our query
point from its nearest neighbor.

50
00:03:33,940 --> 00:03:38,100
And, what's the chance of getting
two lines within a region theta?

51
00:03:39,360 --> 00:03:42,210
Well, these two lines were
drawn independently and

52
00:03:42,210 --> 00:03:48,150
each had probability delta of
falling between in that range,

53
00:03:48,150 --> 00:03:52,350
that theta range, so we have delta
times delta which is delta squared.

54
00:03:53,900 --> 00:03:57,495
But what would happen if I just repeated
this process multiple times of throwing

55
00:03:57,495 --> 00:03:59,190
down two lines at random?

56
00:03:59,190 --> 00:04:03,363
So here's the example that we showed
before where we threw down our two lines.

57
00:04:03,363 --> 00:04:08,445
And here I'm using L0, L1,
L2, L3 just generically

58
00:04:08,445 --> 00:04:15,306
to refer to the list of data points
that fall into this Bin 0, 1, 2, and 3.

59
00:04:15,306 --> 00:04:19,030
Okay so now let's say we throw
down two other random lines.

60
00:04:19,030 --> 00:04:23,570
So it's these two lines and that creates
a different partitioning of our data.

61
00:04:23,570 --> 00:04:27,890
And then we throw down two more lines and
we get a different partitioning.

62
00:04:27,890 --> 00:04:32,770
And so what we've done is we've ended
up with three different hash tables

63
00:04:32,770 --> 00:04:35,450
if we repeat this process three times.

64
00:04:35,450 --> 00:04:43,570
And now let's make an assumption that
we're only going to search the bin

65
00:04:43,570 --> 00:04:47,940
in which our query point falls,
so in particular,

66
00:04:47,940 --> 00:04:53,440
here's our query point and
for this first case,

67
00:04:54,730 --> 00:05:00,155
we know that it has bin index 0,0 so,
we're going to search this bin 0.

68
00:05:01,260 --> 00:05:05,926
And in the second case,
this is our same query point here,

69
00:05:05,926 --> 00:05:10,976
has bin index 0,1, so
we're going to search the 0,1 bin and

70
00:05:10,976 --> 00:05:16,023
then, likewise, in this case,
it also has bin index 0,1,

71
00:05:16,023 --> 00:05:19,670
so again we're going to
search the 0,1 bin.

72
00:05:20,760 --> 00:05:23,200
So, I've constructed this example.

73
00:05:23,200 --> 00:05:25,070
I've created three
different hash tables and

74
00:05:25,070 --> 00:05:30,120
said we're only searching the bin
that the query point falls in, so

75
00:05:30,120 --> 00:05:34,800
that in this case we're
still searching three bins.

76
00:05:34,800 --> 00:05:37,461
Just like we did before when we
searched the query point and

77
00:05:37,461 --> 00:05:39,550
then the two neighboring bins.

78
00:05:39,550 --> 00:05:41,570
So we're still searching three bins.

79
00:05:41,570 --> 00:05:42,905
But the question is now,

80
00:05:42,905 --> 00:05:46,700
what's the probability that we're not
going to find our nearest neighbor.

81
00:05:46,700 --> 00:05:48,600
So let's work through this.

82
00:05:48,600 --> 00:05:53,860
So what this statement is equivalent to
is, what's the chance that the query point

83
00:05:53,860 --> 00:05:57,670
and the nearest neighbor
are in different bins,

84
00:05:57,670 --> 00:06:00,900
that they get split in all of
these different hash tables?

85
00:06:02,590 --> 00:06:06,090
So what's the probability that a line
falls between the query point and its

86
00:06:06,090 --> 00:06:10,630
nearest neighbor in this first example,
the second example and this third example.

87
00:06:11,705 --> 00:06:14,800
Well, we can work through this
mathematically or probabilistically.

88
00:06:16,710 --> 00:06:19,780
So in particular the probability
of this event if we look,

89
00:06:19,780 --> 00:06:23,940
let's just look at this for a setup here
to begin with, and the probability that

90
00:06:23,940 --> 00:06:28,920
the nearest neighbor's in a different bin
than the query point in this first example

91
00:06:28,920 --> 00:06:33,020
is one minus the probability that
they fell into the same bin.

92
00:06:34,300 --> 00:06:37,190
And what's the probability
that the query point and

93
00:06:37,190 --> 00:06:39,694
the nearest neighbor
fall into the same bin?

94
00:06:40,810 --> 00:06:44,541
Well, it's the probability that
neither the first line nor

95
00:06:44,541 --> 00:06:47,330
the second line separated
these two points.

96
00:06:48,410 --> 00:06:52,810
So what's the probability
that neither line or sorry,

97
00:06:52,810 --> 00:06:57,060
what's the probability that one
line does not split the two points?

98
00:06:57,060 --> 00:06:59,720
We know that delta is
the probability that splits it.

99
00:06:59,720 --> 00:07:04,180
So one minus delta is the probability
that there's no split from one line.

100
00:07:04,180 --> 00:07:05,960
But we have two lines.

101
00:07:05,960 --> 00:07:08,660
So, I mean just write this down,
so we have.

102
00:07:08,660 --> 00:07:16,392
One minus delta equals probability,
that one line.

103
00:07:18,265 --> 00:07:21,059
Does not split.

104
00:07:22,534 --> 00:07:27,172
Query and nearest neighbor.

105
00:07:27,172 --> 00:07:32,666
And then the reason we get
this squared is because

106
00:07:32,666 --> 00:07:37,759
these lines are thrown down independently,

107
00:07:37,759 --> 00:07:42,450
so one minus delta squared is probability

108
00:07:42,450 --> 00:07:47,292
that two lines don't split these points.

109
00:07:47,292 --> 00:07:52,180
So what we have here is we have one
minus the probability that neither

110
00:07:52,180 --> 00:07:55,629
line split our query and
its nearest neighbor.

111
00:07:55,629 --> 00:07:58,668
And if you rewrite this here,

112
00:07:58,668 --> 00:08:04,270
you can rewrite it as 2
delta- delta squared, okay.

113
00:08:06,440 --> 00:08:14,140
So now that was just the probability
of having our query point,

114
00:08:14,140 --> 00:08:18,140
and the nearest neighbor in different
bins for this one example here.

115
00:08:18,140 --> 00:08:22,070
Well, what's the probability that
that happens in all three cases?

116
00:08:22,070 --> 00:08:27,331
Well, again all three cases
are completely independent events and,

117
00:08:27,331 --> 00:08:31,788
so we can write the probability
as just raised to the power of

118
00:08:31,788 --> 00:08:34,748
the number of hash tables that we have.

119
00:08:37,344 --> 00:08:42,313
So we have this 2 delta- delta squared
probability of splitting in any one of

120
00:08:42,313 --> 00:08:46,780
these cases, and we have three
cases that we're going to look at.

121
00:08:46,780 --> 00:08:49,110
So the probability of
splitting in each of them

122
00:08:50,510 --> 00:08:54,910
is that probability we looked at
before raised to the third power.

123
00:08:56,260 --> 00:09:01,127
Okay, so to recap, if we looked
at three bins in one hash table,

124
00:09:01,127 --> 00:09:05,908
then the probability of not
finding our nearest neighbor after

125
00:09:05,908 --> 00:09:10,616
completing the search of those
three bins was delta squared.

126
00:09:10,616 --> 00:09:16,001
But if we defined three hash tables,
and only in each one searched one bin,

127
00:09:16,001 --> 00:09:19,920
so still a total of three
bins searched overall.

128
00:09:19,920 --> 00:09:25,140
The probability of not finding the nearest
neighbor is 2 delta- delta squared cubed.

129
00:09:26,370 --> 00:09:29,550
Okay, well it's a little
bit hard to compare

130
00:09:29,550 --> 00:09:32,390
these two functions just
written as they are.

131
00:09:32,390 --> 00:09:34,970
So I made a plot of them for you.

132
00:09:34,970 --> 00:09:38,816
And what, the X-axis is, is delta.

133
00:09:38,816 --> 00:09:43,670
It's the probability of a split of
points that are an angle theta apart.

134
00:09:45,138 --> 00:09:49,610
And the Y-axis is the probability
of not finding the nearest neighbor

135
00:09:49,610 --> 00:09:52,830
after performing the search of
three bins that I described.

136
00:09:52,830 --> 00:09:57,063
So again, let me just be clear
that this is under the scenario of

137
00:09:57,063 --> 00:10:03,607
throwing down two lines,

138
00:10:03,607 --> 00:10:11,530
searching three bins just to
make sure the setup is clear.

139
00:10:12,970 --> 00:10:18,271
Okay, and what we see is that here,
this pink case,

140
00:10:18,271 --> 00:10:23,576
let me switch colors,
maybe there's a pink color.

141
00:10:23,576 --> 00:10:26,640
No, nothing very pink
will just go with red.

142
00:10:26,640 --> 00:10:32,531
So this case here is the multiple

143
00:10:32,531 --> 00:10:36,895
hash table scenario and

144
00:10:36,895 --> 00:10:41,913
this blue line Is the single

145
00:10:41,913 --> 00:10:46,068
hash table scenario.

146
00:10:49,685 --> 00:10:55,167
And what we see is that we have
a lower probability of not finding

147
00:10:55,167 --> 00:11:00,030
our nearest neighbor when we
use multiple hash tables,

148
00:11:00,030 --> 00:11:05,840
still searching the same number of
bins in almost all of the cases.

149
00:11:05,840 --> 00:11:09,910
There is this crossover point,
somewhere around here,

150
00:11:12,120 --> 00:11:15,640
where one method, a single hash table,

151
00:11:15,640 --> 00:11:20,520
starts outperforming the multiple
hash table, but that's when the angle

152
00:11:20,520 --> 00:11:25,290
between our nearest neighbor, the query
and its nearest neighbor becomes larger.

153
00:11:25,290 --> 00:11:28,350
Well, we're going to
assume in a lot of cases

154
00:11:29,380 --> 00:11:33,730
that the distance between our query and
its nearest neighbor is small, and so

155
00:11:33,730 --> 00:11:37,529
in these cases we definitely get a benefit
from looking at multiple hash tables.

156
00:11:38,590 --> 00:11:42,960
But the benefit is even larger if
I think about throwing down more

157
00:11:44,570 --> 00:11:45,698
lines than just two.

158
00:11:45,698 --> 00:11:53,191
So let's restrict exactly what
the algorithm's going to be doing here.

159
00:11:53,191 --> 00:11:58,250
So in this case we're going to think
about throwing down some H lines,

160
00:11:58,250 --> 00:12:01,445
and then we're going to search, as before,

161
00:12:01,445 --> 00:12:07,710
the bin that the query point falls into,
as well as all bins that are one bit off.

162
00:12:07,710 --> 00:12:12,170
But remember when we have more
lines we have more bits, and so

163
00:12:12,170 --> 00:12:16,190
flipping just one bit there are going to
be more bins that we end up searching.

164
00:12:16,190 --> 00:12:20,980
So, in particular, even when we're still
searching bins that are just one bit

165
00:12:20,980 --> 00:12:25,058
off from the query,
we're going to have to look at

166
00:12:25,058 --> 00:12:32,650
the probability of our nearest
neighbor being more than one bit away.

167
00:12:32,650 --> 00:12:37,730
Okay, so, this is equal to 1
minus the probability that

168
00:12:37,730 --> 00:12:42,220
the points fall into this same bin,
because then we would of found them, minus

169
00:12:42,220 --> 00:12:47,670
the probability that they're more than
this 1 bin away search that we're doing.

170
00:12:49,440 --> 00:12:50,310
Okay, and so

171
00:12:50,310 --> 00:12:54,500
this is equivalent to saying it's 1 minus
the probability that there were no splits.

172
00:12:54,500 --> 00:12:59,104
So no line split are querying
its nearest neighbor

173
00:12:59,104 --> 00:13:03,489
minus the probability
that there was one line,

174
00:13:03,489 --> 00:13:07,778
exactly one line,
that split the two points.

175
00:13:07,778 --> 00:13:13,994
And so here what we have,

176
00:13:13,994 --> 00:13:19,027
so probability of no

177
00:13:19,027 --> 00:13:24,355
split is 1- delta,

178
00:13:24,355 --> 00:13:29,979
and we have h lines that

179
00:13:29,979 --> 00:13:35,913
we're throwing down.

180
00:13:35,913 --> 00:13:43,111
So this implies the probability

181
00:13:43,111 --> 00:13:48,708
that none of h lines split

182
00:13:48,708 --> 00:13:54,873
is (1- delta) to the h.

183
00:13:54,873 --> 00:13:57,590
Okay, now,
let's talk about this other one,

184
00:13:57,590 --> 00:13:59,490
which is a little bit more complicated.

185
00:13:59,490 --> 00:14:03,920
because we need to compute what's
the probability that a single line,

186
00:14:03,920 --> 00:14:08,930
only a single line, splits our query and
its nearest neighbor.

187
00:14:08,930 --> 00:14:14,740
Not two lines or three lines or
four lines or each lines, just one.

188
00:14:14,740 --> 00:14:17,110
So that's the probability.

189
00:14:17,110 --> 00:14:24,037
So here delta is the probability
of one line splitting.

190
00:14:27,670 --> 00:14:33,417
And then here, We have

191
00:14:33,417 --> 00:14:38,316
the probability of h-1

192
00:14:38,316 --> 00:14:44,440
lines, Not splitting the points.

193
00:14:47,390 --> 00:14:54,160
But there are h different ways we can have
just one line and not the other lines.

194
00:14:54,160 --> 00:14:56,600
So that's why we have to
multiply by the h here.

195
00:14:59,280 --> 00:15:05,793
So in summary, we're going to search h + 1
bins, and not find the nearest neighbor,

196
00:15:05,793 --> 00:15:12,340
with probability 1- (1- delta) to
the h- h delta(1 -delta) to the h- 1.

197
00:15:13,590 --> 00:15:18,100
Okay, well now let's consider the
alternative using multiple hash tables.

198
00:15:18,100 --> 00:15:23,173
So again, just to keep the number
of comparisons the same,

199
00:15:23,173 --> 00:15:27,535
remember that if we throw
down h different lines and

200
00:15:27,535 --> 00:15:32,808
we search the query bin, and
all the bins that are one bit off,

201
00:15:32,808 --> 00:15:36,479
we're going to end up
searching h + 1 bins.

202
00:15:36,479 --> 00:15:41,968
Well now, when we're thinking
about multiple hash tables again

203
00:15:41,968 --> 00:15:47,752
to make it as much apples and
apples type of comparison that we can do,

204
00:15:47,752 --> 00:15:53,270
we're going to think about defining
h + 1 different hash tables.

205
00:15:53,270 --> 00:15:58,553
And then just searching the query bin
in each of these different tables.

206
00:15:58,553 --> 00:16:03,882
Okay and so then what we're going to
calculate is what's the probability that

207
00:16:03,882 --> 00:16:09,920
my nearest neighbor is in a different
bin in all h+1 of these hash tables.

208
00:16:09,920 --> 00:16:15,970
Well, it's 1 minus the probability
that the query point and

209
00:16:15,970 --> 00:16:22,800
the nearest neighbor are in the same
bin for a given hash table but

210
00:16:22,800 --> 00:16:28,460
then we have that to the h+1 because we're
looking at h+1 different hash tables.

211
00:16:28,460 --> 00:16:33,279
So it's 1 minus
the probability that there's

212
00:16:33,279 --> 00:16:38,096
no split line and
repeating that h+1 times and

213
00:16:38,096 --> 00:16:42,710
so that's 1- delta to the h, to the h+1.

214
00:16:42,710 --> 00:16:48,510
Okay just to be very clear

215
00:16:48,510 --> 00:16:57,075
this is probability of no split in any

216
00:17:00,794 --> 00:17:05,187
I'm sorry, split from any of the h

217
00:17:05,187 --> 00:17:09,892
different lines that we threw down.

218
00:17:09,892 --> 00:17:14,361
So 1 minus that is
the probability that they fall

219
00:17:14,361 --> 00:17:18,503
into separate bins rather
than the same bin.

220
00:17:18,503 --> 00:17:21,430
And we need this to hold for
all hash tables.

221
00:17:25,430 --> 00:17:27,880
I want to say this one other way in case

222
00:17:29,160 --> 00:17:30,900
the way I was presenting
that wasn't clear.

223
00:17:32,620 --> 00:17:37,649
So in particular,
what we're computing here per hash table

224
00:17:37,649 --> 00:17:42,776
is what's the probability that 1 or
2 or 3 or all the way up to

225
00:17:42,776 --> 00:17:48,416
h different lines split are query
point from its nearest neighbor.

226
00:17:48,416 --> 00:17:53,347
And one simple way of calculating that is
just 1 minus the complement of that event

227
00:17:53,347 --> 00:17:56,348
which is the probability
that there are no lines,

228
00:17:56,348 --> 00:18:00,013
none of the h lines that we throw
down split these two points.

229
00:18:00,013 --> 00:18:04,582
Okay, and so that's what we have to
calculate per hash table and we just

230
00:18:04,582 --> 00:18:09,992
multiply those together to get the overall
probability that this doesn't happen,

231
00:18:09,992 --> 00:18:15,930
or rather, that there is a split of points
in each one of our different tables.

232
00:18:15,930 --> 00:18:20,580
Okay, so now let's summarize in
this slightly more generic setting.

233
00:18:20,580 --> 00:18:23,790
Where we're going to assume that
we're throwing down h lines.

234
00:18:26,500 --> 00:18:29,870
And in the first case we're
going to search h+1 bins and

235
00:18:29,870 --> 00:18:33,000
the probability of not finding
a nearest neighbor is written here.

236
00:18:33,000 --> 00:18:35,650
I'm not going to repeat that equation.

237
00:18:35,650 --> 00:18:41,370
And in the case of boltable hash tables in
particular are looking at h+1 different

238
00:18:41,370 --> 00:18:45,330
hash tables, but always searching just
the bin where the query point falls in

239
00:18:45,330 --> 00:18:48,720
where you have this probability of
not finding the nearest neighbor.

240
00:18:48,720 --> 00:18:51,590
Again, not very straightforward to
think about comparing these two

241
00:18:51,590 --> 00:18:52,320
different equations.

242
00:18:52,320 --> 00:18:54,280
So let's look at some plots.

243
00:18:54,280 --> 00:18:57,170
And we're going to look at plots
in two different scenarios.

244
00:18:57,170 --> 00:19:02,880
One is for throwing down three lines, the
other one is for throwing down ten lines.

245
00:19:02,880 --> 00:19:06,960
And, what we see is here, this top case is

246
00:19:06,960 --> 00:19:12,400
the one hash table scenario.

247
00:19:12,400 --> 00:19:17,506
And this future line is that

248
00:19:17,506 --> 00:19:23,507
multiple hash table scenario.

249
00:19:23,507 --> 00:19:29,835
And what we see in these plots is that now
there's a much larger difference between

250
00:19:29,835 --> 00:19:35,900
the benefits that we have using multiple
hash tables over a single hash table.

251
00:19:35,900 --> 00:19:40,396
And that's especially true as we
increase the number of lines that we

252
00:19:40,396 --> 00:19:41,720
are throwing down.

253
00:19:42,805 --> 00:19:47,110
And I want to mention that the set up
that I created here was really for

254
00:19:47,110 --> 00:19:51,750
the sake of comparing looking at one hash
table versus multiple hash tables but

255
00:19:51,750 --> 00:19:56,770
in general the way people really use
this in practice is you don't have to

256
00:19:56,770 --> 00:20:01,920
create exactly as many hash
tables as lines thrown down.

257
00:20:01,920 --> 00:20:03,710
You can actually create many,

258
00:20:03,710 --> 00:20:07,760
many more hash tables than
the number of lines you throw down.

259
00:20:07,760 --> 00:20:12,843
So in particular, you can think of
keeping the number of bits fixed,

260
00:20:12,843 --> 00:20:16,173
the number of lines you throw down fixed,
and

261
00:20:16,173 --> 00:20:19,964
think about increasing
the number of hash tables.

262
00:20:19,964 --> 00:20:25,168
So we're going to call m,
the number of hash tables.

263
00:20:25,168 --> 00:20:30,941
And h, here is the number of lines or

264
00:20:30,941 --> 00:20:33,640
number of bits.

265
00:20:35,650 --> 00:20:39,830
And so people sometimes refer to this
as the width and the depth of a table.

266
00:20:39,830 --> 00:20:41,515
So the width being the number of bits.

267
00:20:41,515 --> 00:20:45,300
The depth being the number of hash
tables that we're looking at.

268
00:20:45,300 --> 00:20:48,200
And what you can show
is that the probability

269
00:20:48,200 --> 00:20:50,820
of not finding the nearest neighbor.

270
00:20:50,820 --> 00:20:54,560
So, the probability that the query point
in the nearest neighbor in different bins

271
00:20:54,560 --> 00:21:00,980
in all of these m different hash
tables falls off exponentially fast.

272
00:21:00,980 --> 00:21:05,150
So in particular, the probability
calculation is, exactly what we went

273
00:21:05,150 --> 00:21:10,150
through before but instead of
having h+1 different hash tables,

274
00:21:10,150 --> 00:21:13,110
here we're looking at m+1.

275
00:21:13,110 --> 00:21:14,140
Or I guess, sorry.

276
00:21:14,140 --> 00:21:17,450
This really should just be n, not n+1,

277
00:21:17,450 --> 00:21:22,390
because we're not looking at
that specific contrived example.

278
00:21:22,390 --> 00:21:27,610
But the point still holds that,
as you increase the number of hash tables

279
00:21:27,610 --> 00:21:32,070
because you're raising some probability,
so some number less than 1, to higher and

280
00:21:32,070 --> 00:21:35,600
higher powers, you're getting lower and
lower probability at

281
00:21:35,600 --> 00:21:39,960
this exponentially fast rate of
not finding our nearest neighbor.

282
00:21:41,090 --> 00:21:44,520
And typically in many different
setups we're going to consider,

283
00:21:44,520 --> 00:21:51,160
you can show that the probability of
finding a nearest neighbor after searching

284
00:21:51,160 --> 00:21:55,880
one bin in m different hash tables,
so again in total m different bins,

285
00:21:55,880 --> 00:22:02,590
is higher than if we were to search
m bins in a single hash table.

286
00:22:02,590 --> 00:22:06,340
Okay, so to summarize what we've
shown in this optional video,

287
00:22:06,340 --> 00:22:11,230
is that when we think about searching
multiple bins within a single hash table,

288
00:22:12,660 --> 00:22:15,180
the cost of binning the points is lower.

289
00:22:15,180 --> 00:22:17,780
You only have to form one hash table.

290
00:22:17,780 --> 00:22:22,710
But typically you tend to have to search
more bins to have the same quality of

291
00:22:22,710 --> 00:22:29,100
approximation as if you performed
multiple binnings of points.

292
00:22:29,100 --> 00:22:33,810
So, look at multiple hash
tables which cost more.

293
00:22:33,810 --> 00:22:37,130
You have to create all of
these different hash tables.

294
00:22:37,130 --> 00:22:40,482
But tends to provide
a better approximation for

295
00:22:40,482 --> 00:22:42,669
our nearest neighbor search.

296
00:22:42,669 --> 00:22:46,870
[MUSIC]