1 00:00:02,120 --> 00:00:08,462 To go into some detail, let's take a look at how one might do locality sensitive 2 00:00:08,462 --> 00:00:13,220 hashing for fingerprint matching. This example is covered in 3 00:00:13,220 --> 00:00:17,660 Rajaraman/Ullman so you can read about it there as well. 4 00:00:20,580 --> 00:00:24,519 Fingerprints match if their minutiae match. 5 00:00:24,519 --> 00:00:32,030 Now, minutiae is a technical term in the world of fingerprints, which I don't claim 6 00:00:32,030 --> 00:00:37,619 to understand in great detail. But things like islands, ridges, 7 00:00:37,619 --> 00:00:43,940 bifurcations, cores, crossovers, these are the terms used in that field. 8 00:00:44,240 --> 00:00:51,843 We won't get into the details to the different types of minutiae, instead we 9 00:00:51,843 --> 00:01:00,146 will only worry about whether a particular print has some minutiae in a particular 10 00:01:00,146 --> 00:01:07,850 grid point, So we assume grid of cells to be placed on the fingerprint and. 11 00:01:07,850 --> 00:01:14,225 We only measuring whether or not. A cell contains, a minutia. 12 00:01:14,225 --> 00:01:19,434 Of some kind. Now, we define a function f(x) for a 13 00:01:19,434 --> 00:01:28,195 print'x', which is one if the print has minutia in a specified set of'k' mid 14 00:01:28,195 --> 00:01:33,036 positions. So function'f' depends on these 15 00:01:33,036 --> 00:01:40,183 particular'k' positions. We'll use'k' is equal to three in the 16 00:01:40,183 --> 00:01:46,350 example going forward. Notice that the function f is dependent on 17 00:01:46,350 --> 00:01:51,360 the choice of the k-grid positions. If you choose a different set of grid 18 00:01:51,360 --> 00:01:54,380 positions, you'll get a different function f. 19 00:01:55,660 --> 00:02:00,740 Now let's consider the probability that any print. 20 00:02:01,100 --> 00:02:07,007 Has a probab, has a menu shay, in a particular position, let that be P. 21 00:02:07,007 --> 00:02:13,871 Not every print has menu shay in every position, but across all possible prints. 22 00:02:13,871 --> 00:02:18,649 A particular position has menu shay with probability P. 23 00:02:18,649 --> 00:02:24,731 Lets assume that the distribution is uniform and the probability is P. 24 00:02:24,731 --> 00:02:30,900 Not necessarily a great assumption but for our example, it will suffice. 25 00:02:33,180 --> 00:02:42,141 Now, the probability that f of x is equal to one, that is that the print x has 26 00:02:42,141 --> 00:02:49,940 minutiae in all k positions. Is p raised to the power of k. 27 00:02:51,360 --> 00:03:00,123 So if you take the probability that a particular cell has, menu shay is 0.2, the 28 00:03:00,123 --> 00:03:08,998 probability of that, a set of three chosen cells, all have menu shay, is obviously 29 00:03:08,998 --> 00:03:13,880 0.2 raise to the power three which is 0.008. 30 00:03:14,140 --> 00:03:19,800 Now let's consider another print, y, but from the same person. 31 00:03:20,740 --> 00:03:27,562 Its quiet likely that this print will have when you say in the same position as X, 32 00:03:27,562 --> 00:03:33,385 but its not always the case. So lets assign a probability Q, it will be 33 00:03:33,385 --> 00:03:40,208 quiet high that the print Y will have when you say, if X also does, in a particular 34 00:03:40,208 --> 00:03:46,440 grid cell. Now let's look at the function F. 35 00:03:47,360 --> 00:03:53,952 What is the probability that F of X is one and F of Y is also one? 36 00:03:53,952 --> 00:04:02,042 Well, first of all the probability that F of X is one is P to the K, so that needs 37 00:04:02,042 --> 00:04:03,940 to happen. But then. 38 00:04:04,860 --> 00:04:12,502 You have the probability that. Y also has to have menu shay in the same K 39 00:04:12,502 --> 00:04:19,369 position, so you multiply it by Q, K times, so you get P Q to the K. 40 00:04:19,369 --> 00:04:26,870 Q is 0.9 that means there is 90 percent chance that Y will have menu shay if X 41 00:04:26,870 --> 00:04:34,160 also does in a particular cell, then P Q to the three, works out 2.006. 42 00:04:34,700 --> 00:04:45,214 Well, that's not so nice that both x and y get a yes match with the function f is 43 00:04:45,214 --> 00:04:51,080 only. Happening with the probability.006.