Okay, welcome to Unit 4B. Before moving on from database technology to learning, we're going to cover one more emerging aspect that goes beyond map-reduce. In particular, we're going to talk about graph-based parallel computing using the Pregel model.

If you recall, our guest lecturer talked about various types of important graph-based problems, like computing shortest paths, Steiner trees, and things like that. Essentially, for many of these problems, which are quite important in emerging applications, the map-reduce model has some limitations. We discussed some of these in the last lecture, and we'll revisit them as we go along.

Let's remember map-reduce one more time. We have mappers, which read the data in arbitrary order and decide, based on your map function, which key to sort each piece of data on. The reducers then get the data sorted by the key you chose in the map function. The platform does the sorting, and the reducers process the data for every key in parallel and then output the results. That's the map-reduce model.

So let's revisit some of the challenges, and potential solutions, that we looked at a while back. Many applications require repeated map-reduce; PageRank is one example we'll talk about a little later. There are several ways of trying to optimize map-reduce when it has to be repeated many times. One is to make it more efficient by avoiding copying data between successive reduce and map phases. A second is to generalize the model itself, allowing arbitrary combinations of map and reduce with pipelining of data between them. A third is a direct implementation of recursion in a map-reduce-like environment; the graph model is one such approach, and there are many examples of it, which we'll talk about very briefly. There is also another model, the stream model, which is being developed at Stanford.

So let's delve into the Pregel model with an example, looking at how we might handle it in map-reduce as well as in an alternative graph-based model.

The problem we consider is matrix-vector multiplication.
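Before working through that example, here is a minimal, in-memory sketch of the bare map-reduce contract just recapped; the two matrix-vector jobs described below are simply two instantiations of this pattern. This is a toy Python simulation, not Hadoop: the function names, and the word-count example used to exercise it, are my own illustration rather than anything from the lecture.

```python
from collections import defaultdict

def run_mapreduce(records, map_fn, reduce_fn):
    """Toy in-memory map-reduce: map each record, group by key, reduce per key."""
    groups = defaultdict(list)
    for record in records:
        for key, value in map_fn(record):          # map phase: emit (key, value) pairs
            groups[key].append(value)              # stand-in for the platform's shuffle/sort by key
    output = []
    for key in sorted(groups):                     # reduce phase: one call per key,
        output.extend(reduce_fn(key, groups[key])) # run in parallel on a real cluster
    return output

# Word count as the usual illustration of the contract (not from this lecture).
def count_map(line):
    for word in line.split():
        yield word, 1

def count_reduce(word, counts):
    yield word, sum(counts)

print(run_mapreduce(["a b a", "b c"], count_map, count_reduce))
# -> [('a', 2), ('b', 2), ('c', 1)]
```

The point is only the division of labour: the map function chooses the key, the platform groups and sorts by that key, and the reduce function sees each key's values independently, which is what allows the per-key work to run in parallel.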
Now, for those of you who've forgotten your linear algebra: a matrix looks something like this, where a matrix is a two-dimensional array and a vector is its one-dimensional counterpart. When you do matrix-vector multiplication, what you're really doing is taking a row, multiplying every element of that row by the corresponding element of the vector, and then summing everything up. So the first element of this product will be minus 1 times 1, plus 2, which is 1. Similarly, the second element will be 1 times 1, minus 2, plus 3, which is 2; the third element will be minus 1 times 2, plus 3, which is 1; and the last will be minus 1 times 3, minus 4, which is minus 1.

Now let's see how one might implement this in map-reduce. Assuming we have a very large matrix and a very large vector, we would like to do this in parallel.

In map-reduce we would possibly do it like this. Each element a_ij (and there are a whole bunch of elements, in any order, in various different files) has to somehow go to some mapper. The mappers then sort these by column of A and by row of x. That means a mapper will emit the key j for a_ij: if it reads an element a_ij, it emits j as the key, sending it to the j-th reducer, and for any element x_j of x, it likewise sends it to the j-th reducer.

In practice each real reducer handles a bunch of j's, so each reducer will get a bunch of columns of A and the matching rows of x. When reducer j gets all its elements, that is, the values a_ij for every i along with x_j, it computes the partial products a_ij times x_j for every i. In other words, if j equals 1, the j-th reducer will compute minus 1 times 1, 1 times 1, 0 times 1, and so on: everything in this column multiplied by the first element of x.
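As a concrete sketch of that first map-reduce job, here is a toy Python version. Everything specific in it is an assumption of mine rather than something from the lecture slides: the input encoding ('A', i, j, a_ij) and ('x', j, x_j), the tiny 2-by-2 example matrix, the 0-based indices, and the function names; the shuffle is simulated with an in-memory dictionary instead of a real cluster.

```python
from collections import defaultdict

# Hypothetical encoding of the inputs, scattered across files in any order:
# matrix entries as ('A', i, j, a_ij), vector entries as ('x', j, x_j).
records = [('A', 0, 0, -1), ('A', 0, 1, 2),
           ('A', 1, 0, 1),  ('A', 1, 1, -2),
           ('x', 0, 1),     ('x', 1, 1)]    # toy A = [[-1, 2], [1, -2]], x = [1, 1]

def map_phase1(record):
    """Key every a_ij and every x_j by the column/row index j."""
    if record[0] == 'A':
        _, i, j, a_ij = record
        yield j, ('A', (i, a_ij))
    else:
        _, j, x_j = record
        yield j, ('x', x_j)

def reduce_phase1(j, values):
    """Reducer j holds column j of A plus x_j; emit partial products a_ij * x_j keyed by row i."""
    x_j = next(v for tag, v in values if tag == 'x')
    for tag, v in values:
        if tag == 'A':
            i, a_ij = v
            yield i, a_ij * x_j

# Simulated shuffle: group the mapper output by key j, then run each reducer.
groups = defaultdict(list)
for record in records:
    for key, value in map_phase1(record):
        groups[key].append(value)

partials = [p for j in groups for p in reduce_phase1(j, groups[j])]
print(partials)   # -> [(0, -1), (1, 1), (0, 2), (1, -2)]: pairs (row i, a_ij * x_j)
```

Each reducer ends up holding one column of A together with the matching element of x, and emits its partial products keyed by the row index i, which is exactly what the second job needs.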
Similarly, reducer number two will compute all the multiplications of the second column of A with the second element of x, and so on.

Now, that's not enough, because we still have to add these partial products up. So the second map-reduce job does the following. After the first job, each reducer holds a bunch of columns of partial products, each column having been multiplied by the appropriate element of x. These are now redistributed by rows: the partial product y'_ij, the i-th entry of reducer j's partial result vector y'_j, gets sent to reducer i. So, for example, the first two rows might go to one reducer and the next two rows to another reducer, and the partial products get added up there in parallel.

Essentially, if you think about this carefully, what we've done is write the matrix-vector product as a block multiplication. You can write A as a set of column blocks A1, A2, up to AP, where P is the number of processors we have; A1 could be an individual column or a bunch of columns. Similarly, x1 could be an individual element of x or a bunch of elements, that is, rows of x. Obviously, with P processors you would use the corresponding bunches of columns of A and rows of x. Then you do the same multiplication at the block level: Ax becomes A1 x1 plus A2 x2 and so on, which is just the sum of these partial products. The partial product Ap xp is exactly what reducer p computes in the first phase of map-reduce, and in the second phase they all get added up.
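Continuing the same toy sketch (again, the names, the 0-based indices, and the tiny example are my own assumptions, not the lecture's), the second job regroups the phase-1 partial products by row i and sums them, which is precisely forming the block sum A1 x1 + A2 x2 + ... at the level of individual entries.

```python
from collections import defaultdict

# Output of the hypothetical phase-1 sketch: (row index i, partial product a_ij * x_j).
partials = [(0, -1), (1, 1), (0, 2), (1, -2)]

def map_phase2(pair):
    """Partial products are already keyed by row i, so the map is just the identity."""
    i, prod = pair
    yield i, prod

def reduce_phase2(i, products):
    """Reducer i adds up all partial products for row i, giving entry y_i of y = A x."""
    yield i, sum(products)

groups = defaultdict(list)                  # simulated shuffle, now keyed by row i
for pair in partials:
    for key, value in map_phase2(pair):
        groups[key].append(value)

y = [total for i in sorted(groups) for _, total in reduce_phase2(i, groups[i])]
print(y)   # -> [1, -1], i.e. y = A x for the toy A = [[-1, 2], [1, -2]], x = [1, 1]
```

On a real cluster each reducer i would sum only the partial products for its own rows, so the additions, like the multiplications, happen in parallel.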
Study this carefully, because it's important to understand how matrix-vector multiplications are done using map-reduce. This is just one way; there are other, more efficient ways of doing matrix-vector products with map-reduce. But now the question is how we would repeat this multiple times, if we had to do so. And we actually do have to do so in applications like computing PageRank. But there are problems.