So let's see what happens when you try to iterate MapReduce-based matrix-vector multiplication. The first time, of course, you have to read the matrix A and the vector x, in parallel, across all the mappers; then, after the first MapReduce phase, you write the intermediate result y'. Then you have to read those y' values, and at the end you write the y values, all in parallel, because every reducer computes a few rows of the final y. But now you once again have to read the matrix A, together with the y that you wrote in the previous iteration: write y', write y, and keep reading A and y again and again.

So there are two places where you are reading things repeatedly. One is where you write y' and then read y' back again, which you can potentially avoid by pipelining, using techniques like HaLoop and other implementations that pipeline the writes of one reducer into the reads of the next mapper in the next phase. But we still have the problem of having to read A again.

Notice that MapReduce has no memory. Its basic assumption is that the data is just too large, that it will never fit in memory, so you essentially have to read A again and again. But suppose it did fit in memory. MapReduce still has no memory, so it cannot really remember A between iterations; the mappers don't know what they are going to get next time, and the reads arrive in a random order every time. So MapReduce iterations are inherently inefficient in this regard: if A is a really huge matrix, with lots and lots of entries, it is just going to be read again and again. This is definitely a problem, and there is a way around it, which essentially leads us to Pregel.
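To make that cost concrete, here is a minimal sketch of the iterated computation y = A·x in a MapReduce style. The data layout and function names (map_phase, reduce_phase, iterate) are illustrative assumptions for this sketch, not the API of any real MapReduce framework.

```python
# Minimal sketch: iterated y = A*x in MapReduce style.
# Assumes A is stored as (i, j, a_ij) triples and x as a dict {j: x_j};
# names are illustrative, not a real framework's API.
from collections import defaultdict

def map_phase(A_entries, x):
    # Every mapper re-reads its share of A (and the current x) on every
    # iteration, emitting (row, partial product) pairs.
    for i, j, a_ij in A_entries:
        yield i, a_ij * x[j]

def reduce_phase(pairs):
    # Each reducer sums the partial products for the rows assigned to it.
    y = defaultdict(float)
    for i, partial in pairs:
        y[i] += partial
    return dict(y)

def iterate(A_entries, x, num_iterations):
    # The key point: A_entries is scanned from scratch in every iteration,
    # because MapReduce keeps no state between jobs.
    for _ in range(num_iterations):
        x = reduce_phase(map_phase(A_entries, x))
    return x
```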
Let's look at the same matrix-vector operation now as a graph operation. What do I mean by that? Think of the matrix as defining edges in a graph. For example, the first column tells you that the first vertex has an edge leaving it and going to the second vertex, and similarly there is an edge going from the second vertex to the first vertex. So read each column as the edges that go out of, say, the first vertex, and read each row as the edges that come into the first vertex, for example. The minus one here is just about the vertex itself: the diagonal entries are all minus one. And so for vertex one there is an edge coming in from two, and an edge going out of one.

Now let's take a look at vertex three. It has two edges going out: one to four, which is this entry, and one to two, which is this entry. And it has one edge coming in from four, which is this edge over here. You can verify that this particular matrix, which is called the adjacency matrix of the graph, represents all the edges in this graph. The diagonal entries represent just the vertices themselves, and we treat them as minus one, for reasons we won't go into in this course; the off-diagonal entries represent the edges in the graph.

Now, what about the vector x itself? Well, we assume each vertex holds a value, which is nothing but the element of the x vector in the position corresponding to that vertex. So the vector (1, 2, 3, 4) is distributed across the vertices of the graph as just 1, 2, 3, 4, one element per vertex.

Now let's think of the matrix-vector multiplication as being done in parallel by each vertex, in the following way. Each vertex sends its value out along all of its outgoing edges; so two sends its value to these two vertices, one sends its value to this one, and so on. Then all the vertices receive the values that were sent in this step, each one receiving values from all of its neighbors on incoming edges. Finally, each vertex replaces its value by the sum of everything it received minus the value it originally had, which, as you can easily see, performs exactly the matrix-vector multiplication we started with.

Now, the interesting part of this is, first, that it is very parallel, in the sense that as long as you have many processors, and each processor holds a bunch of vertices, you can really do a lot of this operation in parallel.
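Here is a minimal sketch of one such vertex-centric step, under the assumption stated above that the adjacency matrix has -1 on every diagonal entry and 1 for each edge j → i. The function and variable names (superstep, values, out_edges) are illustrative, not Pregel's actual API.

```python
# One vertex-centric "superstep" for the graph view of y = A*x.
# Assumes a_ii = -1 and a_ij = 1 whenever there is an edge j -> i;
# names are illustrative, not Pregel's actual API.
def superstep(values, out_edges):
    # values:    {vertex: current value x_v}
    # out_edges: {vertex: list of destination vertices}
    inbox = {v: 0.0 for v in values}
    # Each vertex sends its value along all of its outgoing edges.
    for v, x_v in values.items():
        for dest in out_edges.get(v, []):
            inbox[dest] += x_v
    # Each vertex replaces its value with the sum of what it received
    # minus its own value -- exactly (A*x)_v for this matrix.
    return {v: inbox[v] - values[v] for v in values}
```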
The only caveat is that you have to exchange messages, so you try to place the vertices so that they don't have to exchange messages across different processors too often, and that's a challenge; we'll come to it in a short while. But the really important part is that it is easy to iterate. Once the graph structure has been loaded into all the processors, so that every vertex knows its outgoing edges, you don't need to load the matrix structure again. You can just keep iterating, again and again, repeating the matrix-vector product, as is often required in many applications. Unlike MapReduce, where A has to be loaded continuously in every iteration, here A is loaded once, since A is nothing more than each vertex and its connections to its neighbors.

The caveat, of course, is that if the number of edges is so huge that you cannot load it even using all the memory of all the processors you have, then you can't do this. So the caveat for Pregel, and this is in fact the Pregel model, which we will come to formally in a minute, is that you need enough memory to load the entire graph in order for this to work.
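As a rough illustration of this contrast, here is a possible driver loop reusing the superstep function sketched above, on a small hypothetical graph (not the one from the lecture's slides): the graph structure is built once and stays resident in memory, and only the per-vertex values change across iterations.

```python
# Example usage of the superstep sketch above on a hypothetical graph.
out_edges = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
values = {1: 1.0, 2: 2.0, 3: 3.0, 4: 4.0}   # the vector x, one entry per vertex

for _ in range(3):                          # three matrix-vector products
    values = superstep(values, out_edges)   # graph structure is never re-read
print(values)
```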