So here's the Pregel model. It's an internal project which is used within Google. It distributes vertices to processors or nodes in this diagram, initialization you partition the vertices across the processors and then vertices are given edges. So then they read edges and then edges are given to the appropriate vertex, so vertex will get all the edges which outgo from that vertex. And each processor will have a whole bunch of vertices. And then vertices will exchange messages as per the Pregel program, that is written by [UNKNOWN] application. The nodes will some messages will be exchanged within a node itself when the destination vertex is within that node and others will have to travel between nodes. When the destination vertex is actually on another processor. This coordination of messages precedes in what I call super steps, we will come to that in the later chart. So, after all nodes have sort of delivered and received one round of messages. The master ensures that there's a synchronization across all processors. And then the next step proceeds. So everybody's sends messages and then the messages get delivered. Then again it does some computation. There's, everybody sends messages, messages get delivered and so on and so forth until there are no more messages getting delivered. We'll, we'll see examples of algorithms implemented using this Pregel model in, in a, in a bit. There are different implementations, Pregel itself is internal to Google as we mentioned just a minute ago. Giraph is an open source implementation of the Pregel model built on top of Hadoop, but there's, they're essentially only mappers and not reducers and all the message parsing communication happens through remote procedure calls. It, like Pregel itself, Giraph distributes vertices across nodes. GPS is another, implementation of the, of the graph model, based on the Pregel paradigm. It also distributes vertices at, across processors, but it also takes care of appropriately repartitioning the vertices between processors to make sure there is less communication, between processors. So this is an important additional, feature that GPS does, for you whereas in Giraffe you have to do anything like that yourself. GraphLab is yet another, graph processing model. very similar, from the programming perspective, to Pregel, in terms of every vertex is a program, but, in here, edges are distributed across nodes. So the information about a single vertex is actually distributed across processors. this is quite an interesting model. Obviously the architecture is very different from, from this one, because you're not distributing nodes, to vertices to nodes, but edges. We won't go to the great details of the implementation architectures, but will look at a few applications of the Pregel model. to some interesting problems. So, here is the Pregel programming model. Just like we had the map produce programming model where you had to write a map function and a reduced function, here we have to write code for every vertex. Each vertex executes code once the graph is noded and the assumption is that the process of loading the graph makes sure that each vertex knows it's outgoing edges. It may not have any idea of what it's incoming edge is, but it knows its own outgoing edges. Vertices are distributed across processors, so each processor executes the code for all its vertices. So, it doesn't update for every vertex that it, has. in what is called a superstep. After the superstep, is completed, a second superstep starts. And so on until the program halts, as we shall see in a minute. So each vertex first receives messages from all its in-neighbors. So, who where is sending those messages on its incoming edges, that's some computation using those, these messages, and then decides to send some messages to its out-neighbors, possibly. After that, it decides whether or not to halt. Or it means, what a halt means is that a vertex stays inactive once it's halted. Unless it receives a message from some other vertex in the future. If this does not happen, and all vertices are halted, then the computation halts. So the master, as you mentioned in the previous, chart, is responsible for making sure that the messages are sent and received from vertices, as well as check it's, where there are not, all the vertices are in halt state. If they are, then it ends the computation. So let's do a couple of, more examples using the Pregel programming model to solve a few important problems. A simple one is called max-node. We're essentially given values across all the vertices, we would like to find out the maximum of these and make sure that every vertex at the end has the maximum value across all the vertices in the graph. So this is a simple program. You send your value along your outgoing edges and then start receiving values from incoming edges. For each incoming message, if you get a value which is greater than your value then you replace yourself with that value. And and continue and don't, don't halt but if, if you never got, if you did not get any message which was greater than your value then you halt. Let's see what happens if this is executed for every node. In the first step what happens is that these are bi directional edges, which are mentioned over here. So this means there's an edge from three to six as well as from six to three. In the first step three comes to six, six doesn't do anything be because it's already the maximum and it's never going to get a message greater than itself. But three becomes a six, right? Similarly two gets a message from one. but it doesn't change because one is less than two. But one gets six, so it changes. So, six and two did not change. And therefore they halt. So they're shown as shaded. So it sends so one sends its value to six, six does nothing with it six sends its value to two, which now wakes up because it, it was sleeping, it was halted, but it got an incoming message so it wakes up and this super, next super step and checks that, yes it did get a new value. So it becomes six and then it sends it's value out to it's outgoing edges which doesn't do anything over there. So at the end, all the vertices are still, are halted, the computation ends, and as you can see, all the vertices have the maximum value of six.