So let's see what happens when you try to iterate MapReduce-based matrix-vector multiplication. The first time, of course, you have to read the matrix A and the vector x in parallel across all the mappers, and after the first MapReduce phase you write the intermediate result y'. Then you read the y' values back, and at the end you write the y's, all in parallel, because every reducer computes a few rows of the final y. But for the next iteration you have to read the matrix A once again, together with the y you wrote in the previous iteration, write y', write y, and keep reading A and y over and over. So there are two places where you are reading things again and again. One is where you write y' and then immediately read y' back; that you can potentially avoid by pipelining, using techniques like HaLoop and similar implementations, which pipeline the writes of one phase's reducers into the reads of the next phase's mappers. But we still have the problem of having to read A again. Notice that MapReduce has no memory: the basic assumption is that the data is just too large to ever fit in memory, so you essentially have to read A again and again. But suppose A did fit in memory. MapReduce still has no memory, so it cannot remember A between iterations; the mappers don't know which pieces of the input they will get next time, since the reads arrive in effectively random order on every pass. So MapReduce iterations are inherently inefficient in this regard: if A is a really huge matrix, with lots and lots of entries, it simply has to be read again and again. This is definitely a problem, and there is a way around it, which essentially leads us to the graph-based, Pregel-style model.

Let's look at the same matrix-vector operation as a graph operation. What do I mean by that? Think of the matrix as defining edges in a graph. For example, the first column says that the first vertex has an edge leaving it and going to the second vertex, and similarly there is an edge going from the second vertex back to the first. So read each column as the edges that go out of a vertex, and read each row as the edges that come into that vertex. Here the minus one just stands for the vertex itself: the diagonal entries are all minus one, for reasons we won't go into in this course. For the first vertex, then, there is an edge coming in from vertex two and an edge going out to vertex two. Now take a look at vertex three: it has two edges going out, one to vertex four and one to vertex two, and it has one edge coming in from vertex four. You can verify that this particular matrix, which is called the adjacency matrix of the graph, represents all the edges in the graph: the diagonal stands for the vertices themselves, and the off-diagonal entries represent the edges.

Now what about the vector x itself? We assume each vertex holds a value, which is nothing but the element of x in the position corresponding to that vertex. So the vector (1, 2, 3, 4) is distributed across the vertices of the graph as just 1, 2, 3, 4, one element per vertex.
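To make the correspondence concrete, here is a small sketch in Python (my own illustration, not from the lecture), assuming the graph has exactly the edges described above: 1→2, 2→1, 3→2, 3→4, and 4→3. It builds the adjacency matrix, with -1 on the diagonal and a 1 in row i, column j for every edge j→i, and distributes the vector x = (1, 2, 3, 4) one value per vertex.

```python
import numpy as np

# Hypothetical 4-vertex example, using only the edges mentioned above:
# 1->2, 2->1, 3->2, 3->4, 4->3 (vertices numbered 1..4).
out_edges = {1: [2], 2: [1], 3: [2, 4], 4: [3]}

# Column j of A holds the edges leaving vertex j, row i the edges entering
# vertex i, and every diagonal entry is -1.
n = len(out_edges)
A = -np.eye(n)
for src, dests in out_edges.items():
    for dst in dests:
        A[dst - 1, src - 1] = 1

# The vector x = (1, 2, 3, 4) is distributed one element per vertex.
value = {v: float(v) for v in range(1, n + 1)}
x = np.array([value[v] for v in range(1, n + 1)])

print(A)        # the adjacency matrix with -1 on the diagonal
print(A @ x)    # the matrix-vector product we want to reproduce on the graph
```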
Now let's think of the matrix-vector multiplication as being done in parallel by each vertex, in the following way. Each vertex sends its value out along all of its outgoing edges; vertex three, for example, sends its value to vertex two and to vertex four, vertex one sends its value to vertex two, and so on. Then every vertex receives the values that were sent in this step from its neighbors on incoming edges, and finally it replaces its own value by the sum of everything it received minus the value it originally had. As you can easily see, this performs exactly the matrix-vector multiplication we had before.

Now, the interesting part is that this is very parallel: as long as you have many processors, and each processor holds a bunch of vertices, you can do a lot of this operation in parallel. The only caveat is that you have to exchange messages, so you try to place the vertices so that they don't have to exchange messages between different processors too often; that's a challenge, and we'll come to it in a short while. But the really important part is that it's easy to iterate. Once the graph structure has been loaded into all the processors, so that every vertex knows its outgoing edges, you don't need to load this matrix structure again. You can just keep iterating, repeating the matrix-vector product again and again, as is often required in many applications. Unlike MapReduce, where you have to read A on every iteration, here A is loaded once, since A is nothing more than each vertex and its connections to its neighbors.

The caveat, of course, is that if the number of edges is so huge that you cannot load the graph even using all the memory of all the processors you have, then you can't do this. This is actually the Pregel model, which we will come to formally in a minute, and its caveat is that you need enough memory to load the entire graph for this to work.
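Here is a minimal, single-process sketch of that vertex-centric iteration, again assuming the hypothetical edge list used above; the superstep function and the fixed three iterations are just for illustration, and a real Pregel-style system would partition the vertices across workers and exchange the messages over the network. The point to notice is that the graph structure is built once and only the per-vertex values change between supersteps; the check at the end simply repeats the matrix-vector product with the adjacency matrix.

```python
import numpy as np

# Same hypothetical 4-vertex graph as above (edges 1->2, 2->1, 3->2, 3->4, 4->3).
out_edges = {1: [2], 2: [1], 3: [2, 4], 4: [3]}
value = {v: float(v) for v in out_edges}          # x = (1, 2, 3, 4), one entry per vertex

def superstep(out_edges, value):
    """One vertex-centric step: send your value on every outgoing edge, then
    replace your value by (sum of incoming messages) - (your old value)."""
    inbox = {v: 0.0 for v in out_edges}
    for src, dests in out_edges.items():          # "send" phase
        for dst in dests:
            inbox[dst] += value[src]
    return {v: inbox[v] - value[v] for v in out_edges}   # "receive and update" phase

# Iterate: the graph structure (out_edges) is loaded once and stays in memory;
# only the per-vertex values change from superstep to superstep.
x = value
for _ in range(3):
    x = superstep(out_edges, x)

# Check against three repeated matrix-vector products y = A(A(Ax)).
A = -np.eye(4)
for src, dests in out_edges.items():
    for dst in dests:
        A[dst - 1, src - 1] = 1
y = np.array([1.0, 2.0, 3.0, 4.0])
for _ in range(3):
    y = A @ y
print([x[v] for v in sorted(x)], y)   # the two results should agree
```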