Okay, welcome to Unit 4B. Before moving on from database technology to learning, we're going to cover one more emerging aspect that goes beyond map-reduce. In particular, we're going to talk about graph-based parallel computing using the Pregel model. If you recall, our guest lecturer talked about various types of graph-based problems which are important, like computing shortest paths, Steiner trees, and things like that. Essentially, what's happening is that for many of these problems, which are quite important in emerging applications, the map-reduce model has some limitations. We discussed some of these in the last lecture, and we'll revisit them as we go along.

Let's remember map-reduce one more time. We have a situation with mappers, which read the data in no particular order and decide, based on your map function, which key to sort each piece of data on. The reducers then get the data sorted by the key you chose in the map function; the platform does the sorting. The reducers process the data for every key in parallel and then output the results. That's the map-reduce model.

So let's revisit some of the challenges, and potential solutions, that we looked at a while back. Many applications require repeated map-reduce; PageRank is one example we'll talk about a little bit later. There are several ways of trying to optimize map-reduce if you have to repeat it many times. One is to make it more efficient by avoiding copying data between successive reduce and map phases. A second is to generalize the model itself, using arbitrary combinations of map and reduce and pipelining of data between them. And a third is a direct implementation of recursion in a map-reduce-like environment; the graph model is one such model, and there are many examples of it, which we'll talk about very briefly. There's also another model, called the stream model, which is being developed at Stanford.

So let's delve into the Pregel model with an example, and see how we might do it in map-reduce as well as in an alternative graph-based model. The problem we consider is matrix-vector multiplication. For those of you who've forgotten your linear algebra, a matrix looks something like the one on the slide: a vector is a one-dimensional array, whereas a matrix is a two-dimensional array. When you do matrix-vector multiplication, what you're really doing is taking a row, multiplying every element of that row by the corresponding element of the vector, and then summing everything up. So the first element of this product will be minus 1 times 1, plus 2, which is 1. Similarly, the second element will be 1 times 1, minus 2, plus 3, which is 2; the third element will be minus 1 times 2, plus 3, which is 1; and the last will be 1 times 3, minus 4, which is minus 1.

Now let's see how one might implement this in map-reduce. Assuming we have a very large matrix and a very large vector, we would like to do this in parallel. In map-reduce, we might do it like this. Take each element aij — there's a whole bunch of elements, in any order, in various different files — and each has to go to some mapper. Once they get to the mappers, the mappers are going to sort these by column of A and row of x. What that means is that a mapper will emit the key j for aij: if it reads an element aij, it emits j as its key, telling it to go to the jth reducer, and for any element xj of x, it likewise emits key j, sending xj to the jth reducer.
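To make this concrete, here's a minimal sketch in Python — not from the lecture — simulating this first map-reduce phase: the mapper just described, plus the reducer we'll walk through next. The sparse-triple representation of A and the function names are my own illustration:

```python
from collections import defaultdict

# A minimal simulation of the first map-reduce phase.
# A is stored as sparse triples (i, j, a_ij); x as pairs (j, x_j).

def map_phase1(matrix_triples, vector_entries):
    # Mapper: key everything by j, so a_ij and x_j meet at reducer j.
    for (i, j, a_ij) in matrix_triples:
        yield (j, ('A', i, a_ij))
    for (j, x_j) in vector_entries:
        yield (j, ('x', x_j))

def reduce_phase1(emitted):
    # Stand-in for the platform's shuffle: group all values by key j.
    by_key = defaultdict(list)
    for key, value in emitted:
        by_key[key].append(value)
    # Reducer j multiplies every a_ij in column j by x_j,
    # emitting the partial products (i, j, a_ij * x_j).
    for j, values in by_key.items():
        x_j = next(v[1] for v in values if v[0] == 'x')
        for v in values:
            if v[0] == 'A':
                _, i, a_ij = v
                yield (i, j, a_ij * x_j)

# Tiny demo with made-up data.
A = [(0, 0, -1.0), (0, 1, 2.0), (1, 0, 1.0), (1, 1, -2.0)]
x = [(0, 1.0), (1, 1.0)]
print(list(reduce_phase1(map_phase1(A, x))))  # the partial products
```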
In practice, each reducer handles a bunch of j's, so each reducer will get a bunch of columns of A and a bunch of rows of x. When reducer j gets all its elements, it computes the partial products aij times xj: since it's getting the values of aij for its j's, and also the xj's for those same j's, it just multiplies them together, for every i. In other words, reducer number one (j equal to 1) will compute minus 1 times 1, 1 times 1, 0 times 1, 0 times 1, et cetera — everything in the first column multiplied by the first element of x. Similarly, reducer number two will compute all the multiplications of the second column with the second row of x, and so on.

Now, that's not enough, because we still have to add these up, row by row. So the second map-reduce job does the following. It takes these partial products — each reducer now holds a bunch of columns that have been multiplied by the appropriate element of x — and redistributes them by row. For example, the partial product yij in the vector y-prime-j will get sent to reducer i. Then these get added up in parallel: say the first two rows go to one reducer and the next two rows go to another reducer, and each adds up its rows over there.

Essentially, if you think about this carefully, what we've done is write the matrix-vector product as a block multiplication. You can write A as sets of columns A1 through AP, where P is the number of processors we have — each Ap could be an individual column or a bunch of columns — and similarly each xp could be an individual element of x or a bunch of rows of x. Then the matrix-vector product becomes Ax = A1x1 + A2x2 + ... + APxP; essentially, it's the sum of these partial products. The partial product Apxp is exactly what reducer p computes in the first phase of map-reduce, and in the second phase they all get added up.

Study this carefully, because it's important to understand how matrix-vector multiplications are done using map-reduce. This is just one way; there are other, more efficient ways of doing matrix-vector products using map-reduce. But now comes the question of how we would repeat this multiple times, if we had to do so — and we actually do have to, in applications like computing PageRank, as sketched below. But there are problems.
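To see concretely what repeating this computation looks like, here's a minimal sketch of the second phase — regrouping the partial products by row and summing — followed by an iteration loop in the style of PageRank's power iteration. It reuses map_phase1 and reduce_phase1 from the sketch above; the multiply helper, the loop, and the tiny column-stochastic matrix are my own hypothetical illustration, not from the lecture:

```python
from collections import defaultdict

def map_phase2(partials):
    # Mapper: re-key each partial product a_ij * x_j by its row i.
    for (i, j, p) in partials:
        yield (i, p)

def reduce_phase2(emitted):
    # Reducer i adds up all the partial products for row i, giving y_i.
    sums = defaultdict(float)
    for i, p in emitted:
        sums[i] += p
    return sorted(sums.items())  # entries (i, y_i), same shape as x

def multiply(A_triples, x_entries):
    # One full matrix-vector product = two chained map-reduce jobs
    # (map_phase1 / reduce_phase1 come from the earlier sketch).
    partials = reduce_phase1(map_phase1(A_triples, x_entries))
    return reduce_phase2(map_phase2(partials))

# Repeated multiplication, as in PageRank's power iteration. On a real
# cluster each pass would be a fresh pair of map-reduce jobs, with y
# written to storage and read back in as the new x -- exactly the
# copying between successive phases mentioned at the start of the unit.
A = [(0, 0, 0.5), (0, 1, 0.3), (1, 0, 0.5), (1, 1, 0.7)]  # made-up
x = [(0, 1.0), (1, 1.0)]
for _ in range(20):
    x = multiply(A, x)
print(x)
```

The point this is building toward is that the read-write cycle between successive iterations is wasteful, which is one of the limitations that graph-based models like Pregel set out to avoid.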