Having mastered computing the connected components of an undirected graph in linear time, let's now turn our attention to directed graphs, which also arise in all kinds of different applications. Now the good news is we'll be able to get just a blazingly fast primitive for computing connectivity information for directed graphs. The bad news, if you wanna call it that, is we'll have to think a little bit harder. So it won't be as obvious how to do it, but by the end of this lecture you will know a linear time algorithm, with very good constants, based really just on depth-first search, for computing all of the pieces of a directed graph. In fact it's not even so obvious how to define pieces, how to define connected components, in a directed graph. It's certainly not as obvious as it was with undirected graphs. So to see what I mean, let's consider the following four-node directed graph. On the one hand, this graph is in some sense in one piece. If this were an actual physical object, made, say, of a bunch of strings tied to each other, and we picked it up with our hands, it wouldn't fall apart into two pieces; it would hang together in one piece. On the other hand, when you think about moving around this network, it's not connected in the sense that we might think about: you cannot get from any one point A to any other point B. For example, if you start at the rightmost node in this graph, certainly there's no directed path that will get you to the leftmost node. So what's typically studied, and most useful, for directed graphs is what's called strong connectivity. A graph is strongly connected if you can get from any one point to any other point, and vice versa. And the strongly connected components, informally, are the maximal portions of the graph, the maximal regions which are internally strongly connected. So, the maximal regions within which you can get from any one point A to any other point B along a directed path. For the record, let me go ahead and give you a formal definition, although, you know, this intuition is perfectly valid: just regions where you can get from anywhere to anywhere else. So we're going to define the strongly connected components of a directed graph, or the SCCs for short. And as in the undirected case, we're going to give a somewhat slick definition: rather than talking about maximal regions satisfying some property, we're going to talk about the equivalence classes of a particular equivalence relation. But really it means the same thing; this is just the more mathematically mature way of writing it down. So the equivalence relation is on the nodes of the graph, and we'll say that a node U is related to a node V if you can get from U to V via a directed path and also from V to U on some other directed path. I'm not gonna bore you with the verification that this is an equivalence relation. That's something that you can work out in the privacy of your own home. Just remember what it means to be an equivalence relation. It's reflexive, that is, everybody's related to itself, and of course there is a path from every node to itself. It also has to be symmetric, so if U is related to V, then V is related to U. Well, again, by definition we're saying that the vertices are mutually reachable from each other, so that's clearly symmetric. And then it has to be transitive.
And the way you prove it's transitive is you just paste paths together, and it works the way you'd expect it to. So let's illustrate this concretely with a somewhat more complicated directed graph. Here's a directed graph, and I claim that it has exactly four strongly connected components: there's a triangle on the left, a triangle on the right, this single node on top, and then this directed four-cycle with a diagonal at the bottom. So what I hope is pretty clear is that each of these circled regions is indeed strongly connected. That is, if you start from one node in one of these circled regions, you have a directed path to any other node in that region. That's certainly true because on a directed cycle you can get from any starting point to any other point, and each of these regions contains a directed cycle through all of its nodes; and there's the one strong component that has just a single node, which is obviously strongly connected. What I also claim is true is that all of these regions are maximal subject to being strongly connected. That's why they're the strongly connected components; that's why they're the equivalence classes of this equivalence relation we just defined. So, if you take any pair of nodes which lie in two different circles, either you won't have a directed path from the first one to the second one, or you won't have a directed path from the second one back to the first one. In fact, the structure of the strong components in this black graph exactly mirrors the directed acyclic graph that we started with in red. So in the same way that, in the red four-node graph, you can't move from right to left, here in this bigger graph you can't move from any of the circled SCCs on the right to any of the circled SCCs on the left. So, for example, from the rightmost nodes, there are no directed paths to the leftmost nodes. So that's a recap of the definition of the strongly connected components. I've motivated in a separate video some reasons why you might care about computing strongly connected components, in particular on extremely large graphs, which motivates the need for blazingly fast subroutines, for-free primitives that will let you compute connectivity information. You'd love to be able to know the pieces of a directed graph; maybe you don't even know why they're going to be useful, but you just compute them, because why not? It's a for-free primitive. So that's what I'm gonna give you in this lecture. The algorithm we're gonna present is based on depth-first search, and that's gonna be the main reason why it's going to be so blazingly fast, because depth-first search is blazingly fast. Now you might be wondering, what on earth does graph search have to do with computing components? They don't seem obviously related, so let's return to the same directed graph that I showed you on the previous slide, to see why something like depth-first search might conceivably have some use for computing strong components. Suppose we call depth-first search with this red node as the starting point. What would it explore? Remember what the guarantee of something like depth-first search, or breadth-first search for that matter, is: you find everything that's findable, but naturally nothing else. So what is findable from this red node? Where by findable I mean you can reach it via a directed path emanating from this red node. Well, there's not much you can do: from here, you can explore this arc, and you can explore this arc, and then you can go backward.
And so if you do DFS or BFS from this node, you're going to find precisely the nodes in this triangle. All of the other arcs involved go in the wrong direction, and they won't be traversed by, say, a depth-first search call. So, why is that interesting? What's interesting is that if we invoke DFS from this red node, or from any of the three nodes of this triangle, then it's going to discover precisely this strongly connected component, precisely the three nodes in this circled SCC. So that seems really cool. It seems like maybe we just do a DFS, and boom, we get an SCC. So maybe if we can do that over and over again, we'll get all the SCCs. That's a good initial intuition, but something can go wrong. Suppose that instead of initiating DFS from one of these three nodes on the triangle, we, say, initiate it from this bottommost node in green. So remember, what is the guarantee of a graph search subroutine like DFS? It will find everything findable, but of course nothing more. So what's findable from this green node? Well, naturally, everything in its own SCC; the four nodes here, it will certainly find those four nodes. On the other hand, if we start from this green node, since there are arcs that go from this bottommost SCC to the rightmost SCC, not only will this DFS call find the four nodes in the green node's strong component, but it will also traverse these blue arcs and discover the three nodes in the red triangle. So, if we call DFS from this green node, we'll capture all seven of these nodes. The point is, if we call DFS, it looks like we're going to get a union of possibly multiple SCCs. In fact, in the worst case, if we invoke DFS from the leftmost node, what's it going to discover? It's going to discover the entire graph, and that gives us no insight into the strong component structure at all. So the takeaway point is: if you call DFS from just the right place, you'll actually uncover an SCC; if you call it from the wrong place, it will give you no information at all. So the magic of the algorithm that we're gonna discuss next is we'll show how, in a super slick preprocessing step, which ironically is itself a call to depth-first search, we can in linear time compute exactly where we want to start the subsequent depth-first searches from, so that each invocation gets us exactly one strongly connected component, and nothing more. So the algorithm that I'm going to show you is due to Kosaraju, and we'll show the following theorem: the strongly connected components of a directed graph can be computed in linear time. And as we'll see, the constants are also very small; it's really just gonna be two passes of depth-first search. And again, let me remind you that for many problems it's natural to assume that the number of edges is at least the number of nodes, if you're only interested in cases where the graph is connected. Of course, when you're computing connected components, that's one of the most natural cases where you might have a super sparse, broken-up graph. So we're not going to assume m is at least n. That's why linear time is going to be m + n, because that's the size of the input, and we don't know which is bigger: m could be bigger than n, or n could be bigger than m. We have no idea. Kosaraju's algorithm is shocking in its simplicity. It has three steps; let me tell you what they are. First, very mysteriously, we're going to reverse all of the arcs of the given graph.
Totally not clear why that would be an interesting thing to do yet. Then, we're going to do our first pass, our first depth-first search, and we're going to do it on the reversed graph. Now, the naive way to implement this would be to literally construct a new copy of the input graph, with all of the arcs in the reverse direction, and then just run depth-first search on it. Of course, the obvious optimization would be to just run DFS on the original graph, but going across arcs backward. So I'll let you think through the details of how you'd do that, but it just works: you run DFS, and instead of going forward along edges, you go backward along edges. That simulates depth-first search on the reversed graph. Now, I've written here DFS loop, and that just means the usual trick where, to make sure that you see all of the nodes of the graph even if it's disconnected, you have an outer loop where you just try each starting point separately; if you haven't already seen it, then you run DFS from that given node. I'll be more detailed on the next slide. And the third step is just to run depth-first search again, but this time on the original graph. Now at this point, you should be thinking that I'm totally crazy, right? What are we trying to do? We're trying to compute these strongly connected components. We're trying to actually compute real objects, these maximal strongly connected regions. And all I'm doing is searching the graph, once forward and once backward; it doesn't really seem like I'm computing anything. So, here's the catch, and it's a very minor catch. We're gonna need to do a little bit of bookkeeping, but it's gonna be very little overhead, so we'll still have a blazingly fast algorithm. With a little bit of bookkeeping, here's what's gonna happen: the second depth-first search, which searches the original graph, will, in its search process, discover the strongly connected components one at a time in a very natural way, and that'll be really obvious when we do an example, which we'll do in just a second. Now, for the second depth-first search to work in this magical way, where it just discovers the strongly connected components one at a time, it's really important that it executes the depth-first searches in a particular order, that it goes through the nodes of the graph in a particular order. And that is exactly the job of the first pass. The depth-first search on the reversed graph is going to compute an ordering of the nodes such that, when the second depth-first search goes through them in that order, it will just discover the SCCs one at a time, in linear time. So let me say a little bit more about the form of the bookkeeping, and then I'll show you how that bookkeeping is maintained as we do depth-first search. So we're gonna have a notion of a finishing time of a vertex, and that's gonna be computed in the first pass, when we do depth-first search on the reversed graph. And we're going to make use of this data in the second pass. So, rather than just going through the nodes of the graph in an arbitrary order, like we usually do when we have a looped depth-first or breadth-first search, we're going to make sure that we go through the vertices in decreasing order of these finishing times. Now there's still the question: in what sense does this second depth-first search discover and report the strongly connected components that it finds?
So we're going to label each node in this second pass with what we call a leader, and the idea is that nodes in the same strongly connected component will be labeled with exactly the same leader node. Again, all of this will be much clearer once we do a concrete example, but I want to have it down for the record right now. So that's the algorithm at a high level: it's really just two passes of DFS with some bookkeeping. But this is underspecified; you really shouldn't understand how to implement the algorithm just yet. So what do I owe you? I owe you exactly what I mean by DFS loop, although you've seen this more or less in the past: it's just a loop over all the vertices of the graph, and if you haven't seen a vertex yet, you DFS from that starting point. I need to tell you what finishing times are and how they get computed; they're just gonna be the integers one to N, recording basically when depth-first search gets finished with each of the nodes. And then I need to tell you how to compute these leaders. So let me show you all three of those things on the next slide. The workhorse for Kosaraju's strongly connected components algorithm is this DFS loop subroutine, and it takes as input a graph. Okay, so it does not take as input a starting node; it's gonna loop over possible starting nodes. Now for the bookkeeping, to compute finishing times, we're going to keep track of a global variable that I'll call T, which we initialize to zero. The point of T is to count how many nodes we have totally finished exploring so far. This is the variable we use to compute those finishing times in the first pass, that magical ordering that I was talking about. Now we're also gonna have a second global variable, S, to compute these things I was calling leaders, and these are only relevant for the second pass. What S is going to keep track of is the most recent vertex from which a DFS was initiated. To keep the code simple I'm just going to do all of the bookkeeping in DFS loop, but really DFS loop gets called twice, once on the reversed graph and once on the forward graph; we only need to compute the finishing times in the first pass, on the reversed graph, and we only need to compute the leaders in the second pass, on the forward graph. But let's just keep them both in there, just for kicks. Now, we're going to need to loop through the vertices, and so the question is, in what order are we going to loop through the vertices? That's gonna happen differently in the two different passes, but let me just use some common notation. Let's just assume in this subroutine that the nodes are somehow labeled from one to N. In our first depth-first search, the labeling is totally arbitrary; these are basically just the names of the nodes, or their positions in the node array, whatever, you just do it in some arbitrary order. The second time we run DFS loop, as indicated on the previous slide, we're gonna use the finishing times as the labeling, and as we'll see, the finishing times are indeed numbers between one and N. So now what we do is we just iterate through the nodes in decreasing order, and if we haven't already seen node I, then we initiate a DFS from it. As usual, we're gonna maintain a boolean for each node to keep track of whether we've already seen it in one of the DFS calls. Now remember, the global variable S is responsible for keeping track of the most recent node from which a depth-first search has been initiated.
So if I is not explored and we initiate a depth-first search from it, we'd better reset S to I. And then we do the usual DFS in G, starting from the source node I. So for completeness, let me just remind you what the depth-first search subroutine looks like. We're given a graph and a starting node, and the first time we see a node, we mark it as explored. Just a side note: once a node is marked explored, it stays explored for the entire invocation of DFS loop. So even if the DFS from a given node I finishes and then the outer for loop marches on and encounters I again, it's still gonna be marked as explored. Now, one of our bookkeeping jobs is to keep track of the vertex from which the DFS that discovered I was called. So when I is first encountered, we remember that S was the node from which this DFS originated, and that, by definition, is the leader of I. And then we do what we always do with depth-first search: we immediately look at the arcs going out of I, and we try to recursively DFS on any of those, although, of course, we don't bother to do it if we've already seen those nodes. Now once this for loop has completed, once we've examined every outgoing arc from I, and for each such node J we've either already seen it in the past or we've recursively explored from J and returned, at that point we call ourselves done with node I. There are no more outgoing arcs to explore; we think of I as being finished. Remember, T is the global variable that's keeping track of how many nodes we're done with. So we increment T, because now we're done with I, and we also remember that I was the Tth vertex we finished with. That is, we set I's finishing time to be T. Because depth-first search is guaranteed to visit every node exactly once, and therefore to finish with every node exactly once, this global counter T will have value one when the first node is finished, value two when the next node gets finished, then value three, and so on. When the final node gets finished, it'll have value N. So the finishing times of the nodes are gonna be exactly the integers from one to N. Let's make this much more concrete by going through a careful example. In fact, I think it will be better for everybody if you yourself trace through part of this algorithm on a concrete example. So let me draw a nine-node graph for you. To be clear, let's assume that we've already executed step one of the algorithm, that is, we've already reversed the graph. This blue graph that I've drawn on the slide is the reversal; we've already reversed the arcs. Moreover, the nodes are labeled in some arbitrary way from one to nine. Just assume this is how they show up in the node array, for example. And remember, in the DFS loop routine we're supposed to process the nodes from top to bottom, from N down to one. So my question for you is: in the second step of the algorithm, when we run DFS loop and process the nodes from the highest label, nine, in order down to the lowest label, one, what are the finishing times that we're going to compute? Now, it is true that you can get different finishing times depending on the different choices that DFS loop has to make about which outgoing arc to explore next. But I've given you four options for what the finishing times of the nodes one, two, three, all the way up to nine, respectively, might be, and only one of these four could conceivably be output as the finishing times of DFS loop on this graph.
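While you think that over, here is one way the DFS loop and DFS routines just described might look when written out. This is a minimal sketch in Python with names of my own choosing, not code from the lecture; it assumes the graph is given as a dict mapping each node label 1..N to the list of heads of its outgoing arcs, and for very deep graphs the recursion would need to be replaced by an explicit stack.

```python
# Sketch of DFS-Loop with the bookkeeping described above (t plays the role of
# the lecture's T, s the role of S).  Graph format: {node: [heads of outgoing arcs]}.
def dfs_loop(graph, n):
    explored = set()
    finishing_time = {}   # node -> finishing time in 1..n (used in the first pass)
    leader = {}           # node -> leader vertex (used in the second pass)
    t = 0                 # number of nodes completely finished so far
    s = None              # most recent vertex from which a DFS was initiated

    def dfs(i):
        nonlocal t
        explored.add(i)
        leader[i] = s
        for j in graph.get(i, []):          # examine every outgoing arc of i
            if j not in explored:
                dfs(j)
        t += 1                              # i is finished: no arcs left to explore
        finishing_time[i] = t

    for i in range(n, 0, -1):               # process nodes in decreasing label order
        if i not in explored:
            s = i
            dfs(i)

    return finishing_time, leader
```

Which finishing times come out depends on the order in which each adjacency list happens to be stored, and that is exactly the tie-breaking freedom the quiz question is getting at.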
So, which one is it? Alright, so the answer is the fourth option. That is the only one of these four sets of finishing times that you might see computed by DFS loop on this blue graph. So let's go ahead and trace through DFS loop and see how we might get this set of finishing times. Remember, in the main loop we start from the highest node, nine, and then we descend down to the lowest node, one. So we start by invoking DFS from the node nine, and we mark nine as explored. From here there's only one outgoing arc, so there's only one place we can go: we go to six and mark six as explored. Now there are two places we can go next: we can either go to three or we can go to eight, and in general DFS could do either one. To generate this fourth set of finishing times, I'm going to need to assume that we go to three first. So again, what we're assuming DFS does is it starts at nine, then it has to go to six, marking those as explored, and then it goes to three. It does not go to eight first; it goes to three first. Now from three there's only one outgoing arc, which goes to nine, but nine we've already marked as explored, so it's not going to re-explore nine; it's going to skip that arc. Since that's three's only outgoing arc, that for loop completes, and so three is the first node to finish. When we finish with three we increment T: it started at zero, now it's one. And we set the finishing time of three to be one, just like we said it was in the example. So now we backtrack to six. We have another outgoing arc from six to explore, so now we go to eight. From eight, we have to go to two. From two, we have to go to five. From five, we have to go to eight. Eight we've already seen, so then we're done with five, because that was its only outgoing arc. So we increment T, now it's two, and the finishing time of five is going to be two, as promised. Now we backtrack to two. There are no more outgoing arcs from two, so two is gonna be the third one that we finish, as promised. Then we finish with eight, so eight is gonna be the fourth node to be done, as promised. Now we backtrack to six; we're done with six, the fifth node to be completed, as promised. And finally we get all the way back to where we started, at nine, and nine is the sixth node to be completed, as promised. Now, if we were computing those leaders, all of these nodes would get the leader nine. But again, the leaders are only relevant for the second pass, so we're just going to ignore the leaders as we do this trace; we're just going to keep track of the finishing times. Now we're not done: all we did is finish the DFS invoked from the node nine, and we found six of the nodes total in that depth-first search. So now we return to the outer for loop and decrement I. It started at nine, and we're done with that, so now we go down to eight and ask: have we already seen eight? Yes, eight's already explored, so we skip it. We decrement I down to seven and ask: have we already seen node seven? No, we have not; seven is not yet explored, so we invoke DFS now from node seven. Seven has two outgoing arcs: it can either go to four or it can go to nine. Let's say it checks the outgoing arc to nine first. Now, nine we already explored. Granted, that was in an earlier part of the for loop, but we remember that.
We keep track of who got explored in previous iterations of the for loop, so we don't bother to re-explore nine; we skip that arc. So now from seven we have to go to four. From four we have to go to one. From one we have to go back to seven. Seven's already been explored, so we backtrack, and now we're done with one. So one is the next node we complete, and the finishing time of one is going to be seven, as promised. We backtrack to four. There are no more outgoing arcs from four to explore, so four is going to be the eighth one to finish, as promised. And then the last one to finish is poor node seven; it is last. So that's an example of how the DFS loop subroutine computes finishing times on a reversed graph. Now let's work through the second pass, on the forward version of the graph, using the same example. Remember, the point of the first pass is to compute a magical ordering, and the magical ordering is these finishing times. So now we're going to throw out the original node names, and we're going to replace the node names in blue by the finishing times in red. We're also going to work with the original graph, which means we have to reverse the arcs back to where they were originally. So those are the two changes that you're going to see when I redraw this graph: first of all, all the arcs will reverse orientation; second of all, all the nodes will change names from their original ones to the finishing times that we just computed. So here's our new graph with the new node names and all the arcs with their orientation reversed. And now we run DFS again on this graph, and again we're going to process the nodes in order from the highest label, nine, down to the lowest label, one. Moreover, we don't need to compute finishing times in the second pass; we only need to do that in the first pass. In the second pass we have to keep track of the leaders, and remember, the leader of a vertex is the vertex from which the DFS that first discovered it was called. Alright, so what's going to happen? Well, in the outer for loop, we're going to start with I equal to nine, and we invoke DFS from the node nine. So nine is gonna be the current leader, because that's where the current DFS got initiated. Now from nine there's only one choice: we have to go to seven. From seven there's only one choice: we have to go to eight. From eight there's only one choice: we have to go back to nine. And then nine's already been seen, so we backtrack. We go back to eight, we go back to seven, we go back to nine, and that's it. So when we invoke DFS from node nine, the only things that we encounter are the nodes seven, eight, and nine, and these are all going to be given the leader vertex nine. You will notice that this is indeed one of the strongly connected components of the graph; we just found it with this invocation of DFS from the node nine. So now we go back to the outer for loop, and we say, okay, let's go to node eight. Have we already seen eight? Yes. What about seven? Have we already seen seven? Yes. What about six? Have we already seen six? We have not; we have not yet discovered six. So we invoke DFS from node six, and we reset the global source vertex S to six. From six we can go to nine or we can go to one. Let's say we explore nine first. Well, we already saw nine in an earlier iteration of the for loop, so we don't explore it again; we don't discover nine now. So we backtrack to six and go to one. From one we have to go to five.
From five we have to go to six, and then we start backtracking again. So the only new nodes that we encounter when we invoke DFS from the node six are the vertices six, one, and five, and all of these will have a leader vertex of six, because that's where we called DFS from when we first discovered these three nodes. And you'll notice this is another SCC of this directed graph. So we invoked DFS again, now from the new node six, and the new nodes it discovered are exactly an SCC of the graph, nothing more, nothing less. So now we return to the outer for loop. We go to node five. Have we already seen five? Yes. Have we already seen four? No, we haven't seen four yet. So now we invoke DFS from four. Again, we could try to explore five, but we've already seen that before, so we're not going to explore it again. So from four, then, we have to go to two. From two, we have to go to three. From three, we have to go back to four, and then, after all the backtracking, we're done. So the final call to DFS is from the node four, and that DFS will newly discover precisely the nodes two, three, and four. They will all have the leader vertex four, because that's where this DFS was called from. It's true we will go back to the for loop and check: have we seen three yet? Yes. Have we seen two yet? Yes. Have we seen one yet? Yes. And then the whole thing completes. And what we see is that, using the finishing times computed from that first depth-first search pass, somehow the strongly connected components of this graph just showed up and presented themselves to us, one at a time, on a silver platter. Every time we invoked DFS, the nodes we newly discovered were precisely one of the SCCs, nothing more, nothing less. And that's really what's going on in this algorithm; it turns out this is true in general. The first pass, DFS on the reversed graph, computes finishing times so that if you then process nodes in decreasing order of finishing times in the second pass, each invocation of DFS will discover exactly one new SCC. So they'll just present themselves to you, one per DFS call, in that second pass's for loop. This is, of course, merely an example; you should not take a single example as proof that this algorithm always works. I will give you a general argument in the next video, but hopefully this is at least a plausibility argument. No longer does this three-step algorithm seem totally insane, and maybe you can imagine that perhaps it works. At least there's some principle going on, where you first compute the right ordering in which to process the nodes, and then the second pass peels off SCCs one at a time, like layers from an onion. One thing that I hope is pretty clear is that this algorithm, correct or not, is blazingly fast. Pretty much all you do is two depth-first searches, and since depth-first search, as we've seen in the past, runs in time linear in the size of the graph, so does Kosaraju's two-pass algorithm. There are a couple of subtleties, and I encourage you to think about them; you'll be forced to think about them in the programming project for week four. For example, in the second pass, how do you process the nodes in decreasing order of finishing time? You don't want to sort the nodes by their finishing times, because that would take n log n time, so in the first pass you need to remember the nodes in a way that lets you just do a linear scan through them in the second pass.
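To make that concrete, here is a minimal sketch of the whole two-pass algorithm, again in Python with illustrative names of my own; the dict-of-adjacency-lists representation is an assumption, not something fixed by the lecture. Instead of sorting by finishing time, the first pass simply appends each node to a list at the moment it finishes, and the second pass scans that list backward, which is exactly a linear scan in decreasing order of finishing time.

```python
def kosaraju_scc(graph, n):
    # Step 1: build the reversed graph (or, equivalently, traverse arcs backward).
    rev = {v: [] for v in range(1, n + 1)}
    for u in range(1, n + 1):
        for v in graph.get(u, []):
            rev[v].append(u)

    # First pass: DFS loop on the reversed graph, recording nodes in finishing order.
    explored = set()
    finish_order = []                 # i-th entry is the node with finishing time i+1

    def dfs1(i):
        explored.add(i)
        for j in rev[i]:
            if j not in explored:
                dfs1(j)
        finish_order.append(i)        # i just finished; no sorting needed later

    for i in range(n, 0, -1):
        if i not in explored:
            dfs1(i)

    # Second pass: DFS loop on the original graph, in decreasing finishing time.
    explored = set()
    leader = {}

    def dfs2(i, s):
        explored.add(i)
        leader[i] = s
        for j in graph.get(i, []):
            if j not in explored:
                dfs2(j, s)

    for i in reversed(finish_order):
        if i not in explored:
            dfs2(i, i)

    return leader                     # nodes sharing a leader form one SCC
```

On the nine-node example from this lecture, this groups the vertices into the same three SCCs we just traced by hand, each tagged with whichever of its members the second pass happened to start from. As with the earlier sketch, for graphs deep enough to exhaust Python's recursion limit, the recursive calls would need to be replaced by an explicit stack.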
So there are some details, but if your intuition is that this is really just double DFS, properly implemented, that's pretty much exactly right. So having spelled out the full implementation, argued that it's definitely a linear-time algorithm, and given at least a plausibility argument, via an example, that it might conceivably be correct, let's now turn to the general argument.