1 00:00:02,840 --> 00:00:07,253 Now, we're going to look at depth-first search, which is a classical graph 2 00:00:07,253 --> 00:00:11,001 processing algorithm. It's actually maybe one of the oldest 3 00:00:11,001 --> 00:00:17,042 algorithms that we study, surprisingly. One way to think about depth first search, 4 00:00:17,042 --> 00:00:21,421 is in terms of mazes. It's a pretty familiar way to look at, 5 00:00:21,421 --> 00:00:25,354 look at it. And, so, if you have a maze like the one 6 00:00:25,354 --> 00:00:29,140 drawn on the left, you can model it with a graph. 7 00:00:29,353 --> 00:00:32,490 By creating a vertex for every intersection. 8 00:00:32,490 --> 00:00:36,482 In the maze. And an edge for every passage connecting 9 00:00:36,482 --> 00:00:38,265 two intersections. And. 10 00:00:38,479 --> 00:00:41,972 So. If you're at the entrance to this maze and 11 00:00:41,972 --> 00:00:47,390 you want to find a pot of gold somewhere. What you're gonna need to do is. 12 00:00:47,604 --> 00:00:51,667 Explore every intersection. Or explore, explore every edge. 13 00:00:51,881 --> 00:00:53,164 In the maze. So. 14 00:00:53,378 --> 00:00:57,442 We're gonna talk about the. Explore every intersection. 15 00:00:57,656 --> 00:00:59,723 Option. So that's our goal. 16 00:00:59,723 --> 00:01:06,176 To have an algorithm for doing that. By the way, this is a famous graph that 17 00:01:06,454 --> 00:01:12,840 some of you might recognize. That's the graph for the Pac-man game. 18 00:01:13,122 --> 00:01:22,256 Okay, so one method classic method that predates computers for exploring a maze is 19 00:01:22,256 --> 00:01:26,588 called the Tr maux maze exploration algorithm. 20 00:01:26,870 --> 00:01:31,955 The idea is to think about having a ball of string. 21 00:01:32,238 --> 00:01:39,769 And what you do is when you walk down a passage, you unroll the string behind you. 22 00:01:39,984 --> 00:01:43,791 And you also, mark, every place that you've been. 23 00:01:44,007 --> 00:01:48,101 So actually, I have a ball of string and some chalk, maybe. 24 00:01:48,101 --> 00:01:51,909 So in this case, maybe we walk down this passage here. 25 00:01:51,909 --> 00:01:55,716 And now we have some choices about where we might go. 26 00:01:55,931 --> 00:02:00,744 So say we go down here. So we unroll our ball of string and mark 27 00:02:00,744 --> 00:02:03,476 it. And so now, the next time, At this 28 00:02:03,476 --> 00:02:06,830 intersection, we have no choice but to go up here. 29 00:02:07,024 --> 00:02:10,443 We go up here and we say, oh, we've already been there. 30 00:02:10,637 --> 00:02:13,734 So we're not gonna go there. And we come back. 31 00:02:13,734 --> 00:02:18,701 And, we have our ball of string. So we can unroll it to figure out where we 32 00:02:18,701 --> 00:02:21,798 were. And we go back until we have some other 33 00:02:21,798 --> 00:02:24,314 choice. Which is this, this place, now. 34 00:02:24,508 --> 00:02:30,121 And we mark that we've been in these other places, And so now, we take another option 35 00:02:30,121 --> 00:02:34,508 and say, go down this way. And here, we take another option, go that 36 00:02:34,508 --> 00:02:38,385 way. And then finally again we go up this a way 37 00:02:38,385 --> 00:02:44,115 and we see that we've been there so we back up and take the last option and then 38 00:02:44,115 --> 00:02:48,135 that gets us to the last vertex in the graph. 39 00:02:48,135 --> 00:02:54,118 So mark each visited intersection and each visited package, passage, and retrace our 40 00:02:54,118 --> 00:03:01,393 steps when there's no unvisited option. Again this is a classical algorithm that 41 00:03:01,393 --> 00:03:10,203 was studied centuries ago and in fact some argued the first youth was when Theseus 42 00:03:10,203 --> 00:03:18,317 entered the Labyrinth and was trying to find the Minotaur and, And Rodney didn't 43 00:03:18,317 --> 00:03:23,775 want ''em to get lost in the maze. So she instructed Theseus to use a ball of 44 00:03:23,775 --> 00:03:29,034 string to find his way back out. That's the basic algorithm that we're 45 00:03:29,034 --> 00:03:32,955 gonna use. And has been studied by many, many 46 00:03:32,955 --> 00:03:39,993 scientists in the time since Theses. And in fact, Claude Shannon, founder of 47 00:03:39,993 --> 00:03:46,855 Information Theory, did experiments on mazes with mice to see if they might 48 00:03:46,855 --> 00:03:50,990 understand maze exploration, this might help. 49 00:03:51,316 --> 00:03:57,080 Okay, so here's what it look like in its typical maze. 50 00:03:57,752 --> 00:04:04,487 Now one of the things to remember is in a computer representation normally we're 51 00:04:04,487 --> 00:04:09,210 just looking at the vertices and the set of associated edges. 52 00:04:09,392 --> 00:04:14,098 We don't see anything other than that. Though, it's sometimes frustrating 53 00:04:14,098 --> 00:04:19,584 watching me you know that it turned the wrong way and it's gonna get trapped here. 54 00:04:19,584 --> 00:04:24,812 But, the computer doesn't really know that, so it has to back up along here now 55 00:04:25,006 --> 00:04:30,170 and it continues to back up to find another option untill it gets free again. 56 00:04:30,399 --> 00:04:35,215 And finds a some place to go. Sometimes its very frustrating. 57 00:04:35,215 --> 00:04:41,332 Its seems to be quite close to the goal like appear and it turns a wrong way. 58 00:04:41,332 --> 00:04:48,060 So we an see its gonna take a long way but no way the program could really know that. 59 00:04:48,060 --> 00:04:55,588 Again, all the programs we're working with is vertex instead of edges associated with 60 00:04:55,588 --> 00:04:59,641 that vertex and there it finally get to the goal. 61 00:04:59,890 --> 00:05:13,721 Here's a bigger one going faster. The king thing is not so much, its getting 62 00:05:13,721 --> 00:05:18,590 lost and the key thing is not going anywhere twice. 63 00:05:18,786 --> 00:05:23,234 And that's the whole thing. We have to have the string to know to go 64 00:05:23,234 --> 00:05:27,617 back where we came from. And we have to be able to mark where we 65 00:05:27,617 --> 00:05:30,627 have been. And with those two things we are, 66 00:05:30,627 --> 00:05:34,160 algorithm is, able to avoid going the same place twice. 67 00:05:34,357 --> 00:05:40,031 If you weren't marking, if you tried to do this randomly or some other way it might 68 00:05:40,031 --> 00:05:45,111 take you a while to get to the goal. So it doesn't seem like much of 69 00:05:45,111 --> 00:05:50,587 accomplishment maybe for a maze but actually to be able to get there with 70 00:05:50,785 --> 00:05:56,260 going, without going any place thrice, twice is sort of a, profound idea and 71 00:05:56,458 --> 00:05:59,239 leads to an efficient algorithm. Okay. 72 00:05:59,239 --> 00:06:07,358 So, our idea is, given in this, medicode, to do, depth research, that is, to, visit, 73 00:06:07,609 --> 00:06:11,459 all the places you can get to from a vertex, V. 74 00:06:11,710 --> 00:06:16,565 What we're gonna do is this simple re, recursive algorithm. 75 00:06:16,565 --> 00:06:23,344 Mark the vertex as visited and then recursively visit all unmarked vertices, W 76 00:06:23,344 --> 00:06:28,493 that are adjacent to V. That's a very simple description, and it 77 00:06:28,493 --> 00:06:33,973 leads to very simple codes. It's so simple actually, it really belies 78 00:06:33,973 --> 00:06:37,520 the profound idea underneath this algorithm. 79 00:06:37,727 --> 00:06:43,735 So, again, There's lots of applications. And, for example, this is one way to find 80 00:06:43,735 --> 00:06:47,120 whether there exists a path between two vertices. 81 00:06:47,327 --> 00:06:51,678 Or to find all the vertices connected to a given source vertex. 82 00:06:51,678 --> 00:06:57,134 And we'll consider some less abstract applications, once we've looked at the 83 00:06:57,134 --> 00:06:59,779 code. So, so how to implement. 84 00:07:00,017 --> 00:07:06,756 Well here's what we're gonna do for our design pattern for graph processing. 85 00:07:06,756 --> 00:07:14,350 It's our first example. So what we did, when we defined an API for 86 00:07:14,350 --> 00:07:21,510 graph was to decouple the graph data type from graph processing. 87 00:07:21,510 --> 00:07:27,174 The idea is we're gonna create a graph object using that API which we know allows 88 00:07:27,174 --> 00:07:30,489 us to represent a big graph within the computer. 89 00:07:30,489 --> 00:07:35,808 And gives us the basic operations that we're gonna need for graph processing. 90 00:07:35,808 --> 00:07:39,883 And then we use that API within a graph processing routine. 91 00:07:39,883 --> 00:07:45,340 And the basic idea is that, that graph, graph processing routine will go through 92 00:07:45,340 --> 00:07:50,728 the graph and collect some information. And then a client of that routine will 93 00:07:50,728 --> 00:07:55,080 query the it's API to get information about the graph. 94 00:07:55,080 --> 00:08:02,339 So in the case of, depth first search. Here's a potential possible API. 95 00:08:02,618 --> 00:08:10,716 So the idea is that what this, what we're gonna implement is a program that can find 96 00:08:10,716 --> 00:08:17,420 paths in a graph from a given source. So we give a graph and a vertex. 97 00:08:17,664 --> 00:08:24,275 And that the constructor is gonna do what it needs in order to be able to answer, 98 00:08:24,519 --> 00:08:28,926 these two queries. First one is, give a vertex, 99 00:08:28,926 --> 00:08:34,394 Client will give a vertex. Is there a path in the graph, from the 100 00:08:34,394 --> 00:08:39,065 source to that vertex? You wanna be able to, answer that 101 00:08:39,065 --> 00:08:40,834 efficiently. And then, 102 00:08:40,834 --> 00:08:47,249 The other thing is to just give the path. What's the path from, has to be giving all 103 00:08:47,249 --> 00:08:51,894 the vertices on the path, in time proportional to its length. 104 00:08:52,115 --> 00:08:58,456 So here's a client of, this, API. So, it's gonna take a source, a source 105 00:08:58,456 --> 00:09:02,554 vertex S. And it's gonna build a pathfinder, or a 106 00:09:02,554 --> 00:09:07,397 path object. And that object is gonna do the processing 107 00:09:07,397 --> 00:09:13,913 it needs to be able to efficiently implement hasPathTo. And then what this 108 00:09:13,913 --> 00:09:20,900 does is for every vertex in the graph. If there's a path from s to that vertex. 109 00:09:21,074 --> 00:09:24,157 It'll print it out. So it prints out all the vertices 110 00:09:24,157 --> 00:09:29,108 connected to x. And that's just one client, of this, data 111 00:09:29,108 --> 00:09:33,035 type. You could, print out the pass or whatever 112 00:09:33,035 --> 00:09:37,415 else you might. So that's our design pattern that we're 113 00:09:37,415 --> 00:09:42,001 gonna use over and over again, for, A graph processing routine. 114 00:09:42,001 --> 00:09:46,712 And it's important to understand why we use a design pattern like this. 115 00:09:46,712 --> 00:09:51,224 We're decoupling the graph representation from the processing of it. 116 00:09:51,224 --> 00:09:56,333 As I mentioned, there's hundreds of routines for, or algorithms that have been 117 00:09:56,333 --> 00:10:00,978 developed for processing graphs. An alternative might be to put all of 118 00:10:00,978 --> 00:10:06,500 those algorithms in one big data type. That's a so called fat interface. 119 00:10:06,714 --> 00:10:12,714 And that would be a, a bad plan, cuz these things maybe are not so well related to 120 00:10:12,714 --> 00:10:17,072 each other. And actually all of them really are just 121 00:10:17,286 --> 00:10:22,286 iterating through the graph, and doing different types of processing. 122 00:10:22,286 --> 00:10:28,067 With this way we're able to separate out. The, and I articulate what the graph 123 00:10:28,067 --> 00:10:32,599 processing clients are doing. And then the real applications can be 124 00:10:32,599 --> 00:10:35,711 clients, of these graph processing routines. 125 00:10:35,914 --> 00:10:41,460 And everybody's taken advantage of an efficient representation, that we already 126 00:10:41,460 --> 00:10:42,925 took care of. Okay. 127 00:10:42,925 --> 00:10:49,220 So now let's look at a demo of how depth-first search is gonna work and then 128 00:10:49,220 --> 00:10:52,734 we'll take a look at the implementation. Okay. 129 00:10:52,734 --> 00:10:58,663 So here's a demo of depth-first search in operation on our sample graph. 130 00:10:58,882 --> 00:11:04,592 Again, to visit a vertex we're gonna mark it, and then recursively visit all 131 00:11:04,592 --> 00:11:07,740 unmarked verti-, vertices that are adjacent. 132 00:11:07,987 --> 00:11:14,168 So this is a sample graph. And so the first thing we do is realize 133 00:11:14,168 --> 00:11:21,090 that we're gonna need a vertex index array to keep track of which vertices are more. 134 00:11:21,090 --> 00:11:28,590 So that will just be an array of bullions and we'll initialize that with all false. 135 00:11:28,590 --> 00:11:32,871 We're also gonna keep another data structure. 136 00:11:32,871 --> 00:11:40,365 A, a vertex indexed array of ints. That for every vertex gives us the vertex 137 00:11:40,365 --> 00:11:45,870 that took us there. So, let's get started and you'll see how 138 00:11:45,870 --> 00:11:49,992 it works. So this is depth-first search staring at 139 00:11:49,992 --> 00:11:54,659 vertex zero. So now to visit vertex zero, we wanna mark 140 00:11:54,659 --> 00:12:00,570 it so that's, our mark zero is true. That's the starting points we know 141 00:12:00,570 --> 00:12:04,770 anything with Edge two. And now what we're gonna do is. 142 00:12:04,770 --> 00:12:10,044 We're going to need to check all the vertices that are adjacent to zero. 143 00:12:10,044 --> 00:12:15,465 So that's six, two, one, and five. The order in which they're checked depends 144 00:12:15,465 --> 00:12:20,886 on the representations in the bag. We don'tr really, necessarily care about 145 00:12:20,886 --> 00:12:24,329 that. Most of the algorithms are going to check 146 00:12:24,329 --> 00:12:28,065 them all. And it doesn't matter that much about the 147 00:12:28,065 --> 00:12:31,361 order. Although, in some cases it's wise to be 148 00:12:31,361 --> 00:12:35,097 mindful. And maybe use a bag that takes them out in 149 00:12:35,097 --> 00:12:38,046 random order. Okay this is zero. 150 00:12:38,046 --> 00:12:44,451 Now we have to check, six, two, one, and five so, lets go ahead and do that. 151 00:12:44,451 --> 00:12:49,276 So in this case six, six is the first thing to get checked. 152 00:12:49,525 --> 00:12:56,762 And so now, we mark, six is visited so now we are going to recursively do a search 153 00:12:56,762 --> 00:13:01,466 starting from six. The other difference when we visit six 154 00:13:01,466 --> 00:13:05,665 from zero. We're going to put a zero in this edge to 155 00:13:05,665 --> 00:13:11,926 entry to say that when we first got the six the way we got there, was from zero. 156 00:13:11,926 --> 00:13:18,187 And that's going to be the data structure that'll help us, implement the client 157 00:13:18,187 --> 00:13:22,310 query and give us the path back to zero from any path. 158 00:13:22,310 --> 00:13:26,317 From any vertex. So okay what do we have to do to visit 159 00:13:26,524 --> 00:13:29,564 six. Well six has two adjacent vertices zero 160 00:13:29,564 --> 00:13:32,535 and four. So we're gonna have to check them. 161 00:13:32,742 --> 00:13:36,127 So first we check zero and that's already marked. 162 00:13:36,334 --> 00:13:42,000 So we don't really have to do anything. We're only suppose to recursively visit 163 00:13:42,000 --> 00:13:45,040 unmarked vertices. And then we check four. 164 00:13:45,040 --> 00:13:50,510 And four is unmarked, so we're going to have to recursively visit is. 165 00:13:50,510 --> 00:13:55,981 The next thing we do is visit four. Mark four as having been visited. 166 00:13:55,981 --> 00:14:01,855 Where the true and the marked array, Fourth entry is a marked array. 167 00:14:01,855 --> 00:14:06,280 And we fill an edge two saying we got to four from six. 168 00:14:06,525 --> 00:14:13,566 And so now to visit four, we have to recursively check five, six and three, and 169 00:14:13,566 --> 00:14:18,070 again, that order is where they happen to be in our bag. 170 00:14:18,295 --> 00:14:23,719 So first we check five. Five is not marked so we're going to visit 171 00:14:23,719 --> 00:14:25,602 five. We're gonna mark it. 172 00:14:25,602 --> 00:14:31,477 Say we got there from four, and then go ahead and visit three, four and zero, in 173 00:14:31,477 --> 00:14:34,716 that order. From first we visit three. 174 00:14:34,942 --> 00:14:40,064 That one also is not yet marked, so we're gonna recursively visit it. 175 00:14:40,290 --> 00:14:45,261 So it's marked three. Say we got there from five and then go 176 00:14:45,261 --> 00:14:50,610 ahead and to visit three recursively, we have to check five and four. 177 00:14:50,796 --> 00:14:54,096 Check five. Well we just came here it's mark, so we 178 00:14:54,096 --> 00:14:58,640 don't have to do anything. Check four, that's also, been marked so we 179 00:14:58,640 --> 00:15:03,060 don't have to do anything. So now finally, this is the first time and 180 00:15:03,060 --> 00:15:08,165 that requires a call that we're ready to return, we're done with that first search 181 00:15:08,165 --> 00:15:11,393 from three. So now we're done with three. 182 00:15:11,393 --> 00:15:18,150 And we can unwinding the recursion, we can now continue our search from five. 183 00:15:18,150 --> 00:15:23,218 And the next thing we have to do from five, we had already checked three, so now 184 00:15:23,218 --> 00:15:27,113 we're gonna check four. And we've already visited four, so we 185 00:15:27,113 --> 00:15:30,080 don't have to do anything. That's already marked. 186 00:15:30,266 --> 00:15:33,356 And we checked zero, and that one's already marked. 187 00:15:33,356 --> 00:15:38,240 So now we're done with five, and we can back one more level up in the recursion. 188 00:15:38,425 --> 00:15:42,752 So now, for four, we have to go through, and look at six and three. 189 00:15:42,938 --> 00:15:46,214 Six is mar, marked, so we don't have to do anything. 190 00:15:46,214 --> 00:15:51,530 Three is marked, so we don't have to do anything, and so we're gonna be done with 191 00:15:51,530 --> 00:15:55,684 four. So that after finishing four we're done 192 00:15:55,684 --> 00:16:00,391 with six. And so now we're in the recursion back at 193 00:16:00,391 --> 00:16:03,607 zero. And we've already checked six. 194 00:16:03,607 --> 00:16:09,349 So now we gotta check two next. We checked two, and so we rehearse and go 195 00:16:09,349 --> 00:16:13,034 there. Mark two, and then say we got there from 196 00:16:13,034 --> 00:16:19,126 zero, and now to visit two, all we check is zero and that's a marks, so we don't 197 00:16:19,126 --> 00:16:22,435 have to do anything, and we're done with two. 198 00:16:22,660 --> 00:16:27,624 Then check one, visit one, that's the last vertex we're visiting. 199 00:16:28,057 --> 00:16:32,073 Check zero, it's already marked so we don't do anything. 200 00:16:32,281 --> 00:16:36,436 We return. Now, we're at the last, step is to, from 201 00:16:36,436 --> 00:16:40,798 zero, five is on it's list, we have to check if we've been there. 202 00:16:41,006 --> 00:16:46,130 We can see that it's marked and we have been there so we're one with zero. 203 00:16:46,415 --> 00:16:55,649 So that's a depth-first search from vertex zero, and we have visited all the vertices 204 00:16:55,649 --> 00:17:01,700 that are reachable from zero. Number one and number two for each one of 205 00:17:01,700 --> 00:17:06,014 those vertexes we kept track of how we got there from zero. 206 00:17:06,014 --> 00:17:12,083 So if we now want to know for any one of those vertices how to get back to zero we 207 00:17:12,083 --> 00:17:18,005 have the information that we need. For example say we want to find the path 208 00:17:18,225 --> 00:17:23,197 from five back to zero. We know we got the five from four, we know 209 00:17:23,197 --> 00:17:28,973 we got the four from six, we know we got the six from zero so we can go back 210 00:17:28,973 --> 00:17:35,155 through using that edge to array to find. The path, so the depth for search 211 00:17:35,155 --> 00:17:41,563 calculation built in data structures, and now clients, whose data structures built 212 00:17:41,563 --> 00:17:48,250 in a constructor serve as the basis for, being able to. 213 00:17:48,250 --> 00:17:53,760 Officially answer client queries. That's a depth-first search demo. 214 00:17:53,760 --> 00:17:59,247 So, this is just a summary of the thing I talked about, during that demo. 215 00:17:59,461 --> 00:18:04,450 Our goal is to find all the vertices connected to different vertex at. 216 00:18:04,450 --> 00:18:09,570 And also a path, in order to be able to answer client query. 217 00:18:09,570 --> 00:18:15,117 On the algorithm we're going to use is based on like maze exploration where we 218 00:18:15,117 --> 00:18:20,393 use excursion, mark each vertex, keep track of the edge we took to visit it and 219 00:18:20,596 --> 00:18:26,843 return when there's no unvisited options. We're using, two data structures, to 220 00:18:26,843 --> 00:18:30,655 implement this. Both vertex indexed arrays. 221 00:18:30,909 --> 00:18:36,161 One named Mark that will tell us which vertices we've been to. 222 00:18:36,161 --> 00:18:41,413 And another one, edge two that maintains that tree of paths. 223 00:18:41,667 --> 00:18:48,190 Where edge 2W = V means that VW was taken, the first time that we went to W. 224 00:18:48,190 --> 00:18:52,844 So now, let's look at the code, that, given all of this background. 225 00:18:53,062 --> 00:18:57,935 The code for implementing depth first search is remarkably compact. 226 00:18:58,153 --> 00:19:02,953 So here's our private instance variables. The marked and edgedTo vertex and mix 227 00:19:02,953 --> 00:19:08,117 arrays, and the source s. And the constructor just goes through and, 228 00:19:08,335 --> 00:19:13,644 creates, the arrays and initializes them. And we won't repeat that code. 229 00:19:13,862 --> 00:19:20,334 And so here's the, the last thing the constructor does after it creates the 230 00:19:20,334 --> 00:19:25,197 arrays, is does a DFS on the graph, from the given source. 231 00:19:25,197 --> 00:19:31,213 And it's a remarkably, compact implementation to do depth-first search, 232 00:19:31,460 --> 00:19:36,569 from a vertex V. What we do is mark V, let's say mark it 233 00:19:36,569 --> 00:19:39,701 true. Then for everybody adjacent to V. 234 00:19:39,948 --> 00:19:45,387 We check if it's marked. If it's not marked, then we do a recursive 235 00:19:45,387 --> 00:19:49,649 call. And we set edge to w equals v. 236 00:19:49,970 --> 00:19:56,280 Again remarkably compact code that gets the job done. 237 00:19:57,406 --> 00:20:02,388 So now let's look at some of the properties of depth-first search. 238 00:20:02,615 --> 00:20:09,409 So first thing is we wanna be sure that convince ourselves that it marks all the 239 00:20:09,409 --> 00:20:14,919 vertices connected to S in time proportional to some of their degrees, 240 00:20:14,919 --> 00:20:18,845 well, depth-first graph is going to be small. 241 00:20:19,071 --> 00:20:24,960 So s-, so first thing is, convince yourself that if you mark the vertex, then 242 00:20:24,960 --> 00:20:28,810 there has to be a way to get to that vertex from S. 243 00:20:29,036 --> 00:20:34,779 So well that's easy to see, because the only way to mark vertex is get there 244 00:20:34,779 --> 00:20:40,976 through a sequence of recursive calls, and every recursive calls corresponds to an 245 00:20:40,976 --> 00:20:46,344 edge on a path from SVW. But you also have to be able to show that 246 00:20:46,344 --> 00:20:50,215 you get to, every vertex that's connected to S. 247 00:20:50,215 --> 00:20:56,692 And that's a little more intricate. And this diagram is, supposed to help you 248 00:20:56,692 --> 00:21:02,379 out in understanding that. If you had, some unmarked vertex, Then, 249 00:21:02,616 --> 00:21:06,250 maybe there's, a bunch of unmarked vertices. 250 00:21:06,250 --> 00:21:10,130 And so... And it's connected to S and it's not 251 00:21:10,130 --> 00:21:17,047 marked, so that means there has to be an edge on a path from S to W, that goes from 252 00:21:17,047 --> 00:21:22,604 a marked vertex to an unmarked one. But the design of the algorithm says that 253 00:21:22,604 --> 00:21:27,407 there's no such edge if you're on a marked vertex then you're gonna go through and 254 00:21:27,407 --> 00:21:31,690 look at all the adjacent ones and if it's not marked, you're gonna mark it. 255 00:21:31,690 --> 00:21:36,965 So that's an outline of the proof that DFS marks all the vertices. 256 00:21:36,965 --> 00:21:43,199 And in the running time, it only visits each marked vertex once or each vertex 257 00:21:43,199 --> 00:21:47,515 connected as once. And so, for each one of them, it goes 258 00:21:47,515 --> 00:21:52,951 through all the adjacent vertices. So that's the basic properties of 259 00:21:52,951 --> 00:21:58,323 depth-first search. So now, the other thing that is important 260 00:21:58,323 --> 00:22:04,894 is that a client who has, uses this algorithm after the depth-first search, 261 00:22:04,894 --> 00:22:11,554 after the constructor has done the depth-first search and built these data 262 00:22:11,554 --> 00:22:16,260 structures, Client can find the vertices connected to 263 00:22:16,260 --> 00:22:22,298 the source in constant time. And can find a path, 2S if one exists in 264 00:22:22,298 --> 00:22:29,459 time proportional to its length. Well, the marked array provides the first 265 00:22:29,459 --> 00:22:35,811 part. And the second part is Just the property 266 00:22:35,811 --> 00:22:41,781 of the edge to array. It's a, what's called a parent link 267 00:22:41,781 --> 00:22:49,774 representation of a tree rooted at S. So if a vertex is connected to S then its 268 00:22:49,774 --> 00:22:54,530 edge two is parent in a tree. So this code here. 269 00:22:54,754 --> 00:23:00,952 Is going to for a given, well has path too so that's just return mark, that's the 270 00:23:00,952 --> 00:23:04,911 first part. And then to actually get the path to a 271 00:23:04,911 --> 00:23:08,869 given vertex so, here's the code for doing that. 272 00:23:09,093 --> 00:23:15,516 We actually use a stack to keep track of the path'cause we get it in reverse order. 273 00:23:15,740 --> 00:23:22,291 If there's no path, we return null. Otherwise we keep a variable X and we just 274 00:23:22,291 --> 00:23:29,212 follow up through the edge to array Pushing the vertex on to the stack and 275 00:23:29,212 --> 00:23:35,121 then moving up the tree in the ray, then finally push, push as itself on to the 276 00:23:35,121 --> 00:23:40,737 path and then we have a stack which is edible which will give us our path. 277 00:23:40,956 --> 00:23:47,302 So that's in time, time proportional to the length of the path and forth while to 278 00:23:47,302 --> 00:23:53,649 check your understanding of how stacks in real works, irreversible to take a look at 279 00:23:53,649 --> 00:24:00,730 this code to see that it does the job. So that's depth-first search. 280 00:24:01,114 --> 00:24:06,092 Now it's not. Of the optimal graph searching method for 281 00:24:06,092 --> 00:24:10,700 all applications. And here's an, an amusing representation 282 00:24:10,700 --> 00:24:15,028 of how depth first search can maybe create problems sometimes. 283 00:24:15,028 --> 00:24:19,566 So, I'm getting ready for a date, what situations do I prepare for? 284 00:24:19,776 --> 00:24:23,266 Well, medical emergency, dancing, food too expensive. 285 00:24:23,476 --> 00:24:27,246 Okay, what kind of medical emergencies could happen? 286 00:24:27,246 --> 00:24:32,272 Well, I could be snake bite or a lightning strike or a fall from a chair. 287 00:24:32,272 --> 00:24:37,160 Well, what about snakes, I have to worry about corn snakes or garder. 288 00:24:37,160 --> 00:24:41,432 Say for copperhead. And then well, I better make a straight. 289 00:24:41,432 --> 00:24:45,985 I better study snakes. And then the date says, I'm here to pick 290 00:24:45,985 --> 00:24:47,806 you up. You're not dressed. 291 00:24:48,016 --> 00:24:52,149 And well, so I really need to stop using depth-first search. 292 00:24:52,359 --> 00:24:56,702 So we're gonna look at other graph searching algorithms. 293 00:24:56,702 --> 00:25:02,936 But if you always try to expand the next thing that you come to, that's depth-first 294 00:25:02,936 --> 00:25:06,228 search. And there's a lot of natural, situations 295 00:25:06,228 --> 00:25:10,817 where that naturally comes to mind. Here's another example. 296 00:25:11,039 --> 00:25:16,895 I took this photo of the Taj Mahal a couple of years ago and I didn't like the 297 00:25:16,895 --> 00:25:21,416 color of the sky. So I used Photoshop's magic wand to make 298 00:25:21,416 --> 00:25:25,713 it more blue. And the implementation, now this is a huge 299 00:25:25,713 --> 00:25:28,972 graph. This picture's got millions of pixels. 300 00:25:28,972 --> 00:25:34,454 And the way that the flood filled the magic wand works, is to build, from a 301 00:25:34,454 --> 00:25:40,231 photo, what's called a grid graph, where every vertex is a pixel and every edge 302 00:25:40,231 --> 00:25:45,639 connects two pixels that are the same color, approximately the same color. 303 00:25:45,639 --> 00:25:51,342 And it builds a blob of all the pixels that have the same color as the given 304 00:25:51,342 --> 00:25:54,380 pixel. So when I click on one, it does the 305 00:25:54,380 --> 00:26:00,004 depth-first search to find all. All the connected pixels and therefore 306 00:26:00,004 --> 00:26:06,346 change them to the new color that's a fine example of depth per search on a huge 307 00:26:06,346 --> 00:26:11,696 graph that people use everyday. So that's our first nontrivial graph 308 00:26:11,696 --> 00:26:14,460 processing algorithm depth-first search.