1 00:00:06,670 --> 00:00:10,391 In this video, I'm going to design one more function operating on arbitrary 2 00:00:10,391 --> 00:00:13,980 arity trees, so much of this will be familiar. 3 00:00:15,470 --> 00:00:19,069 But the design of this function introduces a new and important concept 4 00:00:19,069 --> 00:00:22,780 called backtracking search. So, let's go find the problem. 5 00:00:22,780 --> 00:00:28,000 It's in fs-v3.racket. And we'll scroll down here. 6 00:00:28,000 --> 00:00:35,192 And we're supposed to design a function that consumes string and element. 7 00:00:35,192 --> 00:00:38,900 Let me write that down. String element. 8 00:00:38,900 --> 00:00:41,788 And looks for a data element with the given name. 9 00:00:41,788 --> 00:00:45,600 If it finds that element it produces the data. 10 00:00:45,600 --> 00:00:50,496 Otherwise it produces false. It produces the data, let's see what that 11 00:00:50,496 --> 00:00:54,413 means. So, I'm looking in one of these trees for 12 00:00:54,413 --> 00:01:00,222 an element with a given name. Alright, the string is the name so that's 13 00:01:00,222 --> 00:01:04,657 why I got the string as an argument. I'm looking for an element with the name 14 00:01:04,657 --> 00:01:08,528 equal to the sting I was given. And if I find it I produce the data, 15 00:01:08,528 --> 00:01:11,833 which is an integer. So, going back down here to where we 16 00:01:11,833 --> 00:01:15,651 were, we're going to produce integer if we find it. 17 00:01:15,651 --> 00:01:19,707 And otherwise we produce false. So, that's different. 18 00:01:19,707 --> 00:01:26,100 That's something we've never done before. What's happening here is this function is 19 00:01:26,100 --> 00:01:29,200 going to go look for an element with the given name and sometimes it's going to 20 00:01:29,200 --> 00:01:33,045 produce the data, otherwise it's going to produce false. 21 00:01:33,045 --> 00:01:36,931 Well, written functions that say whether a given string in a listed string, that 22 00:01:36,931 --> 00:01:41,060 that function always produces a Boolean, true or false? 23 00:01:41,060 --> 00:01:44,708 Here, we're going to produce the data if we find it or false otherwise and we need 24 00:01:44,708 --> 00:01:50,270 a little extension to how we write function signatures to do this. 25 00:01:50,270 --> 00:01:54,526 And what you're going to do is you're going to say, whatever type it would 26 00:01:54,526 --> 00:01:59,302 produce if it succeeds or false. So, this is a special form of signature 27 00:01:59,302 --> 00:02:01,850 that's particularly applicable to backtracking search. 28 00:02:04,360 --> 00:02:07,880 Now, in some programming languages, in most recent programming languages, 29 00:02:07,880 --> 00:02:11,620 there's a more sophisticated mechanism called exceptions for dealing with this 30 00:02:11,620 --> 00:02:16,570 kind of case. To keep BSL simple, BSL doesn't have 31 00:02:16,570 --> 00:02:21,076 exceptions. But the basic way we're doing 32 00:02:21,076 --> 00:02:26,592 backtracking here can easily be adapted to a language that does have exceptions. 33 00:02:26,592 --> 00:02:33,410 So let's see. Search a given tree for an element with 34 00:02:33,410 --> 00:02:45,527 the given, given, given name. Produce data if found false otherwise. 35 00:02:45,527 --> 00:02:50,676 Let's do some stubs. We're operating on two mutually 36 00:02:50,676 --> 00:02:54,105 referential types. So, there's going to be two functions. 37 00:02:54,105 --> 00:03:01,133 Define fine, dash dash element. Which will consume an element. 38 00:03:01,133 --> 00:03:05,650 But it will also consume a name. Some string. 39 00:03:05,650 --> 00:03:07,760 And for the stub, we'll just assume it always fails. 40 00:03:07,760 --> 00:03:11,976 It'll produce false. And define find, dash, dash. 41 00:03:11,976 --> 00:03:17,696 List of element, which will consume a list of element, but it will also consume 42 00:03:17,696 --> 00:03:25,348 the name that it's looking for, false. Okay, so those are the stubs and we 43 00:03:25,348 --> 00:03:31,283 should put the signature for the second function. 44 00:03:31,283 --> 00:03:40,670 ListOfElement, string and list of element. 45 00:03:40,670 --> 00:03:45,540 And again we never quite now here for sure, but let's put integer or false. 46 00:03:45,540 --> 00:03:53,356 Okay, let's do some examples. Let's start with the base case, we know 47 00:03:53,356 --> 00:03:57,708 that in this case, the base case is the -loe function called with empty for the 48 00:03:57,708 --> 00:04:02,591 loe. That's where the mututal reference and 49 00:04:02,591 --> 00:04:09,220 self-reference cycles all stop. So, we'll say check, expect. 50 00:04:09,220 --> 00:04:13,965 Find-loe, let's say we're looking for F3 in empty and of course we can't find F3 51 00:04:13,965 --> 00:04:20,740 in empty because it's not anything in empty at all so that'll produce false. 52 00:04:20,740 --> 00:04:24,915 Now, let's do another simple case. Let's say, check, expect, let's look It's 53 00:04:24,915 --> 00:04:28,970 find dash dash loe and find dash dash element. 54 00:04:28,970 --> 00:04:40,980 Let's say we're looking for something named F3 in F1. 55 00:04:40,980 --> 00:04:46,600 Well, that's false. There's nothing named F3 in F1. 56 00:04:46,600 --> 00:04:50,415 And most of these other tests are going to be find an element. 57 00:04:50,415 --> 00:04:52,680 So, we'll make that something we could copy. 58 00:04:52,680 --> 00:04:56,898 But what if we look for F3 and F3? Well, if we look for F3 and F3, we do 59 00:04:56,898 --> 00:05:00,586 find it. We find it right away, 'cuz F3 is F3, and 60 00:05:00,586 --> 00:05:03,660 so we return the data, which is three. Okay. 61 00:05:03,660 --> 00:05:11,785 So, that's the base case, and two very simple, immediate success or immediate 62 00:05:11,785 --> 00:05:22,050 failure cases, let's try to do another case, let's look for F3 in D4. 63 00:05:22,050 --> 00:05:26,105 Well, that's false we don't find F3 in D4. 64 00:05:26,105 --> 00:05:33,341 But what about looking for F1 in D4. Well, we do find that F1 is the first 65 00:05:33,341 --> 00:05:38,240 child of D4. So, this is going to produce 1. 66 00:05:38,240 --> 00:05:41,642 But what about a case where what we're looking for is not the first child of D4 67 00:05:41,642 --> 00:05:46,874 but one of the later childs of D4. We should try that case because it kind 68 00:05:46,874 --> 00:05:50,636 of corresponds to searching in a list and making sure we search beyond the first 69 00:05:50,636 --> 00:05:55,616 element. So, if I look for F2 in D4, I'm going to 70 00:05:55,616 --> 00:06:01,738 find, so the data will be 2. Seems like we should also try a case 71 00:06:01,738 --> 00:06:06,040 where we're looking in an element that has some. 72 00:06:06,040 --> 00:06:09,319 But what we're looking for is actually the element itself. 73 00:06:09,319 --> 00:06:16,610 So, let's see what happens when we look for D4 and D4. 74 00:06:16,610 --> 00:06:26,290 Well, anything that has subs has data 0. So, this is going to produce 0. 75 00:06:26,290 --> 00:06:34,830 And let's try one more big one, find--element. 76 00:06:34,830 --> 00:06:42,325 Let's try looking for F3 in D6. That's the far-right child. 77 00:06:42,325 --> 00:06:48,173 And that would produce three. So, that seems like a pretty exhaustive 78 00:06:48,173 --> 00:06:52,236 set of tests. Let's run those tests to make sure 79 00:06:52,236 --> 00:06:56,020 they're well formed. They're all running and most are failing 80 00:06:56,020 --> 00:06:59,380 so they are well formed and let's save that. 81 00:06:59,380 --> 00:07:07,443 Okay, now let's go get the template. There's 2 of them of course, here they 82 00:07:07,443 --> 00:07:11,630 are. We'll go back down to the function that 83 00:07:11,630 --> 00:07:17,730 we're working on, we'll come in at the two stubs, placed in the templates, 84 00:07:17,730 --> 00:07:26,831 uncomment those. Now, there the remaining find--element is 85 00:07:26,831 --> 00:07:34,375 that and here is the natural mutual recursion. 86 00:07:34,375 --> 00:07:37,450 Copy that. Here's the function name. 87 00:07:37,450 --> 00:07:42,235 Here's the natural recursion and here's the natural mutual recursion. 88 00:07:42,235 --> 00:07:47,044 And there's one more thing to do here. These are functions that can consume two 89 00:07:47,044 --> 00:07:50,270 arguments. They consume the element or the list of 90 00:07:50,270 --> 00:07:54,970 element and also the name of the element we're looking for. 91 00:07:54,970 --> 00:08:01,261 So, we have to add a parameter n there. And we have to add a parameter n there. 92 00:08:01,261 --> 00:08:06,152 And remember, I said whenever you have a recursion and it also applies for mutual 93 00:08:06,152 --> 00:08:10,428 recursion. When you add a parameter, you should be 94 00:08:10,428 --> 00:08:14,322 sure to go put a note to yourself at each recursive or mutually recursive call to 95 00:08:14,322 --> 00:08:20,220 say, hey, you're going to need something here, you're going to need a parameter. 96 00:08:20,220 --> 00:08:24,880 Now, we could wait till later to fill in those question marks. 97 00:08:24,880 --> 00:08:28,906 But in this case, I know that once I start looking for given n, the whole way 98 00:08:28,906 --> 00:08:33,500 down the tree, I'm going to be looking for the same n. 99 00:08:33,500 --> 00:08:38,702 So, n is never going to change, n is always the same value. 100 00:08:38,702 --> 00:08:44,210 So, I'm just going to keep passing n in all the recursive calls. 101 00:08:44,210 --> 00:08:46,448 I could feel it in now. Now, if that's not clear to you or if you 102 00:08:46,448 --> 00:08:49,220 don't think it would have clear to you this early, that's fine, you can leave 103 00:08:49,220 --> 00:08:53,926 the question mark and fill them in later. I can just tell now that this function 104 00:08:53,926 --> 00:08:58,238 doesn't change the end as it goes so I might as well fill it in now. 105 00:08:58,238 --> 00:09:01,928 Okay, now let's see. Start at the base case example, if we 106 00:09:01,928 --> 00:09:06,344 find loe in empty, that's false, that puts a false here, which of course makes 107 00:09:06,344 --> 00:09:13,035 sense if you're looking for something in nothing and you can't find it. 108 00:09:13,035 --> 00:09:19,810 But here's a case that says, if we're looking for F3 and F1, we don't find it. 109 00:09:19,810 --> 00:09:24,410 If we're looking for F3 and F3, we do find it. 110 00:09:24,410 --> 00:09:28,230 So, it seems like that when we're looking at an element, there's kind of two cases. 111 00:09:28,230 --> 00:09:31,480 This is the case where we find it right away. 112 00:09:31,480 --> 00:09:35,250 Right there and right there we find it right away in those two cases. 113 00:09:38,040 --> 00:09:40,680 And then there's the case where we don't find it right away. 114 00:09:40,680 --> 00:09:48,028 So, that suggest an if in find--element. And the question is you know, is the name 115 00:09:48,028 --> 00:09:52,760 of this current element the name that we're looking for? 116 00:09:52,760 --> 00:09:56,130 So, both names are string. This nice little note here that I put in 117 00:09:56,130 --> 00:10:02,172 the template reminds me of that. Is the name of this element the name I'm 118 00:10:02,172 --> 00:10:09,076 looking for is the question. One thing about this little note I put in 119 00:10:09,076 --> 00:10:12,734 the template, in some other kinds of programming languages and in more 120 00:10:12,734 --> 00:10:20,020 sophisticated programming environments. This would be automatically given to you. 121 00:10:20,020 --> 00:10:22,540 You could just kind of hover here and it would say, oh, that's going to produce a 122 00:10:22,540 --> 00:10:25,240 string. So, we're just replicating a feature that 123 00:10:25,240 --> 00:10:27,900 you might get in some other languages already. 124 00:10:27,900 --> 00:10:32,657 I like putting it here explicitly to help you understand kind of where it's really 125 00:10:32,657 --> 00:10:35,620 coming from. We'll delete it now. 126 00:10:35,620 --> 00:10:38,988 So, let's see. If the name of the current element is the 127 00:10:38,988 --> 00:10:41,390 name we're looking for, then what are we supposed to do? 128 00:10:41,390 --> 00:10:45,600 Well, we're supposed to produce that element's data. 129 00:10:46,820 --> 00:10:49,804 So, there we go. We'll just produce its data. 130 00:10:49,804 --> 00:10:54,783 And if the name we're looking for isn't the current element. 131 00:10:54,783 --> 00:10:58,600 Well, what are we supposed to do in that case? 132 00:10:58,600 --> 00:11:02,790 Well, these other examples are saying that you keep looking. 133 00:11:02,790 --> 00:11:06,502 We got to go look in all the children. In other words, we got to do the natural 134 00:11:06,502 --> 00:11:10,467 neutral recursion and we should follow the standard advice, which is to trust 135 00:11:10,467 --> 00:11:15,730 the natural neutral recursion. This is going to go look in the children 136 00:11:15,730 --> 00:11:19,123 and whatever it produces will be the answer. 137 00:11:19,123 --> 00:11:22,807 Now, lets go down here. Now, we're looking at a list of elements 138 00:11:22,807 --> 00:11:26,272 and we don't actually have an example for this. 139 00:11:26,272 --> 00:11:30,260 So, let's put in an example for this. Let's say check, I'm going to put it 140 00:11:30,260 --> 00:11:36,128 right here because it's the, it's really a little bit simpler than the D4 example. 141 00:11:36,128 --> 00:11:45,363 I'm going to say find element. Find--loe I mean, in cons F1, cons F2 142 00:11:45,363 --> 00:11:49,960 empty. This is the subs of D4. 143 00:11:49,960 --> 00:11:58,649 Okay, this is the subs of D4. So, actually I'm not putting this in the 144 00:11:58,649 --> 00:12:02,150 right place. I should really be putting this up here. 145 00:12:02,150 --> 00:12:07,779 This is the sub of D4. And I'm finding, what do I want to find 146 00:12:07,779 --> 00:12:11,902 in there? Lets say I want to find F2. 147 00:12:11,902 --> 00:12:20,430 Can I find F2 in there? Yes, that should produce 2. 148 00:12:20,430 --> 00:12:24,738 And we'll add another one. Can I find F3 in there? 149 00:12:24,738 --> 00:12:35,143 And the answer to that is false. Now, I can really understand that these 150 00:12:35,143 --> 00:12:39,812 examples all have a bit more structure. It is really the base case example, 151 00:12:39,812 --> 00:12:44,882 there's the has no children examples when we're looking in F but not looking in 152 00:12:44,882 --> 00:12:52,147 anything that has children. And then this example here is, is a quite 153 00:12:52,147 --> 00:12:56,159 similar example because I'm moving it up here now because its really an immediate 154 00:12:56,159 --> 00:13:00,664 success example, this doesn't go looking at children. 155 00:13:00,664 --> 00:13:05,120 So, we look at nothing, look at things that have no children. 156 00:13:05,120 --> 00:13:09,980 We don't look in the children. Here, we look in lists of children. 157 00:13:09,980 --> 00:13:13,372 And now we start looking in things where we have to look at the element itself, 158 00:13:13,372 --> 00:13:17,076 and its children. So now, I've got the examples ordered 159 00:13:17,076 --> 00:13:21,536 more in terms of their complexity. And so find, dash, dash, loe, is really 160 00:13:21,536 --> 00:13:26,860 kind of here and here. Now, let's see. 161 00:13:26,860 --> 00:13:29,770 What are these two mutual recursions going to do? 162 00:13:29,770 --> 00:13:32,240 These two natural, mutual recursions, what are they going to do? 163 00:13:32,240 --> 00:13:44,840 This will produce integer or false depending on whether it's found in first 164 00:13:44,840 --> 00:13:52,829 of loe. Let's make that a little bit more 165 00:13:52,829 --> 00:13:58,949 compact. If n is found in first of loe. 166 00:13:58,949 --> 00:14:09,579 And this second one will produce integer false is n is found in rest of loe. 167 00:14:11,880 --> 00:14:15,410 Now, if we find it in the first, there's no use looking in the rest. 168 00:14:15,410 --> 00:14:18,202 We're supposed to find one of them, we don't need to find more than one. 169 00:14:18,202 --> 00:14:27,060 So, that suggests an if. Now what's the question? 170 00:14:27,060 --> 00:14:31,208 Well if this thing produces false If it can't find it, this natural mutual 171 00:14:31,208 --> 00:14:35,320 incursion produces False, if it can't find it. 172 00:14:35,320 --> 00:14:40,590 So, and I apologize for the double negative here, but if it doesn't produce 173 00:14:40,590 --> 00:14:45,450 False that means it found it in the first. 174 00:14:45,450 --> 00:14:51,990 So, they way I'm going to phrase this question is, if not false that and that 175 00:14:51,990 --> 00:14:58,229 will mean is its is it found in the first? 176 00:14:58,229 --> 00:15:05,077 Is, it's found in first loe and if it is found in first loe well then that's what 177 00:15:05,077 --> 00:15:11,562 we want. That value that came back, which is going 178 00:15:11,562 --> 00:15:17,844 to be an integer is what we want. And if it isn't found in the first of 179 00:15:17,844 --> 00:15:21,896 loe, well, then we better go look in the rest of the loe. 180 00:15:21,896 --> 00:15:29,718 That's the natural recursion. Now, let me just note this question here, 181 00:15:29,718 --> 00:15:33,710 which I've phrased as not false, find--element. 182 00:15:33,710 --> 00:15:36,464 I can also phrase this question as (integer? 183 00:15:36,464 --> 00:15:42,650 (find--element-n (first loe))). In other words, I could have used 184 00:15:42,650 --> 00:15:46,060 integer? Instead of not-false. 185 00:15:46,060 --> 00:15:49,234 The reason I used not-false is that it's more general for backtracking functions, 186 00:15:49,234 --> 00:15:55,462 when we see other backtracking functions. We'll have the same not false pattern. 187 00:15:55,462 --> 00:16:03,272 So, let's see. Maybe hopefully, I'm done, and I am. 188 00:16:03,272 --> 00:16:08,268 All the tests pass. Now, why is this called back tracking? 189 00:16:08,268 --> 00:16:12,040 Well, let's just step through a couple examples. 190 00:16:12,040 --> 00:16:16,200 Let's add one more example to this. Will put it up here. 191 00:16:16,200 --> 00:16:22,942 Let's look in D6 for D6. Well, if we look at this tree, we start 192 00:16:22,942 --> 00:16:34,170 at D6 and we're looking for D6 and immediately we find it so we're done. 193 00:16:37,330 --> 00:16:43,410 Now, here's another example. Let's look at D6. 194 00:16:43,410 --> 00:16:50,250 We'll add one more here. Let's look in D6 for F1. 195 00:16:50,250 --> 00:16:52,394 Whenever I add an example, I actually should run it. 196 00:16:52,394 --> 00:16:56,110 Just to be sure it's working. And they are. 197 00:16:56,110 --> 00:17:02,330 Now, let's look in D6 for F1. What happens is we start at the top of 198 00:17:02,330 --> 00:17:07,980 the tree, we start at D6. It's not F1, so we go to the subs. 199 00:17:07,980 --> 00:17:10,567 We go to the first of the subs, which is D4. 200 00:17:10,567 --> 00:17:14,800 It's not F1. So, we go to the subs. 201 00:17:14,800 --> 00:17:19,550 We go to the first of the subs, which is F1 and it is F1. 202 00:17:19,550 --> 00:17:22,330 So, we found what we want and we produce one. 203 00:17:23,980 --> 00:17:30,520 And now, let's look at what happens in this case, where we look in D6 for F3. 204 00:17:30,520 --> 00:17:34,520 Well, again, we start at D6. We haven't found F3. 205 00:17:34,520 --> 00:17:38,260 We go to D4. We haven't found F3. 206 00:17:38,260 --> 00:17:40,703 We go to F1. We haven't found F3. 207 00:17:43,460 --> 00:17:51,585 We go up to D4, now we look at F2 we still haven't found F3. 208 00:17:51,585 --> 00:17:56,709 We go back up to D4, we go back up to D6, now we come down to D5 we still haven't 209 00:17:56,709 --> 00:18:05,540 found it, we go down to F3, we found it. And each time we get to a failing node 210 00:18:05,540 --> 00:18:09,632 with no sub elements, those are called leaves because they have no branches, no 211 00:18:09,632 --> 00:18:15,420 branches anymore. Each time we get to a failing leaf and we 212 00:18:15,420 --> 00:18:19,775 go back up to its parent and then try the next. 213 00:18:19,775 --> 00:18:24,820 The next sub, that's called back tracking. 214 00:18:24,820 --> 00:18:29,955 We back up to the previous branch and go down the next branch and that's why this 215 00:18:29,955 --> 00:18:35,011 is called back tracking surge because it has this pattern of going all the way 216 00:18:35,011 --> 00:18:41,020 down to branches, looking for what we want. 217 00:18:41,020 --> 00:18:46,300 And anywhere we fail, we produce false to go back to the next branch and then that 218 00:18:46,300 --> 00:18:53,053 goes to the next sub, 'kay? And this if not false of looking in the 219 00:18:53,053 --> 00:19:00,060 first branch, pattern is emblematic of backtracking search. 220 00:19:00,060 --> 00:19:02,682 And as I said, we'll see more back-tracking search examples later in 221 00:19:02,682 --> 00:19:03,810 the course.