In this sequence of lectures, we're going to learn asymptotic analysis. This is the language by which every serious computer programmer and computer scientist discusses the high-level performance of computer algorithms. As such, it's a totally crucial topic. In this video, the plan is to segue between the high-level discussion that you've already seen in the course introduction and the mathematical formalism that we're going to start developing in the next video. Before getting into that mathematical formalism, however, I want to make sure that the topic is well motivated, that you have solid intuition for what it's trying to accomplish, and also that you've seen a couple of simple, intuitive examples. Let's get started.

Asymptotic analysis provides the basic vocabulary for discussing the design and analysis of algorithms. And while it is a mathematical concept, it is by no means math for math's sake. You will very frequently hear serious programmers saying that such-and-such code runs in O(n) time, while such-and-such other code runs in O(n²) time. It's important that you know what programmers mean when they make statements like that.

The reason this vocabulary is so ubiquitous is that it identifies a sweet spot for discussing the high-level performance of algorithms. What I mean by that is that it is, on the one hand, coarse enough to suppress all of the details that you want to ignore: details that depend on the choice of architecture, the choice of programming language, the choice of compiler, and so on. On the other hand, it's sharp enough to be useful, in particular to make predictive comparisons between different high-level algorithmic approaches to solving a common problem. This is going to be especially true for large inputs; remember, as we discussed, large inputs are in some sense the interesting ones, the ones for which we need algorithmic ingenuity. For example, asymptotic analysis will allow us to differentiate between better and worse approaches to sorting, better and worse approaches to multiplying two integers, and so on.

Now, if you ask most serious programmers what the deal with asymptotic analysis is anyway, they'll tell you, reasonably, that its main point is to suppress both leading constant factors and lower-order terms.
Now, as we'll see, there's more to asymptotic analysis than just those seven words. But long term, ten years from now, if you only remember seven words about asymptotic analysis, I'll be reasonably happy if those are the seven words you remember.

So how do we justify adopting a formalism which, essentially by definition, suppresses constant factors and lower-order terms? Well, lower-order terms, basically by definition, become increasingly irrelevant as you focus on large inputs, which, as we've argued, are the interesting inputs, the ones where algorithmic ingenuity is important. As for constant factors, these are going to be highly dependent on the details of the environment, the compiler, the language, and so on. So if we want to ignore those details, it makes sense to have a formalism that doesn't focus unduly on leading constant factors.

Here's an example. Remember that when we analyzed the merge sort algorithm, we gave an upper bound on its running time of 6n log n + 6n, where n was the input length, the number of numbers in the input array. The lower-order term here is the 6n: it grows more slowly than n log n, so we just drop it. The leading constant factor is the 6, so we suppress that as well. After those two suppressions, we're left with the much simpler expression n log n.

The terminology would then be to say that the running time of merge sort is O(n log n). In other words, when you say that an algorithm's running time is O(f(n)) for some function f, what you mean is that after you've dropped the lower-order terms and suppressed the leading constant factor, you're left with the function f(n). Intuitively, that is what the big-O notation means.

To be clear, I'm certainly not asserting that constant factors never matter when you're designing and analyzing algorithms. Rather, I'm just saying that when you think about high-level algorithmic approaches, when you might want to make a comparison between fundamentally different ways of solving a problem, asymptotic analysis is often the right tool for giving you guidance about which one is going to perform better, especially on reasonably large inputs.
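To see concretely how the lower-order term fades away on large inputs, here's a quick numerical sketch in Python. It's purely illustrative, using the 6n log n + 6n bound from the merge sort analysis:

```python
import math

# Compare merge sort's upper bound 6*n*log2(n) + 6*n with the
# simplified n*log2(n) that big-O keeps. As n grows, the ratio
# settles toward the constant 6: the lower-order term 6n fades,
# and the leftover factor of 6 is exactly what big-O suppresses.
for n in [10, 1000, 100_000, 10_000_000]:
    full = 6 * n * math.log2(n) + 6 * n
    simple = n * math.log2(n)
    print(f"n = {n:>10}: ratio = {full / simple:.2f}")
```

For n = 10 the ratio is about 7.8; by n = 10,000,000 it has dropped to about 6.3, essentially just the suppressed constant factor.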
Now, once you've committed to a particular algorithmic solution to a problem, of course you might want to then work harder to improve the leading constant factor, perhaps even to improve the lower-order terms. By all means: if the future of your startup depends on how efficiently you implement some particular set of lines of code, have at it, and make it as fast as you can.

In the rest of this video, I want to go through four very simple examples. In fact, these examples are so simple that if you have any experience with big-O notation, you're probably better off just skipping the rest of this video and moving on to the mathematical formalism that we begin in the next video. But if you've never seen it before, I hope these simple examples will get you oriented.

So let's begin with a very basic problem: searching an array for a given integer. Let's analyze the straightforward algorithm for this problem, where we just do a linear scan through the array, checking each entry to see if it is the desired integer t. That is, the code just checks each array entry in turn. If it ever finds the integer t, it returns TRUE; if it falls off the end of the array without finding it, it returns FALSE.
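If it helps to see this as runnable code, here's a minimal Python sketch of that linear scan. The slide shows pseudocode; the names A and t follow the problem statement, and the rest is one plausible rendering:

```python
def linear_search(A, t):
    # Check each array entry in turn.
    for i in range(len(A)):
        if A[i] == t:
            # Found the target: report success immediately.
            return True
    # Fell off the end of the array without finding t.
    return False
```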
So what do you think? We haven't formally defined big-O notation, but I've given you an intuitive description. What would you say is the running time of this algorithm as a function of the length n of the array A?

The answer I'm looking for is C: O(n). Equivalently, we would say that the running time of this algorithm is linear in the input length n. Why is that true? Well, let's think about how many operations this piece of code is going to execute. Actually, the number of lines of code executed is going to depend on the input: it depends on whether or not the target t is contained in the array A, and if so, where in the array A it lies. But in the worst case, this code will do an unsuccessful search: t will not be in the array, and the code will scan through the entire array A and return FALSE. Now, there is some constant amount of work outside the loop: some initial setup, perhaps, and maybe an operation to return the final Boolean value. But outside of that constant, which gets suppressed in the big-O notation, the code does a constant number of operations per entry in the array. You can argue about what that constant is, whether it's two, three, or four operations per entry, but the point is that whatever the constant is, it gets conveniently suppressed by the big-O notation. As a result, the total number of operations is linear in n, and so the running time is just O(n).

So that was the first example. In the last three examples, I want to look at different ways that we could have two loops. In this example, I want to think about one loop followed by another, so two loops in sequence. We'll study almost the same problem as the previous one, where now we're given two arrays, A and B, let's say both of the same length n, and we want to know whether the target t is in either one of them. Again, we'll look at the straightforward algorithm, where we just search through A, and if we fail to find t in A, we search through B. If we don't find t in B either, then we have to return FALSE.
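Again as an illustrative sketch, here's one way those two sequential scans might look in Python (as before, the names follow the problem statement and the details are an assumption):

```python
def search_two_arrays(A, B, t):
    # First loop: scan through A.
    for i in range(len(A)):
        if A[i] == t:
            return True
    # Second loop: reached only if t was not found in A.
    for j in range(len(B)):
        if B[j] == t:
            return True
    # t appears in neither array.
    return False
```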
So the question is exactly the same as last time, given this new, longer piece of code: what, in big-O notation, is its running time?

Well, the question was the same, and in this case the answer is the same as well: this algorithm, just like the last one, has running time O(n). If we actually count the number of operations, of course, it won't be exactly the same as last time; it'll be roughly twice as many operations as the previous piece of code. That's because we have to search two different arrays, each of length n, so whatever work we did before, we now do twice. But that two, being a constant independent of the input length n, is going to get suppressed once we pass to big-O notation. So this, like the previous algorithm, is a linear-time algorithm: it has running time O(n).

Let's look at a more interesting example of two loops, where rather than processing each loop in sequence, they're going to be nested. In particular, let's look at the problem of deciding whether two given input arrays, each of length n, contain a common number. The code that we're going to look at for solving this problem is the most straightforward one you can imagine: we just compare all possibilities. So for each index i into the array A and each index j into the array B, we just see whether A[i] is the same number as B[j]. If it is, we return TRUE. If we exhaust all of the possibilities without ever finding equal elements, then we're safe in returning FALSE.
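One plausible Python rendering of those nested loops (again a sketch, not the slide's exact code):

```python
def has_common_element(A, B):
    # For every index i into A and every index j into B ...
    for i in range(len(A)):
        for j in range(len(B)):
            # ... check whether the two entries are equal.
            if A[i] == B[j]:
                return True
    # Exhausted all n * n pairs without finding equal elements.
    return False
```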
The question, of course, is: in terms of big-O notation, that is, asymptotic analysis, as a function of the array length n, what is the running time of this piece of code?

This time the answer has changed. For this piece of code the running time is not O(n); it is O(n²). We might also call this a quadratic-time algorithm, because the running time is quadratic in the input length n. This is one of those kinds of algorithms where, if you double the input length, the running time goes up by a factor of four, rather than by a factor of two as in the previous two pieces of code. So why is this? Why does it have quadratic running time, O(n²)? Well, again, there's some constant setup cost, which gets suppressed in the big-O notation. And again, for each fixed choice of an index i into array A and an index j into array B, we only do a constant number of operations; the particular constant is irrelevant, because it gets suppressed in the big-O notation. What's different is that there's a total of n² iterations of this double for loop. In the first example we had only n iterations of a single for loop, and in our second example, because one for loop completed before the second one began, we had only 2n iterations overall. Here, for each of the n iterations of the outer for loop, we do n iterations of the inner for loop. That gives us n times n, i.e., n², total iterations, and that's going to be the running time of this piece of code.

Let's wrap up with one final example. It will again be nested for loops, but this time we're going to be looking for duplicates in a single array A, rather than comparing two distinct arrays A and B. So here's the piece of code we're going to analyze for solving this problem, for detecting whether or not the input array A has duplicate entries. There are only two small differences relative to the code that we went through on the previous slide, when we had two different arrays. The first change won't surprise you at all: instead of referencing the array B, I change that B to an A, so I'm just comparing the i-th entry of A with the j-th entry of A. The second change is a little more subtle: I've changed the inner for loop so that the index j begins at i + 1, where i is the current value of the outer for loop's index, rather than starting at the index 1. I could have had it start at the index 1, and that would still be correct, but it would be wasteful, and you should think about why. If we started the inner for loop's index at 1, this code would actually compare each distinct pair of elements of A to each other twice, which of course is silly. You only need to compare two different elements of A to each other once to know whether they're equal or not.
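A sketch of the modified code follows. Note that the lecture's pseudocode is 1-indexed while Python is 0-indexed; either way, j starts one position past i:

```python
def has_duplicate(A):
    for i in range(len(A)):
        # Start j just past i, so each pair of distinct positions
        # is compared exactly once rather than twice.
        for j in range(i + 1, len(A)):
            if A[i] == A[j]:
                return True
    # No two distinct entries of A are equal.
    return False
```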
So this is the piece of code, and the question is the same as it always is: in terms of big-O notation and the input length n, what is the running time of this piece of code?

The answer is the same as the last one: O(n²). That is, this piece of code also has quadratic running time. What I hope is clear is that whatever the running time of this piece of code is, it's proportional to the number of iterations of this double for loop. As in all the examples, we do constant work per iteration, and we don't care about the constant; it gets suppressed by the big-O notation. So all we have to do is figure out how many iterations there are of this double for loop. My claim is that there are roughly n²/2 iterations, and there are a couple of ways to see that. Informally, we discussed how the difference between this code and the previous one is that, instead of counting something twice, we're counting it once. That saves us a factor of two in the number of iterations. Of course, this one-half factor gets suppressed by the big-O notation anyway, so the big-O running time doesn't change. A different argument would just say that there's one iteration for every distinct choice of indices i and j between 1 and n, and a simple counting argument says that there are n choose 2 such choices of distinct i and j, where n choose 2 is the number n(n - 1)/2. Again, suppressing the lower-order term and the constant factor, we still get a quadratic dependence on the length of the input array A.

So that wraps up these simple, basic examples. I hope they've gotten you oriented, and that you have a strong intuitive sense for what big-O notation is trying to accomplish and how it will be defined mathematically. Let's now move on to both the mathematical development and some more interesting algorithms.