So in this video we'll finally discuss Huffman's algorithm, which is a greedy algorithm that constructs the prefix-free binary code minimizing the average encoding length. Let me just briefly remind you of the formal statement of the computational problem that I left you with last time. The input is just a frequency for each symbol i coming from some alphabet sigma. The responsibility of the algorithm is to get optimal compression, that is, to compute an optimal code. The code has to be binary: we can use only zeros and ones. It has to be prefix-free, meaning that for the encodings of any two characters, neither one can be a prefix of the other; that's to facilitate unambiguous decoding. And finally, the average number of bits needed to encode a character, where the average is with respect to the input frequencies, should be as small as possible.

Now remember, these kinds of codes correspond to binary trees. The prefix-free condition just says that the symbols of sigma should be in one-to-one correspondence with the leaves of the tree. And finally, remember that the encoding lengths of the various symbols just correspond to the depths of the corresponding leaves. So we can formally express the average encoding length, which, given a legal tree capital T, I denote using the notation capital L of T. So what do we do?
We just take the average over the symbols of the alphabet, weighted by the provided frequencies, of the number of bits used to encode each symbol, equivalently, the depth of the corresponding leaf in the tree T. So we want the tree that makes this number as small as possible.

This task is a little bit different than any we've seen so far in this course, right? All we're given as input is an array of numbers, and yet we have to produce an actual full-blown tree. So how can we take a bunch of numbers and produce from them, in a sensible, principled way, some kind of tree whose leaves correspond to symbols of the alphabet? Let's spend a slide just thinking, at a high level, about where this tree would come from, how you would build it up given this unstructured input. There's certainly no unique answer to this question, and one idea, which is very natural but turns out to be suboptimal, is to build a tree using a top-down approach, which you can also think of as an instantiation of the divide-and-conquer algorithm design paradigm. The divide-and-conquer paradigm, you'll recall, involves breaking the given problem into, usually, multiple smaller subproblems. These are solved recursively, and the solutions are then combined into one for the original problem.
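To pin down the objective, here is a minimal Python sketch of the average encoding length: the frequency-weighted average of leaf depths. The function name is mine, not from the lecture, and I'm plugging in the lopsided four-symbol tree and the frequencies used as the running example.

```python
# A minimal sketch of the objective function L(T): the average,
# weighted by symbol frequencies, of the leaf depths in the tree.
# The function name is illustrative, not from the lecture.

def average_encoding_length(freqs, depths):
    """L(T) = sum over symbols i of p_i * depth(i), normalized."""
    total = sum(freqs.values())
    return sum(freqs[s] * depths[s] for s in freqs) / total

# The lopsided running-example tree: A at depth 1, B at depth 2,
# C and D at depth 3, with frequencies 60, 25, 10, 5 (percent).
freqs = {"A": 60, "B": 25, "C": 10, "D": 5}
depths = {"A": 1, "B": 2, "C": 3, "D": 3}
print(average_encoding_length(freqs, depths))  # 1.55
```

So for this tree and these frequencies, the average encoding length works out to 1.55 bits per symbol.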
Because trees, the desired output here, have a recursive substructure, it's natural to think about applying this paradigm to this problem. Specifically, you'd love to just lean on one recursive call to construct for you the left subtree, and another recursive call to construct the right subtree, and then you can fuse the results together under a common root vertex. Now, it's not clear how to do this partitioning of the symbols into two groups, but one idea, to get the most bang for your buck, the most information out of the first bit of your encoding, is to split the symbols into two groups each having as close as possible to 50% of the overall frequency. This top-down approach is sometimes called Shannon-Fano coding. Fano was Huffman's graduate adviser; Shannon is the Claude Shannon who invented information theory. But Huffman, in a term paper believe it or not, realized that top-down is not the way to go. The way to go is to build the tree from the bottom up. Not only are we going to get optimal codes, but we're going to get a blazingly fast greedy algorithm that constructs them.

So what do I mean by bottom up? I mean we're going to start with just a bunch of nodes, each one labeled with one symbol of the alphabet. So, in effect, we're starting with the leaves of our tree.
And then we're going to do successive mergers. At each step, we're going to take two of the subtrees produced so far and link them together as subtrees under a common internal node. I think you'll see what I mean in a picture. Imagine we want to build a tree whose leaves are supposed to be just A, B, C, and D. One merger would be: well, let's just take the leaves C and D and link them as siblings under a common ancestor. Now, in the second step, let's merge the leaf B with the subtree we got from the previous merge, the subtree comprising the nodes C, D, and their common ancestor. Now, of course, we have no choice but to merge the only two subtrees we have left, and that gives us a single tree, which is in fact exactly the same lopsided tree we were using in the previous video as a running example.

So let me explain what I hope is clear and what is maybe unclear at this juncture. I hope what's intuitively clear is that the bottom-up approach is a systematic way to build trees that have a prescribed set of leaves. So what do we want? We want trees whose leaves are labeled with the symbols of the alphabet sigma. So if we have an alphabet with n symbols, we're going to start with just the n leaves. What does a merger do? Well, there's two things.
First of all, it introduces a new internal node, an unlabeled node. And second, it takes two of our old subtrees and fuses them into one: we take two subtrees, make one the left child of this new internal node and the other the right child. So that drops the number of subtrees we're working with by one. So if we start with the n leaves and we do n minus one successive mergers, what happens? Well, on the one hand, we introduce n minus one new unlabeled internal nodes, and on the other hand, we construct a single tree, and the leaves of this tree are in one-to-one correspondence with the alphabet letters, as desired.

Now, what I don't expect you to have an intuition for is what should we be merging with what, and why. Forget about how we'd get an optimal tree at the end of the day. Even if we just wanted to design a greedy algorithm, if we just wanted to make a myopic decision that looks good right now, how would we even do that? What's the greedy criterion that's going to guide us to merge a particular pair of trees together? So we can reframe this quandary as the same kind of question we asked for minimum-cost spanning trees, and really, more generally, for greedy algorithms.
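As a sketch of this bookkeeping, here is a tiny Python mock-up; the Node class and helper names are my own, illustrative choices, not from the lecture. Starting from n = 4 leaves, n - 1 = 3 mergers leave a single tree, and with the merge order from the example the leaf depths match the lopsided running-example tree.

```python
# A sketch of the bottom-up merging mechanics; the Node class and
# helper names are illustrative, not from the lecture.

class Node:
    def __init__(self, symbol=None, left=None, right=None):
        self.symbol = symbol  # set only on leaves
        self.left = left
        self.right = right

def merge(t1, t2):
    """Fuse two subtrees under a fresh unlabeled internal node."""
    return Node(left=t1, right=t2)

def leaf_depths(node, d=0):
    """Map each leaf's symbol to its depth in the tree."""
    if node.symbol is not None:
        return {node.symbol: d}
    out = leaf_depths(node.left, d + 1)
    out.update(leaf_depths(node.right, d + 1))
    return out

# Start with n = 4 leaves; each merger drops the subtree count by
# one, so n - 1 = 3 mergers leave a single tree.
forest = [Node("A"), Node("B"), Node("C"), Node("D")]
while len(forest) > 1:
    b = forest.pop()            # merge the two rightmost subtrees,
    a = forest.pop()            # mimicking the C-D, then B-CD, then
    forest.append(merge(a, b))  # A-BCD order from the example
print(leaf_depths(forest[0]))  # {'A': 1, 'B': 2, 'C': 3, 'D': 3}
```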
When you're making irrevocable decisions, what strikes fear in your heart is that a decision will come back to haunt you later, that you'll only realize at the end of the algorithm that you made some horrible mistake early on. So just as for MSTs, where we asked when we can be sure that irrevocably including an edge in the tree we're building is safe and not a mistake, here we want to ask: we're going to do successive mergers, so how do we know that a merger is safe, that it doesn't prevent us from eventually computing an optimal solution? Well, here's a way to look at things that will at least give us an intuitive conjecture for this question; we'll save the proof for the next video.

So what are the ramifications when we merge two subtrees, each containing a collection of symbols? Well, when we merge two subtrees, we introduce a new internal node which unites these two subtrees under it, and if you think about it, in the final tree this is yet another internal node on the root-to-leaf path for all of the leaves in these two subtrees. That is, if you're a symbol and you're watching your subtree get merged with somebody else's, you're bummed out. You're like, man, that's another bit in my encoding.
That's yet one more node I have to pass through to get back to the root. I think this will become even clearer if we look at an example. So naturally, we'll use our four-symbol alphabet A, B, C, D. Initially, before we've merged anything, each of these is just its own leaf: A, B, C, D. There are no internal nodes above them; in this sense, everybody's encoding length at the beginning is zero bits. Now, imagine we've merged C and D. We introduce a new internal node, which is the common ancestor of these two leaves. And as a result, C and D are bummed out. They say, well, that's one bit we're going to have to incur in our encoding length; there's one new internal node we're always going to have to pass through en route back to the root of the eventual tree. Now suppose next we merge B with the subtree containing both C and D. Well, everybody but A is bummed out about this merger, because we introduce another internal node, and it contributes one bit to the encoding of each of B, C, and D. It contributes an extra one to the encodings of C and D, and it contributes a zero to the encoding of B. So all of their encodings, in some sense, get incremented, go up by one, compared to how things were before. Now, in the final iteration, we have to merge everything together.
We have no choice; there are only two subtrees left. So here, everybody's encoding length gets bumped up by one. A finally picks up a bit of zero to encode it, and B, C, and D each pick up an additional bit of one, which is prepended to their encodings thus far. In general, what you'll notice is that the final encoding length of a symbol is precisely the number of mergers that its subtree has to endure. Every time your subtree gets merged with somebody else's, you pick up another bit in your encoding, and that's because there's one more internal node that you're going to have to traverse en route to the root of the final tree. In this example, the symbols C and D got merged with somebody else in each of the three iterations, so they're the two symbols that wind up with an encoding length of three. The symbol B didn't get merged with anybody in the first iteration, only in the second two; that's why it has an encoding length of two.

So this is really helpful. This lets us relate the operations that our algorithm actually does, namely mergers, back to the objective function that we care about, namely the average encoding length: mergers increase the encoding lengths of the participating symbols by one.
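This observation is easy to replay in a few lines of Python (the variable names are mine): charge one bit to every symbol whose subtree participates in each of the example's three mergers, and the final tallies are exactly the encoding lengths.

```python
# Replaying the example's mergers: each entry lists the two groups
# of symbols whose subtrees get fused in that step.
mergers = [
    ({"C"}, {"D"}),
    ({"B"}, {"C", "D"}),
    ({"A"}, {"B", "C", "D"}),
]

# A symbol's final encoding length = number of mergers it endures.
bits = {s: 0 for s in "ABCD"}
for group1, group2 in mergers:
    for s in group1 | group2:
        bits[s] += 1  # one more internal node on s's root-to-leaf path

print(bits)  # {'A': 1, 'B': 2, 'C': 3, 'D': 3}
```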
So this allows us to segue into the design of a greedy heuristic for how to do mergers. Let's just think about the very first iteration. We have our n original symbols, and we have to pick two to merge. And remember, the consequence of a merge is going to be an increase of one bit in the encoding length of whichever two symbols we pick. Now, what we want to do is minimize the average encoding length with respect to the frequencies we were given. So which pair of symbols are we the least unhappy to see suffer an increment to their encoding length? It's going to be the two symbols that are the least frequent; that's going to increase the average encoding length by the least. So that's the greedy merging heuristic: something's got to increase, so pick the ones that are the least frequent to be the ones that get incremented. That seems like a really good idea of how to proceed in the first iteration. The next question is, how are we going to recurse? I'll let you think about that in the following quiz.

So let's agree that the first iteration of our greedy heuristic is going to merge together the two symbols that possess the lowest frequencies. Let's call those two symbols little A and little B. The question is then how do we make further progress? What do we do next?
Well, one thing that would be really nice is if we could somehow recurse on a smaller subproblem. Which smaller subproblem? Well, what does it mean that we've merged together the symbols A and B? If you think about it, in the tree that we finally construct, by virtue of merging A and B we're forcing the algorithm to output a tree in which A and B are siblings, in which they have exactly the same parent. So what does it mean for the encoding that we compute that A and B are going to be siblings with the same parent? It means that their encodings are going to be identical, save for the lowest-order bits. So A will get encoded with a bunch of bits followed by a zero, and B will be encoded with exactly the same prefix of bits followed by a one. Since they're going to have almost the same encoding, for our recursion let's just treat them as the same symbol. So let's introduce a new meta-symbol, call it AB, which represents the conjunction of A and B; it's meant to represent all of the occurrences of either one of them.

But remember, the input to the computational problem that we're studying is not just an alphabet, but also frequencies for each of the symbols of that alphabet. So here's my question for you about introducing this new meta-symbol AB.
What should be the frequency that we define for this meta-symbol? All right, so hopefully your intuition suggested answer C: for this recursion to make sense, for it to conform to our semantics of what this merging does, we should define the frequency of this new meta-symbol to be the sum of the frequencies of the two symbols that it's replacing. That's because, remember, this meta-symbol AB is meant to represent all occurrences of both A and B, so it should be the sum of their frequencies.

So I'm now ready to formally describe Huffman's greedy algorithm. Let me first describe it on an example, and then I think the general code will be self-evident. Let's just use our usual example: we're going to have letters A, B, C, D, with frequencies 60, 25, 10, and 5. We're going to use the bottom-up approach, so we begin with each symbol just as its own node: A, B, C, D. I'm annotating in red the frequencies 60, 25, 10, 5. The greedy heuristic says that initially we should merge together the two nodes that have the smallest frequencies. That would be C and D, with their frequencies of 10 and 5. Now, based on the idea of the last slide, when we merge C and D, we replace them with a meta-symbol CD whose frequency is the sum of the frequencies of C and D, namely 15.
Now we just run another iteration of the greedy algorithm, meaning we merge together the two nodes, the two symbols, that have the smallest frequencies. That is now B, with frequency 25, and CD, with frequency 15. So now we're down to just two symbols: the original symbol A, which still has frequency 60, and the, in some sense, meta-meta-symbol BCD, whose cumulative frequency is now 40. When we're down to two nodes, that's the point in Huffman's algorithm where we hit the base case and the recursion begins to unwind. If you just have two symbols to encode, there's pretty much only one sensible way to do it: one gets a zero, and the other gets a one. So at this point, the final recursive call just returns a tree with two leaves, corresponding to the symbols A and BCD.

And now, as the recursion unwinds, we in effect undo the mergers. For each merge, we do a corresponding split of the meta-node: we replace it with an internal node and two children corresponding to the symbols that were merged to make that meta-node. So, for example, when we want to undo the merge of B and CD, we take the node labeled BCD and we split it into two, the left child being B, the right child being CD.
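The iterative version of this process is naturally expressed with a binary heap keyed on frequency. Here's a sketch in Python; the function name is mine, and since ties and left/right orientation are arbitrary, only the codeword lengths, not the exact bit patterns, are determined.

```python
import heapq
import itertools

# A sketch of the greedy mergers with a min-heap keyed on frequency.
# The function name is illustrative; tie-breaking is arbitrary, so
# only the codeword lengths are meaningful.

def huffman_code_lengths(freqs):
    counter = itertools.count()  # tie-breaker so heap tuples compare
    heap = [(f, next(counter), {s: 0}) for s, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, d1 = heapq.heappop(heap)  # two lowest-frequency subtrees
        f2, _, d2 = heapq.heappop(heap)
        # every symbol in either subtree picks up one more bit
        merged = {s: d + 1 for s, d in {**d1, **d2}.items()}
        heapq.heappush(heap, (f1 + f2, next(counter), merged))
    return heap[0][2]

# The example: C and D merge first (10 + 5 = 15), then B and CD
# (25 + 15 = 40), and finally A and BCD.
print(huffman_code_lengths({"A": 60, "B": 25, "C": 10, "D": 5}))
```

On the example this yields length 1 for A, length 2 for B, and length 3 for each of C and D, matching the lopsided tree.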
So the original, outermost call, what it gets back from its recursive call is this tree with three nodes, and it has to undo its own merger: it merged C and D. So it takes the leaf labeled with the symbol CD and splits it into two; it replaces it with a new unlabeled internal node, with left child C and right child D.

So how does the algorithm work in general? Well, just as you'd expect given the discussion of this concrete example. I'm going to take as the base case when the alphabet has just two symbols; in that case, the only thing to do is encode one with a zero and the other with a one. Otherwise, we take the two symbols of the alphabet that have the smallest frequencies. You can break ties arbitrarily; it's not going to matter, it's going to be optimal either way. We then replace these two low-frequency symbols A and B with a meta-symbol AB, intended to represent both of them in some sense. As we just discussed, with those semantics we should define the frequency of the meta-symbol AB as the sum of the frequencies of the symbols that it comprises. We now have a well-defined, smaller subproblem; it has one fewer symbol than the one we were given, so we can recursively compute a solution for it. What the recursive call returns is a tree whose leaves are in one-to-one correspondence with sigma prime.
That is, it does not have a leaf labeled A, and it does not have a leaf labeled B; it does have a leaf labeled AB. So we want to extend this tree T prime to one whose leaves correspond to all of sigma, and the obvious way to do that is to split the leaf labeled AB, replacing it with a new unlabeled internal node with two children, which are labeled A and B. The resulting tree capital T, with leaves in correspondence with the original alphabet sigma, is then the final output of Huffman's algorithm. As always with a greedy algorithm, we may have intuition, and it may seem like a good idea, but we can't be sure without a rigorous argument. That is the subject of the next video.
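The recursive description above can be sketched as follows in Python. This is a hedged sketch under my own naming: ties are broken arbitrarily, so the exact bit patterns may differ from the lecture's figures, but the codeword lengths come out optimal.

```python
# A sketch of the recursive formulation: merge the two rarest
# symbols into a meta-symbol, recurse on the smaller alphabet, then
# split the meta-symbol's leaf back into two children on the way
# back up. Names are illustrative; meta-symbols are modeled as
# tuples of the two symbols they replace.

def huffman(freqs):
    """Return a dict mapping each symbol to its codeword (a bit string)."""
    if len(freqs) == 2:
        a, b = freqs  # base case: one symbol gets "0", the other "1"
        return {a: "0", b: "1"}
    # the two lowest-frequency symbols; ties broken arbitrarily
    a, b = sorted(freqs, key=freqs.get)[:2]
    rest = {s: f for s, f in freqs.items() if s not in (a, b)}
    rest[(a, b)] = freqs[a] + freqs[b]  # meta-symbol with summed frequency
    code = huffman(rest)                # recurse on the smaller alphabet
    prefix = code.pop((a, b))           # unwind: split the meta-symbol's leaf
    code[a] = prefix + "0"
    code[b] = prefix + "1"
    return code

codes = huffman({"A": 60, "B": 25, "C": 10, "D": 5})
# A ends up with a 1-bit codeword, B with 2 bits, C and D with 3 each,
# and the resulting code is prefix-free.
```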