1 00:00:00,100 --> 00:00:04,357 This video we're going to talk about Error Correcting Codes. 2 00:00:04,358 --> 00:00:10,167 How we can get around some of the errors that channels may introduce. 3 00:00:10,168 --> 00:00:16,361 So we're going to talk about one, as very simplest one called a repetition code. 4 00:00:16,361 --> 00:00:22,600 And then we're going to go on to a, much more interesting and much more useful code 5 00:00:22,600 --> 00:00:26,152 actually classic codes called Hamming codes. 6 00:00:26,152 --> 00:00:30,740 Let's go through and see what I mean by error correction. 7 00:00:30,740 --> 00:00:36,519 So, here again is the digital communication model, that we've had 8 00:00:36,519 --> 00:00:42,130 before, but there is a new thing it has been stuck in here. 9 00:00:42,131 --> 00:00:47,517 So, we've already talked about source coding. 10 00:00:47,518 --> 00:00:56,887 And how to do compression and represent the source letters efficiently with bits. 11 00:00:56,888 --> 00:01:01,972 Now what's going to happen the channel coder, is going to be placed here. 12 00:01:01,972 --> 00:01:09,480 So that it converts the, what we going to call data bits into more bits. 13 00:01:09,480 --> 00:01:15,150 They have a very special structure, so that by the time these bits arrive and 14 00:01:15,150 --> 00:01:21,090 goes out of the channel where errors can be introduced, the channel decoder can 15 00:01:21,090 --> 00:01:26,757 repair those errors. And hopefully produce something that's the 16 00:01:26,757 --> 00:01:30,589 same as the data bits produced by the source coder. 17 00:01:30,589 --> 00:01:36,215 So, this may seem like magic. How can the, we compensate for errors that 18 00:01:36,215 --> 00:01:40,334 the channel introduces after they've occurred? 19 00:01:40,334 --> 00:01:44,436 And that's the it's actually pretty straightforward to see how you can do 20 00:01:44,436 --> 00:01:47,960 that. But generating efficient error correcting 21 00:01:47,960 --> 00:01:53,188 codes is somewhat tricky. So, can we mitigate channel induced 22 00:01:53,188 --> 00:01:57,055 errors? Absolutely, and these lead to the error 23 00:01:57,055 --> 00:02:02,723 codes which are commonly know as ECC. So, the simplest error correcting code is 24 00:02:02,723 --> 00:02:05,864 the repetition code. It's pretty easy. 25 00:02:05,864 --> 00:02:15,217 So, you take for each bit reduced out of the source coder, you repeat it. 26 00:02:15,218 --> 00:02:21,037 In this case, 3 times. So, a 0 data bit becomes 3 data bits in 27 00:02:21,037 --> 00:02:33,520 what we are going to call the codeword. And if the data bit is a 1, we produce 3 28 00:02:33,520 --> 00:02:37,540 bits that are 111. We just repeat them, that's why it's 29 00:02:37,540 --> 00:02:43,140 called a repetition code. So this actually helps and in more detail 30 00:02:43,140 --> 00:02:49,391 about what's going on. Is that you have this little drawing here. 31 00:02:49,392 --> 00:02:56,727 We have a sequence of data bits that looks like I've got here a 1, 0 and a 1. 32 00:02:56,728 --> 00:03:02,180 And if we went through our tran, digital transmitter and worried about a BPSK 33 00:03:02,180 --> 00:03:06,034 baseband signal set. That might produce something that looks 34 00:03:06,034 --> 00:03:10,016 like that. Well now if you take that same sequence 35 00:03:10,016 --> 00:03:15,157 and go through the channel coder, it's going to repeat the bits. 36 00:03:15,158 --> 00:03:23,145 So now we have 111, 000,11 and 1. One of the ones got chopped off. 37 00:03:23,145 --> 00:03:29,887 And when that goes through the transmitter, we now have individual bits. 38 00:03:29,888 --> 00:03:34,097 And the correlation receiver knows those are individual bits. 39 00:03:34,098 --> 00:03:38,290 And is going to try to figure out what each individual bit is. 40 00:03:38,290 --> 00:03:43,181 So now there is a pe for, for every one of those new bits. 41 00:03:43,182 --> 00:03:51,401 All right. So this creates what is called an n1 code. 42 00:03:51,402 --> 00:03:53,648 So we're repeating the, each source bit N times. 43 00:03:53,649 --> 00:04:05,804 So for every data bit that comes in, we're producing N bits, so that's what the N 44 00:04:05,804 --> 00:04:14,241 comma 1 means. One bit in, N capital N bits, out in the 45 00:04:14,241 --> 00:04:18,795 code. Now the other thing that's important here, 46 00:04:18,795 --> 00:04:25,234 will be important in applications is this, what has, what's known as a code rate, 47 00:04:25,234 --> 00:04:28,740 which is the same as Re-expression of this. 48 00:04:28,740 --> 00:04:35,204 It says that for every bit in, I have to send capital N bits, so it has in the case 49 00:04:35,204 --> 00:04:42,072 of our repetition code it has a rate of a third which we going to see is a little on 50 00:04:42,072 --> 00:04:46,174 the low side. But, this code can correct. 51 00:04:46,175 --> 00:04:50,212 All single-bit errors. And let me show you how that works. 52 00:04:50,213 --> 00:04:54,124 And what we're going to do is code by majority vote. 53 00:04:54,124 --> 00:04:58,900 So here's the idea. And this is why the number of bits N in a 54 00:04:58,900 --> 00:05:02,736 repetition code is odd, because we're going to vote. 55 00:05:02,737 --> 00:05:11,697 So, let's say that the source bit, the data bit was a zero. 56 00:05:11,698 --> 00:05:18,220 We're sending 3 bits so there are 8 possible received codewords that could 57 00:05:18,220 --> 00:05:24,594 come out of the channel. All the bits could come out correctly and 58 00:05:24,595 --> 00:05:29,751 no changes and so we say. It's a 0, because we're going to raise our 59 00:05:29,751 --> 00:05:32,743 hands and vote. All say 0, all say 1. 60 00:05:32,743 --> 00:05:37,575 Pick the one with the most votes. So here's a case where one bit got 61 00:05:37,575 --> 00:05:40,783 flipped. And what it's going to do is still 62 00:05:40,783 --> 00:05:46,290 majority 0, so it corrects the error. So this is how you can correct errors 63 00:05:46,290 --> 00:05:50,904 after they have occurred. And that's because you just repeated the 64 00:05:50,904 --> 00:05:55,703 same thing several times. If you go down this list, you see there 65 00:05:55,703 --> 00:06:01,574 are times it's possible that 2 bits occurred in there and if the error 66 00:06:01,574 --> 00:06:06,127 correcting code is not going to be able to correct it. 67 00:06:06,128 --> 00:06:10,590 So, that is why this is a single-bit error correcting code. 68 00:06:10,590 --> 00:06:14,947 It can correct 1 bit an error but not 2 or 3. 69 00:06:14,948 --> 00:06:22,590 So that turns out to be an accomplishment. So for all the 8 possibilities of received 70 00:06:22,590 --> 00:06:30,906 codewords, we can correct one, two, three, four of those codewords and, and 71 00:06:30,906 --> 00:06:38,826 understand and then say that a 0 was actually send and we get it wrong four 72 00:06:38,826 --> 00:06:42,509 times. And same thing is going to happen when a 1 73 00:06:42,509 --> 00:06:48,294 is sent, we won't be able to correct the error for 4 of the codewords and be wrong 74 00:06:48,294 --> 00:06:54,118 for the other four. So, what's the probability now that the 75 00:06:54,118 --> 00:07:00,372 data bit is decoded wrong? So that's going to be the sum of the 76 00:07:00,373 --> 00:07:06,726 probabilities for the cases in which an error still occurs. 77 00:07:06,726 --> 00:07:12,825 And that's just how it goes up and you get that word expression. 78 00:07:12,826 --> 00:07:20,564 So the question is, do we win? Is this probability what's called a 79 00:07:20,564 --> 00:07:24,367 decoding error? We don't get the original data bit back. 80 00:07:24,367 --> 00:07:30,234 Is that probability bigger or smaller than pe. 81 00:07:30,234 --> 00:07:34,272 In other words the situation in which we sent only, only one bit which is what we 82 00:07:34,272 --> 00:07:39,217 would do if we had no error correction. It had better be smaller. 83 00:07:39,217 --> 00:07:45,480 Then it turns out it is. You can show that if pe is less then a 84 00:07:45,480 --> 00:07:55,950 half, that this quantity is always less than pe and sometimes much, much less. 85 00:07:55,950 --> 00:08:02,319 Pe is like a 1000th, 10 to the minus 3 notice this square over here, and that's 86 00:08:02,319 --> 00:08:09,389 going to be probably the dominant term and so, and a 1000th this probability here is 87 00:08:09,389 --> 00:08:16,168 roughly 3 times 10 the minus 6. So we can really reduce probability of 88 00:08:16,168 --> 00:08:23,805 error by adding error correcting bits. So lets over where the repetition code is, 89 00:08:23,805 --> 00:08:29,885 its a very simple code, not just repeat the bits, very easy to do but it has some 90 00:08:29,885 --> 00:08:33,865 downsides. One, it has a very low coding rate, 1 over 91 00:08:33,865 --> 00:08:39,715 n, and we're going to discover in just a second an error correcting code that has a 92 00:08:39,715 --> 00:08:44,370 much higher rate and still corrects all single-bit errors. 93 00:08:44,371 --> 00:08:50,695 Note that as N increases the probability of just getting 1 bit in error actually 94 00:08:50,695 --> 00:08:56,155 becomes smaller in the probability of getting 2 bits in error or 3 bits in 95 00:08:56,155 --> 00:08:59,965 error. And that's just because you keep flipping 96 00:08:59,965 --> 00:09:04,830 coins a lap you going to not just wind up with just 1 bit in error. 97 00:09:04,830 --> 00:09:09,072 That's all the repetition code can do, just correct 1 bit in there. 98 00:09:09,073 --> 00:09:13,086 So we definitely need more powerful codes. Well let's talk about one. 99 00:09:13,086 --> 00:09:17,337 Well first we need to know about what are called block codes. 100 00:09:17,338 --> 00:09:20,445 And this is a generalization of the repetition code. 101 00:09:20,445 --> 00:09:25,349 Repetition, repetition code is a special case of a block code. 102 00:09:25,350 --> 00:09:32,786 So here, what we're going to do is take in a block of K source bits at once and what 103 00:09:32,786 --> 00:09:39,480 we're going to put through the channel are N bits derived from those K. 104 00:09:39,480 --> 00:09:45,757 So, in the case of the repetition code we took in 1 bit and put out N. 105 00:09:45,758 --> 00:09:49,466 More generally we're going to try to take in more than 1 bit at a time. 106 00:09:49,466 --> 00:09:53,317 Input out again. And that's an NK code. 107 00:09:53,318 --> 00:09:57,646 And that's why I wrote before that the repetition code was an N1 code. 108 00:09:57,646 --> 00:10:02,199 Special case. And here's an example of a 7, 4 code, 109 00:10:02,199 --> 00:10:07,202 which, as you might expect, has some pretty good properties. 110 00:10:07,202 --> 00:10:11,972 We're going to see that as we go through. So, here's what it does. 111 00:10:11,972 --> 00:10:19,627 It's trying to put out a codeword of 7 bits. 112 00:10:19,628 --> 00:10:25,804 Here's a list of the bits and here's a, a mathematical description of how they are 113 00:10:25,804 --> 00:10:31,372 related to the data bits. The first 4 bits simply repeat the data 114 00:10:31,372 --> 00:10:38,416 bits, so they're exactly the same. And then, for bits 5, 6, and 7, they are 115 00:10:38,416 --> 00:10:46,702 given by, it turns out, sums of the data bits, of particular data bits. 116 00:10:46,702 --> 00:10:53,056 Now what that o, o with a circle around it, plus with the circle around it, means, 117 00:10:53,056 --> 00:10:59,220 it means exclusive or in, it's, it's usually addition and binary arithmetic 118 00:10:59,220 --> 00:11:06,584 when there's no carats. So you, 0, exclusive or 0 is 0. 119 00:11:06,584 --> 00:11:14,442 1 exclusive or 1 is 0. It's only when the 2 bits disagree, do you 120 00:11:14,442 --> 00:11:18,120 get a 1. And so, you may have 3 bits here. 121 00:11:18,120 --> 00:11:24,143 They get "added together" but the result is only going to be 1 bit. 122 00:11:24,143 --> 00:11:27,540 And so, no matter what the data bits are, you just apply these formulas. 123 00:11:27,540 --> 00:11:34,049 And that derives the 3 bits we get you might say, extra. 124 00:11:34,049 --> 00:11:38,140 We have 4 data bits coming through as they are. 125 00:11:38,140 --> 00:11:44,557 And we derive three more bits from those. Turns out these are good bits to derive, 126 00:11:44,557 --> 00:11:49,104 it turns out. So we gotta figure out the, how do we, can 127 00:11:49,104 --> 00:11:55,836 we think, how can we prove that the code I just wrote, that 7, 4 code, actually can 128 00:11:55,836 --> 00:12:00,219 correct errors. So we need to go and, and figure out a 129 00:12:00,219 --> 00:12:06,762 little bit more about error correcting codes and how to think about them. 130 00:12:06,762 --> 00:12:11,201 And the key to that is something called the Hamming Distance. 131 00:12:11,201 --> 00:12:17,626 So Hamming Richard Hamming was a Bell Labs engineer, and developed a whole sequence 132 00:12:17,626 --> 00:12:21,397 of things. Hamming Distance, and it turns out that 7, 133 00:12:21,397 --> 00:12:24,900 4 code is a special case of the Hamming code family. 134 00:12:24,900 --> 00:12:32,874 So he worked on this a lot. So, we're in to find the distance between 135 00:12:32,874 --> 00:12:43,144 two codewords as being the sum, and I really mean sum here, of c1 exclusive over 136 00:12:43,144 --> 00:12:48,400 c2 so you're going to get N bits out of this. 137 00:12:48,400 --> 00:12:55,330 The sum, what that's going to be, is the, it means how many ones are there in that 138 00:12:55,330 --> 00:12:58,280 result. So you exclusive or these two things 139 00:12:58,280 --> 00:13:01,410 together, and then you tell me how many 1s there are. 140 00:13:01,410 --> 00:13:05,802 That's what sum means. Sum means think about them as a real, as 141 00:13:05,802 --> 00:13:11,157 actual 1 and 0, not binary numbers. So let's take an example. 142 00:13:11,158 --> 00:13:15,019 Suppose I have two codewords and I just pulled these out of the air. 143 00:13:15,019 --> 00:13:18,650 I want you to note that they differ in those 2 bits. 144 00:13:18,650 --> 00:13:23,719 Okay? So I claim these two codewords are a 145 00:13:23,719 --> 00:13:30,911 distance to a part, they differ in 2 bits. Now, mathematically we can see that if I 146 00:13:30,911 --> 00:13:37,917 add these two together and you can test this for yourself, and the result is this. 147 00:13:37,918 --> 00:13:43,816 And remember the exclusive or gives you a one only if the bits disagree. 148 00:13:43,816 --> 00:13:48,721 That's the key. And of course the number of bits you have 149 00:13:48,721 --> 00:13:52,202 in there is 2. And so that's, we know the distance 150 00:13:52,202 --> 00:13:58,720 between these codewords is 2. Now, I want point out something that's 151 00:13:58,720 --> 00:14:06,197 going to be useful later on we have this relationship here. 152 00:14:06,197 --> 00:14:13,607 If I take this and add it c1, I will give c2 and I will also find c1 is equal to c2 153 00:14:13,607 --> 00:14:25,715 plus this binary sequence. This is going to be useful later on. 154 00:14:25,716 --> 00:14:35,564 Okay. Let's see why this is important. 155 00:14:35,564 --> 00:14:42,526 So distance is a the Hamming distance, it says how far apart 2 codewords are and how 156 00:14:42,526 --> 00:14:46,834 many bits have to be changed for 1 to equal to the other. 157 00:14:46,834 --> 00:14:53,658 So, how far apart do two codewords have to be so that in spite of the fact that 158 00:14:53,658 --> 00:15:01,689 there's a single-bit error, I actually can figure out what the original codeword was? 159 00:15:01,689 --> 00:15:07,426 How far do they have to be apart? And another way of saying that is, when 160 00:15:07,426 --> 00:15:14,576 can a single-bit error in one codeword not be confused as a single-bit error in 161 00:15:14,576 --> 00:15:17,980 another codeword? That would be a disaster. 162 00:15:17,980 --> 00:15:22,006 If we couldn't figure this out, and if a single-bit error occurred, we wouldn't 163 00:15:22,006 --> 00:15:26,276 know what the original codeword was, which means we wouldn't know the original data 164 00:15:26,276 --> 00:15:30,452 sequence. So they have to be far enough apart that 165 00:15:30,452 --> 00:15:35,614 1, an error in 1 codeword is not confused with another codeword. 166 00:15:35,614 --> 00:15:39,598 So I have a little geometric picture to show you, so we can figure this out. 167 00:15:39,598 --> 00:15:43,880 So, sort of abstractly, we have here two codewords. 168 00:15:43,880 --> 00:15:47,632 We have c1 and c2. And of course, they're different. 169 00:15:47,632 --> 00:15:53,302 Now suppose I have a single-bit error in c1. 170 00:15:53,302 --> 00:16:00,486 So I'm using e1 to represent some particular single-bit error, which means 171 00:16:00,486 --> 00:16:05,300 this e sequence has a bunch of 0s with a 1 in one position. 172 00:16:05,300 --> 00:16:11,780 That's what the error decision names, and that gives, once I add these two together 173 00:16:11,780 --> 00:16:18,438 it gives me, let's say that. Well, suppose I had, it was possible that 174 00:16:18,438 --> 00:16:26,950 c2, if I add to it another single-bit error sequence, give me at the same, same 175 00:16:26,950 --> 00:16:33,588 place. So these sums the number of 1s in e1 and 176 00:16:33,588 --> 00:16:39,866 e2 is each one. How far apart are c1 and c2? 177 00:16:39,866 --> 00:16:45,655 Well, because I can get to c2 by taking c1. 178 00:16:45,656 --> 00:16:51,387 Adding e1 and then if I add in e2, that's going to get me over here. 179 00:16:51,388 --> 00:16:59,059 That means because these two each have a bit of bond that when I exclusive or them 180 00:16:59,059 --> 00:17:04,184 together I get 2 bits and the distance between them is 2. 181 00:17:06,050 --> 00:17:09,897 This would be a disaster in terms of pair correction. 182 00:17:09,898 --> 00:17:17,536 This is not a codeword, in general and it's 1 bit away from c1, it's 1 bit away 183 00:17:17,536 --> 00:17:21,987 from c2. I can't tell if c1 or c2 is the actual 184 00:17:21,987 --> 00:17:24,364 codeword. I'm in big trouble. 185 00:17:24,365 --> 00:17:31,043 This kind of code, if it does not have a adding distance of greater than 2, I 186 00:17:31,043 --> 00:17:36,270 cannot correct single-bit errors. So, what distance is good enough? 187 00:17:36,270 --> 00:17:42,129 I think it's pretty clear that we need to have a minimum distance between any 188 00:17:42,129 --> 00:17:46,100 possible pair of codewords that's greater than 3. 189 00:17:46,100 --> 00:17:50,797 If it's greater than 3 that's fine. But 3s the threshold. 190 00:17:50,798 --> 00:17:56,698 And the idea is that if we have one codeword here, and another codeword there, 191 00:17:56,698 --> 00:18:01,460 and an error takes us up here. Well an error might take this up here. 192 00:18:01,460 --> 00:18:05,153 And since these, let's say are at least the distance one apart. 193 00:18:05,154 --> 00:18:12,528 At least, then they won't be confused, and I can get back to the original codeword 194 00:18:12,528 --> 00:18:19,687 and correct that single bit error. So, this is an important criterion for 195 00:18:19,687 --> 00:18:25,374 having a single-bit error correcting code. And guess what? 196 00:18:25,375 --> 00:18:29,877 The repetition code satisfies that property. 197 00:18:29,878 --> 00:18:36,867 So there's only 2 codewords, because it only takes in one bit at a time. 198 00:18:36,868 --> 00:18:43,619 And so, we add together those codewords. We get 111 and the sum of those means 199 00:18:43,619 --> 00:18:48,476 there's three 1s. So sure enough, this satisfies our 200 00:18:48,476 --> 00:18:53,012 criterion for a single-bit error correcting code. 201 00:18:53,013 --> 00:19:00,010 Well, as it stands, I would have to find the distance between all possible pairs 202 00:19:00,010 --> 00:19:06,899 and an error correcting code and see what their distance is between them. 203 00:19:06,900 --> 00:19:11,310 Well it turns out, because of the properties of block codes, there's a very 204 00:19:11,310 --> 00:19:15,932 efficient way of doing it, which actually provides some insight in what's going on. 205 00:19:15,932 --> 00:19:23,498 But to do so I'm going to have to re-express these codes in terms of matrix 206 00:19:23,498 --> 00:19:26,815 notation. So we're going to take a little notational 207 00:19:26,815 --> 00:19:33,167 diversion if you will. So, we're going to say that the generation 208 00:19:33,167 --> 00:19:42,622 of a codeword from a data block, a bit, a block of data bits, is given by this 209 00:19:42,622 --> 00:19:48,409 matrix equation c equals Gb. And G here stands for generator. 210 00:19:49,550 --> 00:19:55,741 This is called the generator matrix. And in the case of the repetition code 211 00:19:55,741 --> 00:20:02,843 there's only 1 bit, cause it's an n1 code, there's only 1 bit taken in at a time its, 212 00:20:02,843 --> 00:20:09,172 it, it puts out 3 bits and the G that looks like that means simply repeat. 213 00:20:09,172 --> 00:20:14,679 If you think about it for a second what does this mean for the case of something 214 00:20:14,679 --> 00:20:18,754 like this when you have 1. And it means, just simply give it back. 215 00:20:18,754 --> 00:20:26,111 Well, lets look at our 7, 4 code which is a bit more complicated. 216 00:20:26,111 --> 00:20:29,829 So here's that code that I wrote before, same thing. 217 00:20:29,830 --> 00:20:39,185 And so now it's taking in a block of 4 data bits, putting out a block of 7 coded 218 00:20:39,185 --> 00:20:47,564 data bit, coded bits. And what G do I need to express this? 219 00:20:47,564 --> 00:20:53,525 And I claim it's pretty easy to do. So because this code just repeats the 220 00:20:53,525 --> 00:20:59,406 original data bits, the upper part of this matrix is the identity matrix. 221 00:20:59,406 --> 00:21:05,657 I, capital I, because that means just repeat the bits. 222 00:21:05,658 --> 00:21:10,534 The hard part, if you will, is down here but that's pretty easy to do. 223 00:21:10,534 --> 00:21:16,276 You simply put a 1 in the generator matrix for the bits you want in the position of 224 00:21:16,276 --> 00:21:22,090 the bits you want to select in your sum. So I've got three 1's and then a 0 that 225 00:21:22,090 --> 00:21:27,507 means data bits 1, 2 and 3. And the next one says, take data bits 2, 3 226 00:21:27,507 --> 00:21:30,666 and 4. Sure enough that's what it is. 227 00:21:30,666 --> 00:21:35,000 Take data bits 1, 2 and 4, and that's what that is. 228 00:21:35,001 --> 00:21:41,777 This is a very concise way, this generator matrix is a very concise way of writing 229 00:21:41,777 --> 00:21:46,470 down what the code is. How it operates on the data bits. 230 00:21:46,470 --> 00:21:50,196 It's not clear you actually implement it this way, but it's a nice mathematical way 231 00:21:50,196 --> 00:21:56,138 of doing it as you can see in a second. And the reason it's important is because 232 00:21:56,138 --> 00:22:02,817 this code has really cool properties. In fact it's linear. 233 00:22:02,818 --> 00:22:07,185 Really important here. So, because this is just a matrix 234 00:22:07,185 --> 00:22:13,481 multiplication, this code is linear. Suppose I add together two data bits, 235 00:22:13,481 --> 00:22:17,313 sequences. Well that's gotta be some other databit 236 00:22:17,313 --> 00:22:20,248 sequence. Who knows what it is, but it's something. 237 00:22:20,248 --> 00:22:26,068 Well, you can know this is exclusive or still obeys the distribution laws. 238 00:22:26,069 --> 00:22:36,884 This is Gb1, exclusive or Gb2. Well by definition, that's c1 and that's 239 00:22:36,884 --> 00:22:42,535 c2. So, it obeys the superposition principle. 240 00:22:42,536 --> 00:22:46,054 And that has all kinds of interesting consequences. 241 00:22:46,055 --> 00:22:56,350 Because, since b1 plus b2 is some other data bit sequence. 242 00:22:56,350 --> 00:23:00,570 In our 7, 4 code these are each 7, 4 bit entries. 243 00:23:00,570 --> 00:23:04,610 We exclusive or those 4 bit blocks together we get some other bit block. 244 00:23:04,610 --> 00:23:07,464 Well that has to be some other data bit sequences. 245 00:23:07,464 --> 00:23:15,232 That turns out to be c3 which means it's a codeword. 246 00:23:15,232 --> 00:23:24,795 So adding together two codewords results in another codeword. 247 00:23:24,795 --> 00:23:30,998 Well, that means, that the distance between any two codewords has got to be 248 00:23:30,998 --> 00:23:35,250 the same as the number of bits in another codeword. 249 00:23:35,250 --> 00:23:42,648 So, we only need to examine individual codewords to figure out the code's error 250 00:23:42,648 --> 00:23:48,274 correcting capability. We just need to examine how many 1s are 251 00:23:48,274 --> 00:23:55,554 there in every codeword and that tells us what the Hamming distances are between all 252 00:23:55,554 --> 00:24:00,085 possible pairs. So, that's why I wanted to go into matrix 253 00:24:00,085 --> 00:24:04,847 notation to prove this linearity property. It's really important. 254 00:24:04,848 --> 00:24:09,990 So there's my generator matrix for my 7, 4 code I want to point out something. 255 00:24:09,990 --> 00:24:16,867 Suppose I have a 1 bit sequence, data bit sequence. 256 00:24:16,868 --> 00:24:27,870 What's the resulting codeword for that? What you should find is that the resulting 257 00:24:27,871 --> 00:24:32,947 codeword for that is the first column of G. 258 00:24:32,948 --> 00:24:42,596 If you have another simple bit data block with the 1 in the second position, that 259 00:24:42,596 --> 00:24:52,814 corresponds to the second column etc. So staring me in the face, in the columns 260 00:24:52,814 --> 00:24:58,898 of G are 4 codewords. So all I have to do is count up the number 261 00:24:58,898 --> 00:25:03,950 of bits in each column and sure enough, I'm a happy guy. 262 00:25:03,950 --> 00:25:09,449 They're all greater than 3, well that's four of the codewords. 263 00:25:09,450 --> 00:25:18,868 Next thing I need to do is add together two columns to create more codewords. 264 00:25:18,868 --> 00:25:24,184 Well, I want you to note because of the construction of this code you know that 265 00:25:24,184 --> 00:25:30,070 the upper part is going to have 2 bits. As long as every row, every pair here 266 00:25:30,070 --> 00:25:35,937 differs in at least 1 bit. I know that it's going to give me one down 267 00:25:35,937 --> 00:25:38,862 here, I've got two up here, so I have at least three. 268 00:25:38,862 --> 00:25:44,053 And if you look at the way this bottom part was constructed, you look at all 269 00:25:44,053 --> 00:25:48,160 pairs they differ in the bottom part in at least one place. 270 00:25:48,160 --> 00:25:57,413 So, that means all pairs have 3 bits have the number of bits of 3. 271 00:25:57,413 --> 00:26:02,145 And now if you add three of them together you can see from the top you're going to 272 00:26:02,145 --> 00:26:06,170 get a 1 in all cases. So that gives me a link to 3. 273 00:26:06,171 --> 00:26:15,152 It doesn't matter what happens down below. So this is a single-bit error correcting 274 00:26:15,152 --> 00:26:18,745 code. And that's because all the codewords have 275 00:26:18,745 --> 00:26:22,550 at least three 1s. The fact that one of them has 4 really 276 00:26:22,550 --> 00:26:27,062 doesn't help. It's the minimum of all these distances 277 00:26:27,062 --> 00:26:33,250 that matters for being able to have that single-bit error correcting code. 278 00:26:33,250 --> 00:26:40,564 So, and I want to point out as opposed to the repetition code, it's efficiency is 279 00:26:40,564 --> 00:26:46,223 much higher. Repetition code that we talked about at a, 280 00:26:46,223 --> 00:26:51,120 a code rate of a third, low, 1 bit in 3 bits out. 281 00:26:51,120 --> 00:26:55,000 Now it's 4 bits in 7 bits out. Much higher efficiency. 282 00:26:55,000 --> 00:27:01,988 And it too can correct single-bit errors just as well as the repetition code. 283 00:27:01,988 --> 00:27:09,775 So, error-correcting codes. Even though the channel introduces errors, 284 00:27:09,775 --> 00:27:14,254 the error correcting code can repair some of the errors. 285 00:27:14,254 --> 00:27:19,198 If more than one bit occur in error our single bit error correcting codes can't 286 00:27:19,198 --> 00:27:23,024 correct them. But at least all single bit error errors 287 00:27:23,024 --> 00:27:26,868 can be corrected. And I showed you with the repetition code 288 00:27:26,868 --> 00:27:31,097 that lowers the probability of the received bits being an error. 289 00:27:31,098 --> 00:27:35,317 This is very different than in analog communication. 290 00:27:35,318 --> 00:27:41,293 In that case, once the signal comes out of the channel, there's nothing you can do 291 00:27:41,293 --> 00:27:47,256 other than remove out of bound noise, but you cannot remove the out of bound noise 292 00:27:47,256 --> 00:27:52,118 by linear filtering you are stuck with. You can't do anything with it. 293 00:27:52,118 --> 00:27:56,940 Here, we're removing fixing errors and lowering the probability. 294 00:27:56,940 --> 00:28:03,040 We're essentially making the channel less noisy by adding error correcting bits. 295 00:28:03,040 --> 00:28:07,744 Now we must sent bits at a higher rate than required by the source. 296 00:28:07,745 --> 00:28:13,581 But efficient codes, i.e. Ones having a, a rate that's close to one 297 00:28:13,582 --> 00:28:18,305 are fairly easily designed. And we'll see that in a succeeding video. 298 00:28:18,306 --> 00:28:24,306 However, do want to point out that error correcting code have limited capabilities. 299 00:28:24,307 --> 00:28:30,404 We can't fix double bit errors, in fact for the case of repetition code and the 300 00:28:30,404 --> 00:28:36,410 Hamming code that I've showed you that if a double bit error occurs, we're going to 301 00:28:36,410 --> 00:28:41,643 think a single bit error occurred and we're going to correct it badly. 302 00:28:41,643 --> 00:28:48,104 We're going to make at any mistake, and so longer codewords double-bit errors occur 303 00:28:48,104 --> 00:28:53,237 more frequently than single-bit errors and that can be a problem. 304 00:28:53,238 --> 00:28:57,867 So in the next video I'm going to talk about how you actually do the error 305 00:28:57,867 --> 00:28:59,985 correction for a Hamming code.