Summary: The Source Coding Theorem states that the entropy of an alphabet of symbols specifies, to within one bit, how many bits on average must be used to send the alphabet's symbols.
The significance of an alphabet's entropy rests in how we can represent it with a sequence of bits. Bit sequences form the "coin of the realm" in digital communications: they are the universal way of representing symbolic-valued signals. We convert back and forth between symbols and bit sequences with what is known as a codebook: a table that associates each symbol with a bit sequence. In creating this table, we must assign a unique bit sequence to each symbol so that we can go between symbols and bit sequences without error.
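As a rough sketch of the idea (the symbols and bit assignments below are made up for illustration, not taken from the text), a codebook amounts to a lookup table used in both directions:

```python
# A minimal codebook sketch: a table associating each symbol with a unique
# bit sequence, used in both directions.  The symbols and bit assignments
# below are hypothetical; no codeword is a prefix of another, which is what
# lets the receiver recover the symbols without error.
codebook = {"a": "0", "b": "10", "c": "110", "d": "111"}
decodebook = {bits: sym for sym, bits in codebook.items()}

def encode(symbols):
    """Replace each symbol with its bit sequence."""
    return "".join(codebook[s] for s in symbols)

def decode(bitstream):
    """Scan the bits left to right, emitting a symbol whenever a codeword completes."""
    symbols, buffer = [], ""
    for bit in bitstream:
        buffer += bit
        if buffer in decodebook:
            symbols.append(decodebook[buffer])
            buffer = ""
    return symbols

bits = encode("abad")
print(bits)          # 0100111
print(decode(bits))  # ['a', 'b', 'a', 'd']
```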
As we shall explore in some detail elsewhere, digital communication is the transmission of symbolic-valued signals from one place to another. When faced with the problem, for example, of sending a file across the Internet, we must first represent each character by a bit sequence. Because we want to send the file quickly, we want to use as few bits as possible. However, we don't want to use so few bits that the receiver cannot determine what each character was from the bit sequence. For example, we could use one bit for every character: file transmission would be fast but useless because the codebook creates errors. Shannon proved in his monumental work what we call today the Source Coding Theorem. Let $B(a_k)$ denote the number of bits used to represent the symbol $a_k$. The average number of bits $\bar{B}(A)$ required to represent the entire alphabet equals $\sum_{k=1}^{K} B(a_k) \Pr[a_k]$. The Source Coding Theorem states that the average number of bits needed to accurately represent the alphabet need only satisfy $H(A) \le \bar{B}(A) < H(A) + 1$. Thus, the alphabet's entropy specifies, to within one bit, how many bits on average are required to represent each of its symbols.
A four-symbol alphabet has the following probabilities.
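The probability values themselves are not reproduced here. As a hedged sketch of how the theorem can be checked numerically for such an alphabet, assume hypothetical probabilities and one possible codebook:

```python
import math

# Hypothetical four-symbol alphabet: these probabilities are placeholders for
# illustration, not necessarily the values used in the exercise.
probs = {"a0": 1/2, "a1": 1/4, "a2": 1/8, "a3": 1/8}

# One possible codebook assigns shorter bit sequences to more probable symbols,
# e.g. 0, 10, 110, 111.  Only the codeword lengths matter for the bound.
lengths = {"a0": 1, "a1": 2, "a2": 3, "a3": 3}

# Entropy H(A) = -sum_k Pr[a_k] log2 Pr[a_k]
H = -sum(p * math.log2(p) for p in probs.values())

# Average number of bits B(A) = sum_k B(a_k) Pr[a_k]
B = sum(lengths[s] * probs[s] for s in probs)

print(f"H(A) = {H:.3f} bits, average bits per symbol = {B:.3f}")

# The Source Coding Theorem guarantees that some codebook satisfies
# H(A) <= average bits < H(A) + 1.
assert H <= B < H + 1
```

Because these illustrative probabilities are powers of one-half, the average length meets the entropy exactly; in general the theorem only guarantees that some codebook lands within one bit of $H(A)$.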
"Electrical Engineering Digital Processing Systems in Braille."