1844 – Morse Code
: First long-distance digital communication. Each character represented by a pattern of dots and dashes.
1870s – Baudot Code
: 5-bit code used in teleprinters. Allowed automated text transmission.
1963 – ASCII
: 7-bit American Standard Code for Information Interchange. Dominant in early computing.
1991 – Unicode (UTF-8)
: Universal encoding covering all languages and symbols, backwards compatible with ASCII.
Encoding Comparison: The Letter "A"
System
Year
Encoding of "A"
Bits
Morse Code
1844
· −
Variable-length
Baudot Code
1870s
00011
5 bits
ASCII
1963
01000001
7 bits
UTF-8
1991
01000001
8 bits (same as ASCII for Latin letters)
Visual Examples
Morse code
This morse code table shows all letters, digits and some punctuation.
Much more than that is not relevant to the simple equipment used.
Morse code is send over a telegraph line using a key on one end,
and a sounder on the other. A key is basically just a switch with
a spring, a sounder is an electromagnet with a metal bar that makes a clicking
sound.
Baudot code
Baudot invented a device that looked like a 5-key piano. The timing of
pressing the keys had to be exactly right, a clicking sound would let
the user know when to release the keys. The speed at which codes could
be entered would become known as Baud Rate.
Notice that the code itself is stateful. By default, you would be
entering letters, but with the figures command, subsequent code could
represent a digit or punctuation. These type of instruction codes are
known as control code. Another example would be the bell code, that actually rang a bell.
Codes were written to a paper tape, later devices could also use paper
tape as input and have Baud Rates much higher than humans could type.
Typewriter-like devices could write text on a paper based on the paper
tape. Even later, all this functionality would be combined in a single
device called a Teletype.
Ascii
With the introduction of the 7-bit ASCII standard, now there was room for
lowercase letters and more special characters and control codes. Ascii was
the reason that the 8-bit byte became the norm for not just data communication,
but computer architecture in general.
The Ascii control codes were used both for instructing teletypes such as
shown here, as well as other types of hardware terminals. Control codes
such as Start of Heading (SOH), Start Of Text (STX), Acknowledge (ACK),
etc were used to structure messages send over a network.
Teletype Model 33
Unicode
With the globalization of the internet, a universal character set was
needed to represent all possible languages and catface emoji's. 😻
The 64-bit character set is continually growing, which proves the need to have
encodings like UTF-8 to prevent a random text to have 4x the size of Ascii
text.
UTF-8 is a superset of Ascii, so all code based on Ascii would stil work
as expected. That is, except for counting the number of characters,
because 1 character can now suddenly be multiple bytes.