Origins of Cryptography
All of the preceding sections have led us to what is perhaps the most important component of creating secure environments. This is the study and application of methods to transform information by a secret function so that it will be extremely difficult for an attacker to be able to interpret the message in question.
The approach that is central to this and any course on cybersecurity is known as cryptology, which is in effect a combined term stemming from the study of cryptography (from the Greek Kpottro [krypto or hidden] and ypa<|>ia [gratia or writing]). The second component of cryptology is the field of cryptanalysis or the study of ways of breaking cryptographic methods. Both fields have an extremely long history far predating the computer era. In fact, one of the earliest cryptographic methods is said to have been invented by Julius Caesar over 2000 years ago.
Caesar Shift
In the Caesar shift, as well as in any cryptographic method or algorithm, we have a text that is required to be hidden, called the plaintext. We also have a number of different possible ways of transforming that text, each one of those ways determined by the choice of the transformation that we call the key. Once the transformation determined by the key is applied to the plaintext, the result is referred to as the encrypted text or the ciphertext.
In the example of the Caesar shift, suppose that we wish to transmit the message: “ROME WAS NOT BUILT IN A DAY.” And, for this transmission today, we choose as the key the number 3, by which we mean that each letter in the message will be advanced by three positions in the alphabet—in the case of letters near the end of the alphabet, if we advance beyond the letter Z, we continue back to the beginning or A. Thus, our plaintext message becomes:
ROME WAS NOT BUILT IN A DAY
SPNF XBT OPU CVJMU JO B EBZ
TQOG YOU PQV DWKNV KP C FCA
URPH ZDV QRW EXLOW LQ D GDB
So, the ciphertext to be transmitted to the receiver is “urph zdv QRW EXLOW LQ D GDB".
In order for any cryptographic method to be useful, there must be an inverse to the key in order to allow the receiver to apply that inverse key and thus retrieve the original message. In the case of the Caesar shift, the inverse key is the inverse of the value of the key in arithmetic modulo 26—or, in other words, the number that when added to the key (3) adds to 26, in other words 3 + 23 = 0 (mod 26).
Thus, the method of decrypting the encrypted text is to use the same process, in this case by advancing each letter 21 times. Of course, the equivalent is to back each letter in the encrypted text up three times. Again, we use the “cyclic” alphabet: after Z we go to A.
In order to use this or any other cryptographic method, we require a method for formulating plaintext; a number of possible transformations, each defined by a key, and a system or alphabet for the ciphertext. Finally, we require that the choice of the key or transformation be kept secret, but somehow it must also be shared between the sender and the receiver of the encrypted message so that the receiver can apply the inverse key in order to retrieve the original plaintext.
In the case of the Caesar shift, one failing is that there are a maximum of 26 possible keys, one for each letter of the alphabet, so for the attacker trying to break the Caesar shift, it is only necessary to try all 26 potential keys, and one of them will produce the desired result.
Substitution and Transposition
Most cryptographic methods over time have relied on two philosophically different approaches to the transformation of information. These are called substitution and transposition. The Caesar example cited earlier uses the principle of substitution; in other words, each character in the plaintext is transformed according to a function of the character symbol itself. In the case of our aforementioned example with the key “5,” the 26 letters of the alphabet are transformed as follows: A -> F. B -> G, C — H, ..., Z E.
The other approach, transposition, which we will see later, performs the transformation on the position of the symbol in the plaintext list. A very simple example might be described as (1 2 3 4) -> (3 4 2 1), by which we mean that the first of a four-letter group moves to the third position, the second to the fourth, and so on. Thus, in this case, the message “EACH GOOD DEED PAYS” becomes "HCEA DOGO DEDE SYPA.” The inverse key is therefore (1 2 3 4) -»(4 3 1 2).
The Keyword Mixed Alphabet Cryptosystem
The keyword mixed alphabet uses for its keys the set of all words, with duplicate letters removed, in the English language. Indeed, the requirement for words to be English words is imposed only because the distribution of the keyword is much simpler if it is a word rather than an arbitrary character string.
The messenger, having ridden from Lexington to Valley Forge in half a day, was exhausted, out of breath, and indeed near death as he approached General Washington to tell him the secret key for the cipher system: “XRUTGDKWQFP”, he panted, then expired. Did he say “XRUTGDKWQFP" or “XRUTGDKWQFT"? puzzled General Washington.
The method itself uses the keyword to define a mapping or permutation of the message space alphabet. To define the key, the alphabet is written in its normal order, and under it is written a permuted alphabet, with the letters of the keyword followed by the remaining letters of the alphabet. For the moment, consider only that we are using just the 26 uppercase characters.
Suppose the keyword is: FACETIOUSLY
Then the regular alphabet is (with the spaces inserted just for readability):
ABCDE FGHIJ KLMNO PQRST UVWXY Z
and the permuted alphabet is:
FACET IOUSL YBDGH JKMNP QRVWX Z
Written for ease of reading, we have:
ABCDE FGHIJ KLMNO PQRST UVWXY Z
FACET IOUSL YBDGH JKMNP QRVWX Z
The encryption maps each letter of the message text to a letter of cipher text according to the permutation defined earlier. Thus, “MAY
THE FORCE BE WITH YOU” becomes “DFX PUT IHMCT AT VSPU XHQ,” or, more likely, “DFXPU TIHMC TATVS PUXHQ.” It is common practice to group the text in equal size blocks. On the one hand, the spaces allow for easier human reading, but more importantly, the normal position of the blanks or spaces tells us the word lengths.
The Vigenère Cryptosystem
The Vigenere cipher was a widely used cryptosystem dating back to the sixteenth century, using a keyword combined with a Caesar shift. If the keyword is “FACETIOUSLY,” an 11-letter word, as before, the encryption will use 11 different Caesar shifts periodically. (Each letter determines a Caesar shift, or modular addition.) Suppose that 0*^ A, 1 B, ..., 25 «-> Z, as usual. Then, the first letter to be encoded uses the shift corresponding to F, the second to A, the third to C, and so on until the cycle repeats:
Choose a key word, perhaps:
Plain text:
IT'S A LONG WAY TO TIPPERARY ...
Key:
FAC E TIOU SLY FA CETIOUSLY
Cipher text:
OUV F FXCB PMX ZP WNIYTMTDX
One-time Pad Encryption
The “one-time pad” plays an important role in the history of encryption. It was first described in 1882 by Frank Miller.
The principle of the one-time pad is an elaboration on the Vigenere cryptosystem as described earlier. In the Vigenere system, the key word is chosen (FACETIOUSLY in our example), and then encryption consists of a different Caesar shift based on the corresponding element in the pad corresponding to the plaintext letter in the same position. However, the breaking or cryptanalysis of the Vigenere method is based on the realization that after a certain number of plaintext letters are encrypted, the period related to the key may be determined. In the case of the example with keyword FACETIOUSLY, the same Caesar shift will take place with characters in positions 1, 12, 23, 34, and so on: in other words, every 11th letter.
The concept with the one-time pad is that the length of the key word can be stretched to an enormous length. Indeed, for the true one-time pad, the keyword should be of infinite length. However, the creation of an infinite-length string that does not repeat is practically impossible. But it is possible using randomization techniques to generate long sequences of letters that may only repeat after thousands or indeed millions of terms in a sequence.
Long after one-time pads began to be used in military applications, it was proved by Claude Shannon of Bell Laboratories in 1949 (Shannon, 1949) that the one-time pad could be proved in its original description as the most secure encryption possible, despite the fact that the true one-time pad of infinite length is essentially impossible to construct.
For example, one possible one-time pad might be: |
SISAVCUIIDFZD |
MMMYGLXSKGKSI |
JCAHOOFNCWFLV |
LZRMGYVGIEDQA |
OXJKIYWGENDID |
V W... |
So, let us use this quote from Albert Einstein to encrypt: |
The difference between stupidity and genius is that genius has its limits.
THE DIFFERENCE BETWEEN STUPIDITY AND GENIUS
IS THAT GENIUS HAS ITS LIMITS
THEDI IUSIS |
FFERE THATG |
NCEBE ENIUS |
TWEEN HASIT |
STUPI SLIMI |
DITYA TS |
NDGEN |
SISAV WFLVL |
CUI ID ZRMGY |
FZDMM VGIED |
MYGLX QAOXJ |
SKGKS KIYWG |
IJCAH EN |
OOFNC |
DIDVW (etc.)
Using the correspondence A «-> 1, B «-> 2, C «-> 3,..., Z <-> 26, add the value for each plaintext character to the value of the corresponding element of the pad (in mod 26, or subtracting 26 if the sum is >26). The encryption of these characters is the character corresponding to the sum.
T -» 20, S -» 19; |
20 |
+ |
19 = 39; |
remainder mod 26 |
||||
13, |
13 -» M |
|||||||
H -* |
8, I -» |
9; |
9 |
+ |
9 = 17; |
remainder |
mod |
26 |
17, |
17 -» Q |
|||||||
E -» |
5, S -* |
19; |
5 |
+ |
19 = 24; |
remainder |
mod |
26 |
24, |
24 -» X |
And so on.
The Playfair Square
Here is a more sophisticated method from the nineteenth century known as the Playfair Square. To create a Playfair Square, we will build a 5x5 square using the regular English alphabet (Patterson 1987).
Whoops! The alphabet has 26 letters and yet 5x5 = 25.
Our task is to reduce the alphabet to 25 letters, which we accomplish by merging two letters together, I and J. So, in the algorithm, in the plaintext, every J is changed to an I.
Now pick a word to form the key. If the keyword has a duplicate, for example, suppose the keyword is CONSPIRATORIALLY, then by deleting, we will use CONSPIRATLY as the key.
Now create the Playfair Square by writing the letters of the key into the 5x5 square in row' order. Then fill in the rest of the (25-letter) alphabet, also in row' order. The Playfair Square encryption method is based not on substituting single letters (therefore 26 different substitutions) when taking letters in pairs (or digrams). There are 26x25 = 650 digrams; therefore, the total number of possible permutations of those 650 symbols is a number approximately 8.1 x 10^{1547}, that is, greater than a 1 followed by 1547 zeros. In this case, the Playfair Square is:
c o n s p I R A T L Y B D E F G H K M Q U V W X Z
We will refer to this square as row 1, “CONSP”; Column 1, “CIYGU”; and so on.
The 5x5 square is essentially the key, although all that has to be shared with the receiver is the keyword, CONSPIRATORIALLY in this case (Figure 8.1).
Now to encrypt a message, the square is based on the symbols being not letters but pairs of letters. We first break the plaintext message into digrams. For example, the plaintext message COMMITTEE becomes:
CO MM IT TE EZ (if the plaintext has an odd number of letters, just attach any letter at the end).
Encrypt rule 1: If the digram is in the same row or column, shift each character to the right (for a row) or down (for a column); thus CO-► ON. IT-> RL.
Encrypt rule 2: If the digram defines a rectangle, take the other corners of the rectangle, in the order of the row of the first message character; thus, for example, EZ -> FX.
But, for our keyword CO MM IT TE EZ. how do we deal with MM? The answer is: we don’t. We preprocess the message by inserting a letter (Q is best) between every pair. Thus, actually COMMITTEE becomes COMQMITQTEQE or CO MQ MI TQ TE QE. Thus, we have ensured that in the text to be encrypted, we will have no double letters and thus either rule 1 or rule 2 will apply. The encryption of the new plaintext becomes
Encryption: |
Rule: |
Location: |
C0->0N |
Rule 1 |
Row 4 |
MQ->QG |
Rule 2 |
Row 1 |
MI->GT |
Rule 2 |
Rows 4,1 |
TQ^LM |
Rule 2 |
Rows 2,4 |
TE->EM |
Rule 1 |
Column 4 |
QE->MF |
Rule 2 |
Rows 4,3 |
Now you can send the message. The receiver has the same square and has two encrypt rules, rule 1 going left if the pair is in the same row, going up if they are in the same column, and the rectangle rule 2 is the same as for encryption.
Finally, as the receiver has the decrypted but slightly altered message, the receiver pulls out all the Qs inside pairs and interprets “I” as “J” where necessary.
When we change all Js into Is, this will almost never confuse issues in the decryption. And, with respect to the inserted Qs between double letter pairs, can this ever become a problem? Since in English, a Q is almost always followed by a U, this will almost never create a problem—except possibly for VACUUM.
Rotor Machines
Early in the twentieth century, machine production of ciphers became possible. To mechanize the production of ciphertext, various devices were invented to speed the process. All important families of such devices were rotor machines, developed shortly after the time of World War I, to implement Vigenere-type ciphers with very long periods.
A rotor machine has a keyboard and a series of rotors. A rotor is a rotating wheel with 26 positions—not unlike the workings of an odometer in an automobile. Each position of the rotor wheel completes an electric contact and, depending upon the position, determines a different Caesar shift. When a key on the keyboard is depressed, a letter is generated dependent on the position of the rotors.
World War II and the Enigma Machine
A US patent for a rotor machine was issued in 1923 to Arthur Scherbius (IT History Society, 2018). It essentially formed the basis for the German Enigma machine. A variation on the rotor machine design was developed by the German Armed Forces leading up to World War II and called the Enigma machine (Figure 8.1). The now well-documented British success at breaking the code at Bletchley Park was a major factor in the war effort. In 2014, the movie The Imitation Game depicted reasonably well the efforts led by Alan Turing in breaking the Enigma code (Sony Pictures Releasing, 2014). Turing, considered by many the father of computer science, died tragically in 1954 at age 41.
Problems
- 1. Find two strings of the greatest length that make sense in English and are related by a Caesar shift. (Example: t4(CAP) = GET. Thus, 11 (BZO)=CAP and 15(BZO) = GET. Ignore blanks.) If the “CAP/GET” example is length 3, the record in our classes is length 7.
- 2. What is unique about the keyword FACETIOUSLY?
- 3. Decrypt the following cipher using Caesar’s shift:
RVPXR WPCXV JTNQR VJWXO ONAQN LJWCA NODBN
4. Using the (1 2 3 4 5) -> (4 3 5 1 2), encrypt "FOURSCORE AND SEVEN YEARS AGO."
Figure 8.1 The Enigma machine orrectshown at the Swiss Transport Museum. (Creative Commons: Enigma_Verkehrshaus_luzerne.jpg licensed with CC-by-sa-3.0.)
5. Decrypt this message that uses a transposition cipher:
RTEH IANI ANPS SUN NMIA NLIY PTEH NLIA
6. Using the Vigenere cipher method with a ten-letter keyword of your choice, encrypt the message:
I have a dream that one day on the red hills of Georgia the sons of former slaves and the sons of former slave owners will be able to sit down together at the table of brotherhood.
- 7. Create a Playfair Square with keyword IMPOSSIBLE and encrypt the message JUSTICE EVENTUALLY WINS OUT
- 8. Biliary and Hill use a Playfair cipher to prevent their communications from being intercepted by an intruder, Trumpy. They have agreed on a common key, from the word “ALGEBRAICALLY.”
a. Construct their jointly used Playfair Square.
A |
L |
G |
E |
B |
R |
I |
C |
Y |
D |
F |
H |
K |
M |
N |
O |
P |
Q |
s |
T |
U |
V |
w |
X |
z |
b. Biliary receives the following Playfair-encrypted message from Hill:
RP GX HI PG BS SY RT RV TO SO
ML AT SP QW MG SG VH KB
Find Biliary’s decryption.
References
IT History Society. 2018. Dr. Arthur Scherbius Bio/Description. https:// www.ithistory.org/honor-roll/dr-arthur-scherbius.
Patterson, W. 1987. Mathematical Cryptology. Rowman and Littlefield, Totowa, NJ. 318 p.
Shannon, C. 1949. Communication theory of secrecy systems. Bell System Technical Journal, 28(4). 656-715.
Sony Pictures Releasing. 2014. The Imitation Game (film).