Chris de la Iglesia
About Blog Github

Cryptography from the Ground Up

May 23 2021

One of the most interesting and useful things computers can do for us is cryptography. We can hide messages, validate identities, and even build entire trustless distributed systems. Cryptography not only defines our modern world, but is a big part of how we will build the world of the future.

However, unless you want to dedicate years and a PhD to studying the subject, the actual workings of cryptography can be hard to learn. It can involve a lot of pitfalls and if you dare build from scratch, you are bound to make a fool of yourself. Why?

In my opinion, it comes down to history. Cryptography has had centuries of methods that have been made, broken, and remade again. Most tutorials on cryptography focus on the what: do this, don’t do that, follow the rules. But they skip over the why: why do we do the things we do? What are we trying to avoid?

To understand the why, we need to understand how we got here in the first place. And to do that, let’s set computers to the side for the moment and delve into the world of classical cryptography.

Starting at the Top: The Caesar Cipher

APPLES ARE TASTY
BQQMFT BSF UBTUZ
CRRNGU CTG VCUVA

The Caesar Cipher is one of the oldest and simplest methods of encryption. The basic concept is incredibly simple; just take each letter and shift it up or down the alphabet by a known number of places.

For example, shifting forward by three means A becomes D, G becomes J, M becomes P. To get the original message, just move back down the alphabet by the same number. The resulting message doesn't resemble the original at all. Easy, right?

Even with this, we can start to see the outline of cryptography: the original message, or plaintext, the obfuscated resulting message, or ciphertext, and the number to shift by, or the key.

plaintext
+
key
ciphertext
ciphertext
-
key
plaintext

Interactive Caesar Cipher

Using this simple example, lets think about how to break the cipher and decrypt an unknown message. If we were given just some ciphertext, such as KJQNSJ, how might we figure out the original plaintext? Well, with only 26 possible ways of shifting letters up and down, it becomes trivial to simply try all the ways and recognize the original message.

This leads us to one of the first requirements for an effective encryption scheme: a large effective key space size. With only 26 ways of encrypting any piece of text, our key space is incredibly small, even small enough to be broken by hand. So, how can we increase the size of our key space so an attacker can’t simply try all the keys?

State of the Art (for the 16th century): The Vigenère Cipher

The Vigenère cipher is named after Blase de Vigenère, but was actually invented by Giovan Battista Bellaso in 1553 and could not be cracked for 300 years. The improvement on the Caesar cipher is relatively simple.

Instead of shifting all the letters by the same amount, why not shift each letter by a different amount? If every letter shifts by a different amount, it becomes impossible to tell what the original letter was without knowing how much it was shifted.

Unique shifts for every letter

In practice, shifting each letter by a unique amount becomes impractical. After all, if you could securely communicate how much to shift every letter in a message, you might as well just communicate the real message. So, the Vigenère Cipher goes halfway: it defines a set of shifts that repeats, encrypting the whole message with a relatively small key.

This set of shifts is represented by a simple, easy to remember word. Each letter in the word corresponds to a shift equal to that letter’s place in the alphabet. For example, the letter A means to shift by 0 (keeping the original letter), the letter D means to shift by 3, and so on.

Every letter represents a shift

If the key word is LEMON, then you would shift the first letter of the plaintext by 11, the second letter by 4, the third letter by 12, and so on. The key point here is that we repeat the keyword to encrypt every section of the plaintext. This introduces problems, as we will see later on, but greatly simplifies the encryption/decryption process.

Message being encrypted with the Vigènere Cipher

This forms a much more complex cipher with many more key options, but can still be done by hand, oftentimes with the help of a table called a Tabula Recta. You can try it for yourself here:

plaintext
+
key
ciphertext
ciphertext
-
key
plaintext

Interactive Vigenère Cipher

Now, how can we break this? In order to decrypt the message, we would need to know every letter in our key word rather than just one shift for the entire message. Our effective key space size is now 265, which is much larger than the 26 of the Caesar cipher. This can no longer be broken by hand, which is why this cipher was not broken for centuries.

So, what can we do? Now that the brute force solution is out, we need to be a bit clever about deducing what the key is. In order to do that, we need to learn about frequency analysis.

More Accurate Guesses: Frequency Analysis

Let’s go back to the Caesar cipher for a moment. Let’s say we choose not to brute force the solution, what can we do? What information can we glean just from the ciphertext?

Well, since every letter in the plaintext corresponds to a certain letter in the ciphertext (A corresponds to F, for example), the frequency of each letter is maintained.

In most languages, each letter occurs at a fairly consistent rate. For English, the distribution is seen above. If we have a large enough ciphertext, we can calculate the frequency of the letters and match them up to the frequency in normal English, and have a pretty good idea of what letters substitute for what.

Translation of frequencies from original text

In this example, I took the Apple page from Wikipedia and encrypted it with the Caesar cipher. Looking at the frequencies, we can do a bit of analysis and guess that J corresponds to E and A corresponds to F, so the Caesar shift value is probably five.

So, how does this help us with the Vigenère Cipher? At first glance, frequency analysis isn’t useful because the letters are encrypted with different shifts. This hides the frequency data of the original text because each letter can be using different shifts. Here, I encrypted the Apple Wikipedia page with the key word LEMON, and the frequencies no longer line up with English.

Vigenère Cipher key

The frequencies are distorted

However, the Vigenère cipher has a fatal flaw: the set of shifts repeats. Take our example with LEMON as the key word. The first, sixth, and eleventh letter of our plaintext are all encrypted using the letter L. Every fifth letter ends up using the same shift, which means that we essentially have a set of five Caesar ciphers.

If we know the length of the key word then we can separate out the letters into separate groups and run frequency analysis on each group to find the shift. Putting it all back together gets out our key word. Taking a look at the first group of letters, we can see that P is the most common, which is just E shifted by L.

Vigenère Cipher key

The frequency pattern returns

This does require us to find the length of the key word, but we can use separate analysis for that. To get the length, we can use Kasiski examination (named after Friedrich Kasiski, the first person to publish an attack on the Vigenère cipher).

Just by chance, the same plaintext will be encrypted with the same parts of the key word, forming repeating sections of text. By counting the distance between the sections, we know that this distance must be a multiple of the length of our key word.

Repeating sections of text

In this example, if the distance between sections is 32, then the key word much be of length 1, 2, 4, 8, 16, or 32. Multiple repeated sections give us multiple distances, further reducing the possible lengths of the key word.

None of this is guaranteed. The repeated sections could be a coincidence, the frequency analysis could be off, we might not even have enough text to make an accurate guess.

However, the point of an attack on encryption is not to break it in one step, but to make it easier to break. Simply gaining partial knowledge of the key word or the plaintext is enough to allow us to test multiple likely options and brute force the solution.

How to defend against frequency analysis

The core reason that frequency analysis works here is that every letter in the output ciphertext depends on exactly one letter in the key word and one letter in the plaintext. This makes it possible to isolate the portion of ciphertext that use the same letter from the key word and then that portion maintains the same frequency signature as the original text.

To put some theory to this, this means that the Vigenère cipher fails both confusion and diffusion. In simple terms, confusion means that every letter of the output depends multiple parts of the key and diffusion means that every letter of the output depends on many parts of the plaintext.

How do we fix this? We need to find a way for every letter of the output to depend on multiple parts of the plaintext and key. The traditional way to do this is to “mix up” the input letters in a deterministic way using a substitution-permutation network.

From Wikipedia, By GaborPete - Own work, CC BY-SA 3.0, Link

The SP-network mixes up our inputs and applies our Vigenère cipher several times, mixing up the data inbetween each application using substituion and permutation ciphers.

The substitution cipher swaps out each block of text with a new block, where changing one letter of the input block changes multiple letters of the output block. The permutation cipher then swaps those blocks around in the input, so the next application of the key will apply to different parts of the text.

This network satisfies both confusion and diffusion, because the substitution cipher causes every letter of our output to depend on multiple letters of the input and the permutation cipher causes every letter to depend on multiple parts of the key. Both of these properties help prevent frequency analysis and other forms of attacks because there is less of an obvious relation between the ciphertext and the inputs.

So, what ciphers can we use for this? One simple permutation cipher we can use is the Rail Fence cipher, which writes the message in a zigzag pattern across multiple lines and then reads them horizontally. This is easy to understand and, continuing with our theme of classical cryptography, can easily be done by hand. Unfortunately, it’s very easy to break but that will be fine as we won’t be using it alone.

Rail Fence Cipher

A simple substiution cipher we can use is just to shift every letter in the message by the letter preceding it. This means that at every step, every output letter will depend on two input letters, and since we are applying it multiple times every output letter will end up depending on many input letters. This is also fairly weak on its own, but it can be easily computed by hand.

Simple Substitution Cipher

By applying the Vigenère Cipher and our substitution and permutation ciphers in alternating patterns, we can get an encryption scheme that is much stronger than the sum of its parts. The Vigenère cipher obscures the text, and our SP-network obscures the relation between our ciphertext and the key or plaintext.

Repeated applications

Vigenère Cipher key

Frequencies from our encryption network

As we can see, our frequency curve is very flat and contains very little information. With better a better SP-network, a better cipher than Vigenère, and much larger keys we could get this much more uniform, but we’ll leave that to the computers.

Conclusion: Should we use any of this?

Hell no! One of the key takeaways from all of this is that the attacks on encryption schemes are not obvious up front, and the best way to avoid such attacks is to use well known, secure, and carefully designed algorithms written by the people who know the most about modern attacks and how to prevent them.

Doing all of this using simple encryption methods helps us learn the common ways encryption can fail, as well as ways we can think to solve those problems. However, this also means that we are missing out on many of the protections and nuances developed to protect against the many attacks on encryption.

Hopefully this has given you a glimpse of how modern cryptography came to be, and more importantly why it does some of the things it does. I have a lot of reasoning through these sort of problems, and I hope you do too.