Step 0: Just Don’t Code Your Own Crypto
Really, don’t do. At least that’s what people say. There are so many little things you’ll miss. Not to mention that ECB’s deterministic nature makes it awful to use for cryptography (so if you do roll your own, don’t do it with AES-ECB). But, if you must do it (like I did), you’ll learn quite a lot of things in the process. And truthfully, it is a bit cool to have a working implementation of a crypto standard that’s not just a wrapper of OpenSSL a la Ruby, etc. The reward is intangible! But I felt all warm & fuzzy inside.
Step 1: Learn All The Things.
Implementing AES is a relatively simple process that combines over a dozen relatively simple processes, which makes it, in fact, not quite so simple as one thought at the start. AES is a block encryption algorithm known as an “iterated block cipher”. It uses a fixed block size and the same key for both encryption and decryption. The key is expanded for each round of encryption to form an “Expanded Key”. In each round, a slice is taken from the Expanded Key and only used for this round.
It’s sort of an advancement of repeated-key XOR. What’s different? To name a few:
- there are multiple rounds of XORing the key with the plaintext
- there are other bit scrambling processes besides XOR to make the message harder to decrypt, like:
- mixing columns
- shifting rows
- multiplication in a Galois field
- bitwise transformation using lookup tables
- the key doesn’t repeat: an “expanded key” is produced long enough for a fresh slice in each round
I won’t go into every step of the implementation. The following resources proved invaluable.
- NIST standard
- good for validating expanded keys
- every step laid out, easy to follow pdf
- good for coding individual functions
- step-by-step for a single message
- good for validating each and every step of your functions for a 16-byte input (contains all the intermediate state matrices)
- stick figure guide
Step 2: Some Key Concepts
I created lots of helper functions to solve these language idiosyncrasies and to keep the format of my data the same when it was passed between functions. I made the following design decisions:
- Represent everything as hexadecimal values internally, padded two two characters per hex value
- Unless performing an operation on the 4x4 state matrix, keep the state as a simple 1xNUMBER_BYTES array
- Create functions that can be reused for both encryption and decryption with, for instance, a flag (example: shiftRows slices “forward” or “backward”) based on whether we are encrypting or decrypting
- Minimize the number of conversions between a flat byte array and a 4x4 state matrix
- Try to encapsulate functionality and ensure reuse (e.g., my padding implementation of PKCS7 can be generalized to any block size, not just 16 bytes)
Step 3: Code All The Pieces
Go through the docs and simply code the pieces one-by-one. You can look at a bit of my code to see how. Most of the pieces are very straightforward. One thing that is conceptually a bit confusing at first is that when multiplying numbers together, you are using a finite field. This means that there are a finite number of elements, so multiplication does not give the results you are used to seeing with good old decimal numbers in the reals.
While you don’t need to know the maths in detail to code the multiplication (you can just code in the lookup tables to translate hex values for encryption and decryption), it helps to understand what’s going on conceptually. It’s easy enough: do a lookup of both numbers on one table, sum the results, if the sum is greater than 255, wrap around, then look up the result on the results table.
Step 3: Validate Your Results
Using step-by-step for a single message, walk through your encryption and decryption by logging out your state matrix at every step for a 16-byte encryption round. Decryption is just encryption steps in reverse, so walk backwards. Make sure your expanded key is correct for every round.
To validate your padding, you can use Ruby’s openssl wrapper. There’s a simple script in my repo, but you can also do it using the command line.
At first I couldn’t figure out how Ruby was padding, but it turned out to be simple PKCS7.
Above is my implementation. You can see some of the helper functions going on here (hexPad to make sure each hex character is length==2). I generalized the padding behavior in
padToN to pad any array with any number of characters. I then used this to pad out the key used if it is fewer than the number of bits of encryption, or to slice it if it is greater than the number of bits of encryption.
PKCS7 padding is simple to understand. If the size of the message is not an integer multiple of the block size, bytes are added until it becomes an integer multiple. The value of the bytes added is equal to the number of bytes that are added. So if the message is three bytes short of an integer multiple (say, 13 bytes), three 0x03 bytes are added to the end of the block.
On the other hand, if the message is an integer multiple of the block size, an entire block of 0x15 is added (16 0x15 characters) to the end of the message. Why? In order to signify that the end of the message has been reached in encryption. It’s a sort of checksum (when you hit this repeated block, you know you’ve received the entire message). In the decryption process, the extra characters can be chopped off. I left them on in my demo for demonstration purposes.
Step 4: WRITE TESTS
I wrote a full test suite for all my functions using Chai and Mocha. Here’s a great link on how to use them in the browser, independent from NPM.
You absolutely must write tests, or debugging will become hopelessly complex.
Write tests before you write functions.
Make sure the individual functions work before plugging them together. Make sure to double-check the format and data type of each input and output when you are plugging functions together (is that an array, or a string? is that in hex or binary or text or base64?).
A few things not to forget (you will, and then you’ll kick yourself later):
Encryption vs decryption: you’ll likely use the same named functions, but there will be some sort of flag to specify which of the two you’re performing. Don’t forget to set this flag.
The block size is 16 bytes. The block size is 16 bytes. It does not change no matter what the number of bytes in the KEY. The key becomes longer, but it still operates on groups of 16 bytes. More rounds take place, which is why the key must be longer.
Don’t forget that in 32-bit mode, the extended key has a slightly different step: there is an extra subWord on the EK req’d every 8th round, starting @ round 12
Step 5: Now Code It Yourself (or don’t!)