Thanks to Coursera, I’ve had the opportunity to take several courses on computer science recently.[ref]My education background covers physics, math, political science, and management. Until Coursera, I had never taken any formal computer science courses in my life. It’s been eye-opening to learn the theory behind things I’ve learned first-hand in the field.[/ref] The latest course has been my favorite, as it covers the field of cryptography and includes actionable programming assignments.
Our first project was a throw-away hack of the Viginere Cipher. I’d cracked this system before, actually, but it was fun to go through an explanation of why it was so vulnerable to eavesdropping. The second project was meant to demonstrate how using the same key for a one-time pad encryption scheme multiple times was insecure. Through careful frequency analysis, and a technique called a “crib drag,” I was able to quickly and easily recover the plaintext of multiple encoded messages.
The most recent project, though, had us attacking AES encryption directly. I was a bit shocked that, not only did our approach work, but it was incredibly simple in implementation.
Padding Oracle
The principle of our attack exploited a server using 128-bit AES encryption, though the nature of the encryption wasn’t what caused the issue. Instead, it was the predictable nature of message padding (a block cipher like AES expects an integer number of blocks, so padding must be added to a message in order to make it long enough to encrypt). The server in question used PKCS #7-style padding, where the end of the message is padded with a repeating integer referencing the number of padded bytes.[ref]A message padded with 4 byes would end in [cci]\04\04\04\04[/cci] where a message with 8 padded bytes would end in [cci]\08\08\08\08\08\08\08\08[/cci].[/ref]
Neither the encryption scheme or the way it uses padding are vulnerable in and among themselves. The server implementing the encryption, though, leaked vital information about both.
The server was set up in such a way as to return useful programmatic messages to clients interacting with it across a socket. If the message was received and all was well, the server responded with [cci]1[/cci]. If the message was corrupted – it was missing information and was therefore an invalid length – the server responded with [cci]-1[/cci]. If the message contained the correct number of bytes, but was invalid, the server responded with [cci]0[/cci].
The number of times I’ve written an application, method call, or publicly-accessible web service that exhibits the same kind of behavior for development and debugging purposes is … sobering.
We were given a string of already-encrypted bytes (i.e. a message intercepted by sniffing communication with the server) and tasked with using the above server behavior to our advantage to break the system. It turns out, this was relatively easy.
A block cipher works in such a way that manipulating a byte in one block has a predictable result on another block. We were able to abuse this to discover, one byte at a time, every single byte of the short message. The process for doing so was so straightforward, I wrote a quick app in Ruby that processed the entire string for me automatically.[ref]First I cracked the message by hand. It was an arduous process, but still took less than an hour in total. The scripted method completely recovered the target plain text string in just a few minutes, revealing one byte at a time in a few seconds each.[/ref]
Attacking the way the server leaks information about itself is using a side channel to decrypt the message. It’s not an uncommon attack, but I was surprised it was so simple!
Applications in the Real World
Different components of our products can be vulnerable to similar attacks in different ways. A website returning a 4XX level error in response to badly-formed (encrypted) data versus a 2XX response is one such example. As engineers, we code things in a certain way to be helpful – if we intercept an error we want to provide feedback. Unfortunately some of that feedback might provide a useful exploit route for individuals with nefarious intent.
When it comes to usernames and passwords, I know it’s considered a vulnerability when some sites take longer or return a different message for usernames that exist but fail to match a particular password. I’ve seen backs fall prey to this – an invalid username results in a quick error; a valid username (i.e. account number) with an invalid password results in a longer feedback look and a redirect. This behaviors leaks information, and I’ve personally had my bank account hijacked due to this exact kind of behavior.
Seeing what I thought was a reasonable, well-implemented system fall prey to attack so easily has driven home the point that just about any application can be vulnerable to attack. I’ll be looking at my own code with a much more critical eye moving forward. You should as well.