Cryptopals: Set 1 – challenge 4

Now that we know how to crack a single-byte XOR cipher, let's expand and crack even more through Cryptopals' next challenge.

Last time, we managed to break a single-byte key XOR cipher. This might not be useful in practice, but is a necessary stepping stone to understand more advanced cryptographic attacks.

To further drive home this point, we’re going to use the same attack to target multiple potential ciphertexts all at once.

Detect single-character XOR

One of the 60-character strings in this file has been encrypted by single-character XOR.

Find it.

(Your code from #3 should help.)

Refactoring for usability

First things first, we need to update our procedural code from last time to make it a bit more reusable. This means wrapping our cracking code in functions so we can invoke it again and again:

function attemptSingleByteXor(Bytes $cipher): Bytes
{
  $candidates = [];
  for($byte = 0; $byte < 256; $byte++) {
    $key = new Bytes(str_repeat(chr($byte), $cipher->length()));

    $score = scoreEnglishPhrase($cipher->xor($key)->asRaw());

    $candidates[$byte] = $score;
  }

  // Sort the most likely candidates first
  arsort($candidates);

  $best = array_key_first($candidates);
  $key = new Bytes(str_repeat(chr($best), $cipher->length()));

  return $cipher->xor($key);
}

This function will always return a Bytes object that’s a proper candidate for a decrypted plaintext. However, that doesn’t mean that it’s always returned something we can read. We can test things against yesterday’s exercise pretty easily:

$cipher = Bytes::fromHex('1b37373331363f78151b7f2b783431333d78397828372d363c78373e783a393b3736');

echo attemptSingleByteXor($cipher)->asRaw();

But once we load up the full list of candidate strings, we might get gibberish in response.

The full exercise

First, we’ll need to pull in our list of hex-encoded strings. Then, one at a time, we’ll pass them through our cracking function to try to get some intelligible response.

To ensure things are intelligible, we’ll pass the resulting raw text again through scoreEnglishPhrase(). Then we’ll collect an array mapping Bytes to their scores so we can check (again naively) for the highest scoring candidate.

$candidates = [];
$fp = fopen('ciphers.txt', 'r');
if ($fp) {
  while (($buffer = fgets($fp, 4906)) !== false) {
    $line = trim(preg_replace('/\s+/', '', $buffer));
    $cipher = Bytes::fromHex($line);

    $plaintext = attemptSingleByteXor($cipher)->asRaw();

    $candidates[$plaintext] = scoreEnglishPhrase($plaintext);
  }

  fclose($fp);
}

// Sort the most likely candidates first
arsort($candidates);

$best = array_key_first($candidates);
echo $best;

Now that we’ve iterated across our entire file of candidates, we’ve discovered Now that the party is jumping is the hidden plaintext.