Trustpilot challenge: crack AES encrypted message

17 Jul 2014

Yesterday I came across a coding challenge posted by Trustpilot, a Danish company looking to hire developers. According to the company, the regular approach of job ads results in too many unqualified applicants, so in order to submit your resume, first you must decrypt a message to learn how.

From the problem description, we see that a brute-force approach is a viable solution. Assuming each of the first six bytes of the key to be in the range zero through 16, both inclusive, the possible search space consists of 17^6 or 24,137,569 keys.

Using F#, first we need to get basic AES encryption and decryption working. Strictly speaking, the encryption function isn't required, but it helps validate the structure of the solution in that we can try out different approaches on a smaller scale first.

namespace TrustpilotChallenge

open System
open System.IO
open System.Security.Cryptography

module Cryptography =
    let encryptStringToBytes (plainText: string) key iv =
        use aes = new AesManaged(BlockSize = 128, KeySize = 256)
        let encryptor = aes.CreateEncryptor(key, iv)
        use ms = new MemoryStream()
        use cs = new CryptoStream(ms, encryptor, CryptoStreamMode.Write)
        use sw = new StreamWriter(cs)

        // by explicitly closing the stream, we force flushing its contents.
        // Otherwise, calling ToArray() will yield an empty array. The
        // C# using idiom would flush implicitly by disposing of the
        // StreamWriter instance when the using block ends, whereas F#'s 
        // use disposition happens only when we exit the function. We 
        // could've used the using function instead.

    let decryptStringFromBytes (cipherText: byte[]) key iv =
        use aes = new AesManaged(BlockSize = 128, KeySize = 256)
        let decryptor = aes.CreateDecryptor(key, iv)
        use ms = new MemoryStream(cipherText)
        use cs = new CryptoStream(ms, decryptor, CryptoStreamMode.Read)
        use sr = new StreamReader(cs)

Next comes the main part which makes use of the Cryptography module defined above. Rather than describe the code here, I've embedded comments directly in the source code elaborating on the approaches I tried. The tricky part is validating when a key is indeed the correct one.

module Program =  
    let main args =
        if args.Length <> 2 then 
            printfn "Must call TrustpilotChallenge with batch begin and end args"
            printfn "Example: TrustpilotChallenge.exe 0 3"

        let batchRange = [|Convert.ToByte(args.[0])..Convert.ToByte(args.[1])|]
        let encrypted = Convert.FromBase64String("yptyoDdVBdQtGhgoePppYHnWyugGmy0j81sf3zBeUXEO/LYRw+2XmVa0/v6YiSy9Kj8gMn/gNu2I7dPmfgSEHPUDJpNpiOWmmW1/jw/Pt29Are5tumWmnfkazcAb23xe7B4ruPZVxUEhfn/IrZPNZdr4cQNrHNgEv2ts8gVFuOBU+p792UPy8/mEIhW5ECppxGIb7Yrpg4w7IYNeFtX5d9W4W1t2e+6PcdcjkBK4a8y1cjEtuQ07RpPChOvLcSzlB/Bg7UKntzorRsn+y/d72qD2QxRzcXgbynCNalF7zaT6pEnwKB4i05fTQw6nB7SU1w2/EvCGlfiyR2Ia08mA0GikqegYA6xG/EAGs3ZJ0aQUGt0YZz0P7uBsQKdmCg7jzzEMHyGZDNGTj0F2dOFHLSOTT2/GGSht8eD/Ae7u/xnJj0bGgAKMtNttGFlNyvKpt2vDDT3Orfk6Jk/rD4CIz6O/Tnt0NkJLucHtIyvBYGtQR4+mhbfUELkczeDSxTXGDLaiU3de6tPaa0/vjzizoUbNFdfkIly/HWINdHoO83E=")
        let iv = Convert.FromBase64String("DkBbcmQo1QH+ed1wTyBynA==");

        // generate keys lazily
        let genKeys batchRange =
            seq {
                for b0 in batchRange do
                    printfn "%s: Starting batch: %A" (DateTime.Now.ToLongTimeString()) b0
                    for b1 in [0uy..16uy] do
                    for b2 in [0uy..16uy] do
                    for b3 in [0uy..16uy] do
                    for b4 in [0uy..16uy] do
                    for b5 in [0uy..16uy] do
                        yield [|b0;b1;b2;b3;b4;b5;0uy;0uy;
                                0uy;0uy;0uy;0uy;0uy;0uy;0uy;0uy|] }

        // validate key when found (need to run for yourself)
        //let key = [|1uy;2uy;3uy;4uy;5uy;6uy;0uy;0uy;
        //            0uy;0uy;0uy;0uy;0uy;0uy;0uy;0uy;
        //            0uy;0uy;0uy;0uy;0uy;0uy;0uy;0uy;
        //            0uy;0uy;0uy;0uy;0uy;0uy;0uy;0uy|]
        //let decrypted = decryptStringFromBytes encrypted key iv
        //printfn "%s" decrypted
        for key in genKeys batchRange do
            // uncomment for testing purpose
            //let testMessage = "Hello world"
            //let encryptedMessage = encryptStringToBytes testMessage key iv
            //let decryptedMessage = decryptStringFromBytes encryptedMessage key iv            

                let decrypted = decryptStringFromBytes encrypted key iv
                // an attempt assumming decrypted text contains known token
                if decrypted.ToLower().Contains("trust") 
                then printfn "Candidate key by 'trust': %A" key.[0..5]

                // an attempt using simple statistical analysis by counting the number
                // of characters whose high-order bit isn't set, which indicates a 
                // simple ASCII result. Results in too many false positives
                //let highOrderBitSet = 
                //    decrypted.ToCharArray() 
                //    |> Array.filter (fun c -> c |> byte &&& 255uy = 255uy)
                //    |> Array.length
                //if highOrderBitSet = 0 then printfn "Candidate key by no high-order bits set: %A" key.[0..5]

                // an attempt based on erronous assumption of AES symmetry such that: 
                //   decrypt(encrypt(plain text)) = plain text (correct)
                //   encrypt(decrypt(cipher text)) = cipher text (wrong)
                //let reEncrypted = encryptStringToBytes decrypted key iv
                //if reEncrypted = encrypted then printfn "Candidate key by re-encrypt: %A" key.[0..5]
            | :? CryptographicException -> ()
        Console.WriteLine("Hit any key to exit")
        Console.ReadKey() |> ignore

The console application must be run from the command-line, passing in the part of the search space to cover. The arguments passed in are the lower and upper bounds of the first byte, respectively. This way, we can easily parallelize the search by running multiple console applications. Say we have a computer with four cores, then we can issue the following four commands to (almost) evenly distribute the search:

% TrustpilotChallenge.exe 0 3
% TrustpilotChallenge.exe 4 7
% TrustpilotChallenge.exe 8 11
% TrustpilotChallenge.exe 12 16

On my laptop with four cores running at 2.4 GHz, it takes about 30 minutes to traverse the entire search space. Luckily, the current approach yields only one candidate key.

Have a comment or question? Please drop me an email or tweet to @ronnieholm.