sae.py

For a while now, I’ve wanted an easy way to decrypt a mesh packet capture when I know the SAE passphrase. This would be quite handy for debugging.

I reasoned that, if I knew the shared secret, and had captured the over-the-air exchange, it should be possible to make such a tool, right? We know everything each station knows in that case, don’t we?

So I spent a bit of time last week reimplementing SAE in python to come to grips with its various steps. And I found my reasoning was flawed.

Similarly to Diffie-Hellman, SAE exchanges values that include the composition of a random value (we’ll call them r1 and r2) with some public part P, in such a way that it is hard to extract r1 even if you know P (e.g. r1*P and r2*P with ECC, although this is not exactly how SAE is specified). These values can be used by each peer to arrive at a shared secret provided they know the original random number (e.g., something like r1 * r2 * P). The crucial point is that r1 and r2 cannot be determined from the exchange alone. They exist only in memory on each peer during the exchange.

So if the secrecy doesn’t actually depend on the password, what is it there for? The answer is authentication: SAE is a password authenticated key exchange (PAKE). The password ensures that the party we are talking to is who we think it is, or at least someone who also knows the password. In the case of SAE, P is derived from the password.

As for the original goal, what we can do instead is use the CCMP keys from a wpa_supplicant log and decrypt frames with those. That is not nearly as convenient, though.

And thus my postings for 2015 are at an end; happy new year, all!

Now (mostly) encrypted

I signed up for the let’s encrypt beta and got my cert setup today. I let my old CA-purchased cert lapse when it came for re-up a couple of years ago, so happy there’s soon to be a free alternative.

Enjoy the HTTPS! (Except for some stray mixed content I need to kill — sorry for that.)

Footprint, Part 2

My recent posting of ASCII art was intentionally subtle, perhaps overly so. If you haven’t figured it out by now, it is a C program that announces the birth of our second child. When compiled and executed, it prints:

Ian Yit-Sing Copeland
Born 5 August, 2013 at 02:49
Weight 6 lbs 11 oz

In this post, I will explain how I created it.

Like many a C practitioner, I’ve always found International Obfuscated C Code Contest entries to be at once entertaining and mystifying. While I don’t consider this effort to be near the level of IOCCC entries, I thought it might be fun to try my own IOCCC-lite ASCII art program.

I knew I wanted the program to simply print an announcement string, in a non-obvious fashion. It also had to be easy to modify the program with the actual details on the day of the delivery. Indeed, I wrote this program several weeks in advance and modified the output in only a few minutes.

Thanks to the Can You Crack It challenge, I already had an RC4-like algorithm sitting on my disk. The plan then was simple: embed the key and ciphertext in the program, and just decrypt and print it. Of course, the ciphertext would be all binary, which is hard to encode in a compact manner in the source. Thus, I decided to store the ciphertext in base32 encoding.

I actually didn’t put much effort into obfuscating the code; the main issue was getting the size of the source code into around 600 characters, and doing that involved using typedefs and one-letter variable names already. By the time that was done, I didn’t change much. Apart from variable naming, the main obfuscation step was to change the base32 mapping table to consist only of numbers and symbols so that the embedded string would itself look like code. The code otherwise is pretty straight-forward.

My starting point, which base32-decoded and decrypted some placeholder text, looked like this:

#include <string.h>
#include <stdio.h>

char str[] = "43(:7&!#3&@>$%^|]:&6;<7*-$}9{;!*!$5{<;-=^8]#:5<@#%#&!5!@40207#9($3&)7<$1";

int b32dec(char *in, char *out, int len)
{
    int i;
    char alpha[] = "3:5[9&2^]{7}*<-8@=4(6#!>|)0+;1$%";
    for (i=0; i < len; i++)
    {
        unsigned char bits = strchr(alpha, in[i])-alpha;
        int j = (i / 8) * 5;
        switch (i % 8) {
            case 0:
                out[j + 0] |= bits << 3;
                break;
            case 1:
                out[j + 0] |= bits >> 2;
                out[j + 1] |= bits << 6;
                break;
            case 2:
                out[j + 1] |= bits << 1;
                break;
            case 3:
                out[j + 1] |= bits >> 4;
                out[j + 2] |= bits << 4;
                break;
            case 4:
                out[j + 2] |= bits >> 1;
                out[j + 3] |= bits << 7;
                break;
            case 5:
                out[j + 3] |= bits << 2;
                break;
            case 6:
                out[j + 3] |= bits >> 3;
                out[j + 4] |= bits << 5;
                break;
            case 7:
                out[j + 4] |= bits;
                break;
        }
    }
    return i/8 * 5;
}


int keysched(unsigned char *x, unsigned char *key, int keylen)
{
    int i;
    unsigned char tmp, a = 0;

    for (i=0; i < 256; i++)
        x[i] = i;

    for (i=0; i < 256; i++)
    {
        a += x[i] + key[i % keylen];
        tmp = x[i];
        x[i] = x[a];
        x[a] = tmp;
    }
}

int crypt(unsigned char *x, unsigned char *y, int len)
{
    unsigned char a;
    unsigned char b = 0;
    unsigned char tmp;
    int i;

    for (i=0; i < len; i++)
    {
        a = i+1;
        b += x[a];
        tmp = x[a];
        x[a] = x[b];
        x[b] = tmp;
        y[i] ^= x[(x[a] + x[b]) & 0xff];
    }
}

int main()
{
    unsigned char x[256];
    unsigned char key[] = "abcd";
    unsigned char crypt_text[sizeof(str)] = {};
    int len;

    len = b32dec(str, crypt_text, strlen(str));
    keysched(x, key, strlen(key));
    crypt(x, crypt_text, len);
    printf("%sn", crypt_text);
}

Micro-optimizing for source code size is unusual, and in some ways backwards to optimizing for generated code. For example, this change saved a couple of characters, but in the opposite way frequently done:

-            *o++ = d[0] << 3 | d[1] >> 2;
+            *o++ = d[0] * 8 | d[1] / 4;

Similarly, I found that combining unrelated functions was useful in eliminating the character waste of function definitions. There are likely a good deal more space-savers to be found; I quit when I got it small enough.

I experimented with a few different formats for the code. It turns out that I'm not terribly good at drawing ASCII. I abandoned a baby bottle as unrecognizable and went with a footprint after seeing this motif on some birth announcements. I hand-drew it, scanned it in, loaded as a background in an html document and then put dollar signs in the right places. Yes, cheating, but I am no artist.

                                          $$$$
                                        $$$$$$$$
                              $$$      $$$$$$$$$
                        $$   $$$$$    $$$$$$$$$
                       $$$$   $$$     $$$$$$$$$
                       $$$$             $$$$$$
                  $$   $$
                 $$$$            $$$$$$
             $  $$$$$       $$$$$$$$$$$$$
           $$$$  $$$$    $$$$$$$$$$$$$$$$$
           $$$$       $$$$$$$$$$$$$$$$$$$$$
           $$$$     $$$$$$$$$$$$$$$$$$$$$$$
            $$    $$$$$$$$$$$$$$$$$$$$$$$$$
                $$$$$$$$$$$$$$$$$$$$$$$$$$$
               $$$$$$$$$$$$$$$$$$$$$$$$$$$
              $$$$$$$$$$$$$$$$$$$$$$$$$$$
              $$$$$$$$$$$$$$$$$$$$$$$$$
              $$$$$$$$$$$$$$$$$$$$$$$$
              $$$$$$$$$$$$$$$$$$$$$$
              $$$$$$$$$$$$$$$$$$$$$
              $$$$$$$$$$$$$$$$$$$$
              $$$$$$$$$$$$$$$$$$$
              $$$$$$$$$$$$$$$$$$
              $$$$$$$$$$$$$$$$$
              $$$$$$$$$$$$$$$$$
              $$$$$$$$$$$$$$$$$
              $$$$$$$$$$$$$$$$
              $$$$$$$$$$$$$$$$
              $$$$$$$$$$$$$$$$
              $$$$$$$$$$$$$$$
               $$$$$$$$$$$$$
               $$$$$$$$$$$$
                 $$$$$$$$$
                    $$$$

I wrote a python script to encrypt and encode the string to be embedded in the code. A second script tokenized the original C source and replaced dollar signs from the text template with code. The latter script is not perfect, but worked well enough that I could hand-edit the output in a few seconds to get something reasonable.

The gory details are available at github.

Finally, on delivery day, I discovered that WordPress mangled C code included in posts. So I simply took a screenshot of the output and posted that instead, with a link to the original text. A picture of the newborn was hidden as a hyperlink from the footprint image.

Overall, the entire process took a couple of evenings from concept to completion, and I'm quite happy with the way it turned out. And, of course, I'm ecstatic about the inspiration: our newly arrived baby boy. We are very fortunate to have both Alex and now Ian in our lives, and silly C programs and fake man pages cannot come close to capturing our joy.

How I nearly cracked it

Here’s my methodology for part 1 of the Can You Crack It puzzle.

(Spoilers below)

eb 04 af c2 bf a3 81 ec  00 01 00 00 31 c9 88 0c
0c fe c1 75 f9 31 c0 ba  ef be ad de 02 04 0c 00
d0 c1 ca 08 8a 1c 0c 8a  3c 04 88 1c 04 88 3c 0c
fe c1 75 e8 e9 5c 00 00  00 89 e3 81 c3 04 00 00
00 5c 58 3d 41 41 41 41  75 43 58 3d 42 42 42 42
75 3b 5a 89 d1 89 e6 89  df 29 cf f3 a4 89 de 89
d1 89 df 29 cf 31 c0 31  db 31 d2 fe c0 02 1c 06
8a 14 06 8a 34 1e 88 34  06 88 14 1e 00 f2 30 f6
8a 1c 16 8a 17 30 da 88  17 47 49 75 de 31 db 89
d8 fe c0 cd 80 90 90 e8  9d ff ff ff 41 41 41 41

Anyone who has stared long and hard at x86 hexdumps before will immediately think “I know this, this is an intel system!” The value 0xdeadbeef in little-endian format is a dead-giveaway, as are the 0x90 (NOP) instructions. I know a couple of ways to go from a block of machine code to the corresponding code. One way, like scripts/decodecode in the kernel, is to make a .S file with .byte directives, assemble it, and run objdump over the object file.

Here’s another way, and what I did at first: create a .c file like so, and compile it:

$ cat foo.c

unsigned char x[] = { /* list of hexes here */ };
int main()
{
}

$ gcc -g -o foo foo.c

Then, load it up in gdb and disassemble as if it were a function:

$ gdb foo

gdb> disas x

That procedure yielded some interesting bits of hand-coded assembly, but at this point I had no idea what it did. I cleaned up the code a bit and added labels to arrive at a listing like the following:

    jmp l1
    scas %es:(%edi), %eax
    ret $0xa3bf

    l1:
    sub $0x100, %esp
    xor %ecx, %ecx

    loop1:
    mov %cl,(%esp,%ecx,1)
    inc %cl
    jne loop1

    xor %eax, %eax
    mov $0xdeadbeef, %edx

    loop2:
    add (%esp,%ecx,1),%al
    add %dl,%al
    ror $0x8, %edx
    mov (%esp,%ecx,1),%bl
    mov (%esp,%eax,1),%bh
    mov %bl,(%esp,%eax,1)
    mov %bh,(%esp,%ecx,1)
    inc %cl
    jne loop2
    jmp l2

    func1:
    mov %esp, %ebx
    add $0x4, %ebx
    pop %esp
    pop %eax
    cmp $0x41414141,%eax
    jne quit
    pop %eax
    cmp $0x42424242,%eax
    jne quit
    pop %edx
    mov %edx,%ecx
    mov %esp,%esi
    mov %ebx,%edi
    sub %ecx,%edi
    rep movsb %ds:(%esi),%es:(%edi)
    mov %ebx,%esi
    mov %edx,%ecx
    mov %ebx,%edi
    sub %ecx,%edi
    xor %eax,%eax
    xor %ebx,%ebx
    xor %edx,%edx

    loop3:
    inc %al
    add (%esi,%eax,1),%bl
    mov (%esi,%eax,1),%dl
    mov (%esi,%ebx,1),%dh
    mov %dh,(%esi,%eax,1)
    mov %dl,(%esi,%ebx,1)
    add %dh, %dl
    xor %dh,%dh
    mov (%esi,%edx,1),%bl
    mov (%edi),%dl
    xor %bl,%dl
    mov %dl,(%edi)
    inc %edi
    dec %ecx
    jne loop3

    quit:
    xor %ebx,%ebx
    mov %ebx,%eax
    inc %al
    int $0x80

    l2:
    nop
    nop
    call func1

    .byte 0x41, 0x41, 0x41, 0x41

The first thing I looked at was the int 0x80 call. This is how (well, one way) you make a system call on Linux. The %ebx register is always zero, and %eax is one. This handy website shows that such a syscall results in a call to sys_exit(). Thus I added the label quit to this bit of code.

At a high level, we then have a loop at loop1 that initializes some data on the stack; loop2, which performs some unknown calculation on that array; func1, a function which itself performs another loop. A close inspection reveals that func1 uses the output of loop2 as an input, along with data at the end of the program, beginning with 0x41414141.

Deciphering loop2 is an interesting exercise. The initializer creates a 256-byte character array with sequential values from 0 to 255. Then it uses the value 0xdeadbeef to generate an index in %eax, which is used here:

    mov (%esp,%ecx,1),%bl
    mov (%esp,%eax,1),%bh
    mov %bl,(%esp,%eax,1)
    mov %bh,(%esp,%ecx,1)

This is a swap operation, so after 256 iterations, the array will have been permuted. This looked to me like some kind of random shuffle, with 0xdeadbeef as a seed, but I was unfamiliar with its actual purpose. I wrote a C version just to make it clearer:

int firstpart(unsigned char *x, size_t len)
{
    int i;
    unsigned char tmp, a = 0;
    uint32_t d = 0xdeadbeef;

    for (i=0; i < len; i++)
        x[i] = i;

    for (i=0; i < len; i++)
    {
        a += x[i] + d;
        d = (d >> 8) | (d << 24);
        tmp = x[i];
        x[i] = x[a];
        x[a] = tmp;
    }
}

Likewise, the func1 was doing some kind of shuffle, and xor-ing the result with another block of data. This screams crypto to me, but I still didn't know exactly what func1 was doing. I wrote a C version:

int second_part(unsigned char *x, unsigned char *y, size_t len)
{
    unsigned char a = 0;
    unsigned char b = 0;
    unsigned char tmp;
    int i;

    for (i=0; i < len; i++)
    {
        a = i+1;
        b += x[a];
        tmp = x[a];
        x[a] = x[b];
        x[b] = tmp;
        b = x[(x[a] + x[b]) & 0xff];
        y[i] ^= b;
    }
}

On the crypto hunch, I started reading random wikipedia pages, until I stumbled across the pseudocode on the RC4 page. Aha! This is RC4 with a key of 0xdeadbeef. I never guessed RC4 was so simple.

At this point, I had this whole block of code figured out, could run it fully in my C variant, but knew I needed ciphertext to go at the end of the program and didn't know where to find it. Asking the internet gave me the hint to look inside the image file with the code dump, and the rest was easy to figure out.

Solving the puzzle yields a link to a javascript page where you are to write a virtual machine and run it to reveal the next stage. I implemented the machine in Python but it still needs a bit of debugging to give up its secret.

I can nearly crack it

The GCHQ (that’s British for NSA) has been running a marketing gimmick to find new people to read your tweets. On this website, you will find an enigmatic hexdump, and a prompt for a keyword. Supposedly if you get it correct, then you get forwarded to their job site. I don’t care about the job aspect, but I do enjoy a good puzzle.

I’ll add my details and methodology in a follow-up post after the clock runs out, so as not to spoil anything for casual readers who want to take their own look. There are a few good clues in the hexdump if you have previously spent any time looking at real ones.

I did (almost) crack it: I figured out all there was to know about the hexdump. It became clear at that point that the hexes on the website aren’t everything you need for the puzzle, so I cheated to find the missing piece (somewhat obvious, in retrospect). A neat exercise, and I learned something about cryptography.

Now I’m on the second stage, which is a bit more straightforward in that the problem statement tells you what type of answer is expected.

I’m curious to know if this type of recruitment tool is actually useful in finding qualified applicants, or just in generating buzz. Certainly several companies I’ve worked at have had their share of recruiting woes, and dreaming up a set of screening puzzles has to be more fun than dealing with headhunters.

New GPG key

My previous key isn’t yet two years old, but given the recent kernel.org reboot, I created a new keypair today. The fingerprint is:

11FB F4D3 92A6 F32B CB4E  2E07 8411 1007 23B2 B915

The old key will be valid for a little while longer. As for korg, perhaps it’s time I used github anyway. My OMFS tree is over there now, lonely and waiting for me to feed him some patches.