x86 – Insignificant Bits

Here’s my methodology for part 1 of the Can You Crack It puzzle.

(Spoilers below)

eb 04 af c2 bf a3 81 ec  00 01 00 00 31 c9 88 0c
0c fe c1 75 f9 31 c0 ba  ef be ad de 02 04 0c 00
d0 c1 ca 08 8a 1c 0c 8a  3c 04 88 1c 04 88 3c 0c
fe c1 75 e8 e9 5c 00 00  00 89 e3 81 c3 04 00 00
00 5c 58 3d 41 41 41 41  75 43 58 3d 42 42 42 42
75 3b 5a 89 d1 89 e6 89  df 29 cf f3 a4 89 de 89
d1 89 df 29 cf 31 c0 31  db 31 d2 fe c0 02 1c 06
8a 14 06 8a 34 1e 88 34  06 88 14 1e 00 f2 30 f6
8a 1c 16 8a 17 30 da 88  17 47 49 75 de 31 db 89
d8 fe c0 cd 80 90 90 e8  9d ff ff ff 41 41 41 41

Anyone who has stared long and hard at x86 hexdumps before will immediately think “I know this, this is an intel system!” The value 0xdeadbeef in little-endian format is a dead-giveaway, as are the 0x90 (NOP) instructions. I know a couple of ways to go from a block of machine code to the corresponding code. One way, like scripts/decodecode in the kernel, is to make a .S file with .byte directives, assemble it, and run objdump over the object file.

Here’s another way, and what I did at first: create a .c file like so, and compile it:

$ cat foo.c

unsigned char x[] = { /* list of hexes here */ };
int main()
{
}

$ gcc -g -o foo foo.c

Then, load it up in gdb and disassemble as if it were a function:

$ gdb foo

gdb> disas x

That procedure yielded some interesting bits of hand-coded assembly, but at this point I had no idea what it did. I cleaned up the code a bit and added labels to arrive at a listing like the following:

    jmp l1
    scas %es:(%edi), %eax
    ret $0xa3bf

    l1:
    sub $0x100, %esp
    xor %ecx, %ecx

    loop1:
    mov %cl,(%esp,%ecx,1)
    inc %cl
    jne loop1

    xor %eax, %eax
    mov $0xdeadbeef, %edx

    loop2:
    add (%esp,%ecx,1),%al
    add %dl,%al
    ror $0x8, %edx
    mov (%esp,%ecx,1),%bl
    mov (%esp,%eax,1),%bh
    mov %bl,(%esp,%eax,1)
    mov %bh,(%esp,%ecx,1)
    inc %cl
    jne loop2
    jmp l2

    func1:
    mov %esp, %ebx
    add $0x4, %ebx
    pop %esp
    pop %eax
    cmp $0x41414141,%eax
    jne quit
    pop %eax
    cmp $0x42424242,%eax
    jne quit
    pop %edx
    mov %edx,%ecx
    mov %esp,%esi
    mov %ebx,%edi
    sub %ecx,%edi
    rep movsb %ds:(%esi),%es:(%edi)
    mov %ebx,%esi
    mov %edx,%ecx
    mov %ebx,%edi
    sub %ecx,%edi
    xor %eax,%eax
    xor %ebx,%ebx
    xor %edx,%edx

    loop3:
    inc %al
    add (%esi,%eax,1),%bl
    mov (%esi,%eax,1),%dl
    mov (%esi,%ebx,1),%dh
    mov %dh,(%esi,%eax,1)
    mov %dl,(%esi,%ebx,1)
    add %dh, %dl
    xor %dh,%dh
    mov (%esi,%edx,1),%bl
    mov (%edi),%dl
    xor %bl,%dl
    mov %dl,(%edi)
    inc %edi
    dec %ecx
    jne loop3

    quit:
    xor %ebx,%ebx
    mov %ebx,%eax
    inc %al
    int $0x80

    l2:
    nop
    nop
    call func1

    .byte 0x41, 0x41, 0x41, 0x41

The first thing I looked at was the int 0x80 call. This is how (well, one way) you make a system call on Linux. The %ebx register is always zero, and %eax is one. This handy website shows that such a syscall results in a call to sys_exit(). Thus I added the label quit to this bit of code.

At a high level, we then have a loop at loop1 that initializes some data on the stack; loop2, which performs some unknown calculation on that array; func1, a function which itself performs another loop. A close inspection reveals that func1 uses the output of loop2 as an input, along with data at the end of the program, beginning with 0x41414141.

Deciphering loop2 is an interesting exercise. The initializer creates a 256-byte character array with sequential values from 0 to 255. Then it uses the value 0xdeadbeef to generate an index in %eax, which is used here:

    mov (%esp,%ecx,1),%bl
    mov (%esp,%eax,1),%bh
    mov %bl,(%esp,%eax,1)
    mov %bh,(%esp,%ecx,1)

This is a swap operation, so after 256 iterations, the array will have been permuted. This looked to me like some kind of random shuffle, with 0xdeadbeef as a seed, but I was unfamiliar with its actual purpose. I wrote a C version just to make it clearer:

int firstpart(unsigned char *x, size_t len)
{
    int i;
    unsigned char tmp, a = 0;
    uint32_t d = 0xdeadbeef;

    for (i=0; i < len; i++)
        x[i] = i;

    for (i=0; i < len; i++)
    {
        a += x[i] + d;
        d = (d >> 8) | (d << 24);
        tmp = x[i];
        x[i] = x[a];
        x[a] = tmp;
    }
}

Likewise, the func1 was doing some kind of shuffle, and xor-ing the result with another block of data. This screams crypto to me, but I still didn't know exactly what func1 was doing. I wrote a C version:

int second_part(unsigned char *x, unsigned char *y, size_t len)
{
    unsigned char a = 0;
    unsigned char b = 0;
    unsigned char tmp;
    int i;

    for (i=0; i < len; i++)
    {
        a = i+1;
        b += x[a];
        tmp = x[a];
        x[a] = x[b];
        x[b] = tmp;
        b = x[(x[a] + x[b]) & 0xff];
        y[i] ^= b;
    }
}

On the crypto hunch, I started reading random wikipedia pages, until I stumbled across the pseudocode on the RC4 page. Aha! This is RC4 with a key of 0xdeadbeef. I never guessed RC4 was so simple.

At this point, I had this whole block of code figured out, could run it fully in my C variant, but knew I needed ciphertext to go at the end of the program and didn't know where to find it. Asking the internet gave me the hint to look inside the image file with the code dump, and the rest was easy to figure out.

Solving the puzzle yields a link to a javascript page where you are to write a virtual machine and run it to reveal the next stage. I implemented the machine in Python but it still needs a bit of debugging to give up its secret.

Tag: x86

How I nearly cracked it

I can nearly crack it