My recent posting of ASCII art was intentionally subtle, perhaps overly so. If you haven’t figured it out by now, it is a C program that announces the birth of our second child. When compiled and executed, it prints:
Ian Yit-Sing Copeland
Born 5 August, 2013 at 02:49
Weight 6 lbs 11 oz
In this post, I will explain how I created it.
Like many a C practitioner, I’ve always found International Obfuscated C Code Contest entries to be at once entertaining and mystifying. While I don’t consider this effort to be near the level of IOCCC entries, I thought it might be fun to try my own IOCCC-lite ASCII art program.
I knew I wanted the program to simply print an announcement string, in a non-obvious fashion. It also had to be easy to modify the program with the actual details on the day of the delivery. Indeed, I wrote this program several weeks in advance and modified the output in only a few minutes.
Thanks to the Can You Crack It challenge, I already had an RC4-like algorithm sitting on my disk. The plan then was simple: embed the key and ciphertext in the program, and just decrypt and print it. Of course, the ciphertext would be all binary, which is hard to encode in a compact manner in the source. Thus, I decided to store the ciphertext in base32 encoding.
I actually didn’t put much effort into obfuscating the code; the main issue was getting the size of the source code into around 600 characters, and doing that involved using typedefs and one-letter variable names already. By the time that was done, I didn’t change much. Apart from variable naming, the main obfuscation step was to change the base32 mapping table to consist only of numbers and symbols so that the embedded string would itself look like code. The code otherwise is pretty straight-forward.
My starting point, which base32-decoded and decrypted some placeholder text, looked like this:
#include <string.h> #include <stdio.h> char str[] = "43(:7&!#3&@>$%^|]:&6;<7*-$}9{;!*!$5{<;-=^8]#:5<@#%#&!5!@40207#9($3&)7<$1"; int b32dec(char *in, char *out, int len) { int i; char alpha[] = "3:5[9&2^]{7}*<-8@=4(6#!>|)0+;1$%"; for (i=0; i < len; i++) { unsigned char bits = strchr(alpha, in[i])-alpha; int j = (i / 8) * 5; switch (i % 8) { case 0: out[j + 0] |= bits << 3; break; case 1: out[j + 0] |= bits >> 2; out[j + 1] |= bits << 6; break; case 2: out[j + 1] |= bits << 1; break; case 3: out[j + 1] |= bits >> 4; out[j + 2] |= bits << 4; break; case 4: out[j + 2] |= bits >> 1; out[j + 3] |= bits << 7; break; case 5: out[j + 3] |= bits << 2; break; case 6: out[j + 3] |= bits >> 3; out[j + 4] |= bits << 5; break; case 7: out[j + 4] |= bits; break; } } return i/8 * 5; } int keysched(unsigned char *x, unsigned char *key, int keylen) { int i; unsigned char tmp, a = 0; for (i=0; i < 256; i++) x[i] = i; for (i=0; i < 256; i++) { a += x[i] + key[i % keylen]; tmp = x[i]; x[i] = x[a]; x[a] = tmp; } } int crypt(unsigned char *x, unsigned char *y, int len) { unsigned char a; unsigned char b = 0; unsigned char tmp; int i; for (i=0; i < len; i++) { a = i+1; b += x[a]; tmp = x[a]; x[a] = x[b]; x[b] = tmp; y[i] ^= x[(x[a] + x[b]) & 0xff]; } } int main() { unsigned char x[256]; unsigned char key[] = "abcd"; unsigned char crypt_text[sizeof(str)] = {}; int len; len = b32dec(str, crypt_text, strlen(str)); keysched(x, key, strlen(key)); crypt(x, crypt_text, len); printf("%sn", crypt_text); }
Micro-optimizing for source code size is unusual, and in some ways backwards to optimizing for generated code. For example, this change saved a couple of characters, but in the opposite way frequently done:
- *o++ = d[0] << 3 | d[1] >> 2; + *o++ = d[0] * 8 | d[1] / 4;
Similarly, I found that combining unrelated functions was useful in eliminating the character waste of function definitions. There are likely a good deal more space-savers to be found; I quit when I got it small enough.
I experimented with a few different formats for the code. It turns out that I'm not terribly good at drawing ASCII. I abandoned a baby bottle as unrecognizable and went with a footprint after seeing this motif on some birth announcements. I hand-drew it, scanned it in, loaded as a background in an html document and then put dollar signs in the right places. Yes, cheating, but I am no artist.
$$$$ $$$$$$$$ $$$ $$$$$$$$$ $$ $$$$$ $$$$$$$$$ $$$$ $$$ $$$$$$$$$ $$$$ $$$$$$ $$ $$ $$$$ $$$$$$ $ $$$$$ $$$$$$$$$$$$$ $$$$ $$$$ $$$$$$$$$$$$$$$$$ $$$$ $$$$$$$$$$$$$$$$$$$$$ $$$$ $$$$$$$$$$$$$$$$$$$$$$$ $$ $$$$$$$$$$$$$$$$$$$$$$$$$ $$$$$$$$$$$$$$$$$$$$$$$$$$$ $$$$$$$$$$$$$$$$$$$$$$$$$$$ $$$$$$$$$$$$$$$$$$$$$$$$$$$ $$$$$$$$$$$$$$$$$$$$$$$$$ $$$$$$$$$$$$$$$$$$$$$$$$ $$$$$$$$$$$$$$$$$$$$$$ $$$$$$$$$$$$$$$$$$$$$ $$$$$$$$$$$$$$$$$$$$ $$$$$$$$$$$$$$$$$$$ $$$$$$$$$$$$$$$$$$ $$$$$$$$$$$$$$$$$ $$$$$$$$$$$$$$$$$ $$$$$$$$$$$$$$$$$ $$$$$$$$$$$$$$$$ $$$$$$$$$$$$$$$$ $$$$$$$$$$$$$$$$ $$$$$$$$$$$$$$$ $$$$$$$$$$$$$ $$$$$$$$$$$$ $$$$$$$$$ $$$$
I wrote a python script to encrypt and encode the string to be embedded in the code. A second script tokenized the original C source and replaced dollar signs from the text template with code. The latter script is not perfect, but worked well enough that I could hand-edit the output in a few seconds to get something reasonable.
The gory details are available at github.
Finally, on delivery day, I discovered that WordPress mangled C code included in posts. So I simply took a screenshot of the output and posted that instead, with a link to the original text. A picture of the newborn was hidden as a hyperlink from the footprint image.
Overall, the entire process took a couple of evenings from concept to completion, and I'm quite happy with the way it turned out. And, of course, I'm ecstatic about the inspiration: our newly arrived baby boy. We are very fortunate to have both Alex and now Ian in our lives, and silly C programs and fake man pages cannot come close to capturing our joy.