On board the rebus

My most recent post, for those too lazy to follow the link and hit the “Reveal All” button, is a crossword puzzle announcing the birth of our third son, Samuel Yit-Mun Copeland. We’re over the moon with this new guy in our family. Here he is nearly asleep in my arms as I type this.

But now for something more boring.

I decided months ago that my typical geeky announcement would take the form of a crossword since I’ve been playing in that area recently (previous announcements: a manpage for Alex, an ascii-art crypto program for Ian). And why not make this puzzle interesting by constructing my first rebus puzzle?

A rebus puzzle is one in which a single square can have multiple letters. Part of the fun of solving is realizing at some point that an answer must be right, yet it cannot fit the available cells, so something really weird is going on. Then one curses the constructor and starts with a new set of assumptions.

Constructing such a puzzle is not too much harder than any ordinary puzzle, since all the rebus squares will be part of the themers that the constructor selects up front. Because of this, the fill software doesn’t even really need to care about rebuses — one can use any placeholder for the rebus squares and then edit the final crossword file to add in the full string.

Having lots of theme squares does constrain the grid somewhat, as does having a number of circles on top of that (in this case the circles spell out our new son’s name). But I cheated a little bit: I decided having any old separation between circles was fine as long as the components were all on the same line, and his name is short so that was no major obstacle. In the end I’m pretty happy with the fill considering the 55 theme squares + 20 circles.

Each time I make a puzzle, I also make a number of improvements to my filling software. I’ve long since forgotten whether the goal is building puzzles or building a puzzle builder.

I mentioned before that I had optimized the javascript filler to the point that it was potentially usable in a webapp; this webapp is now a thing. For this puzzle, I used that app as well as my interactive python command-line filler, which has some additional features like estimating the number of eventual solutions using more levels of look-ahead. Finally, I also used a version of the python filler that will take a completed grid and try to optimize it by repeatedly clearing sections of the grid and refilling it randomly.

There is still a good deal of performance work to do on the filler itself. The basic structure is still similar to the backtracking algorithm in QXW: bitmaps are used to filter available words for each cell, and we fill each entry starting with the entries with fewest fills. I made two minor changes in my version: the word list is initially sorted by score and indexed on length so that we can try highest scoring words first and eliminate comparisons against words that will never fit; and entire entries are filled instead of single cells so that score of an entire word is maximized.

I implemented my current algorithm from scratch in C, and it clocks in under half a second for my test puzzle: a little faster than QXW’s implementation and some 30x faster than the very same algorithm in javascript. In terms of LOC, they are roughly the same: 573 (python), 696 (javascript), 763 (C). You can watch them compete below.

So the javascript implementation is frightfully slow. Getting it down to the level of, say, two times the C code, would be a major win, but beyond my current level of js optimization know-how.

Besides that, there is likely some algorithmic optimization that could bring larger gains. Knowing that the “crossword fill problem” is an instance of SAT, I went off to read what Knuth had to say about it (Vol 4 fascicles 5-6), and, as usual, came away with a lot more to think about. The “word square” section is very similar to crossword filling, and indeed the algorithm is close to my existing search, except that it uses trie data structures to cut down on the number of possible words visited at each level. It’s unclear to me at present whether tries would be a win here given that we don’t always start filling with a prefix or suffix, but it could be worth an (ahem) try. Also I may look at branch-and-bound search pruning and parallel search at some point.

Footprint, Part 2

My recent posting of ASCII art was intentionally subtle, perhaps overly so. If you haven’t figured it out by now, it is a C program that announces the birth of our second child. When compiled and executed, it prints:

Ian Yit-Sing Copeland
Born 5 August, 2013 at 02:49
Weight 6 lbs 11 oz

In this post, I will explain how I created it.

Like many a C practitioner, I’ve always found International Obfuscated C Code Contest entries to be at once entertaining and mystifying. While I don’t consider this effort to be near the level of IOCCC entries, I thought it might be fun to try my own IOCCC-lite ASCII art program.

I knew I wanted the program to simply print an announcement string, in a non-obvious fashion. It also had to be easy to modify the program with the actual details on the day of the delivery. Indeed, I wrote this program several weeks in advance and modified the output in only a few minutes.

Thanks to the Can You Crack It challenge, I already had an RC4-like algorithm sitting on my disk. The plan then was simple: embed the key and ciphertext in the program, and just decrypt and print it. Of course, the ciphertext would be all binary, which is hard to encode in a compact manner in the source. Thus, I decided to store the ciphertext in base32 encoding.

I actually didn’t put much effort into obfuscating the code; the main issue was getting the size of the source code into around 600 characters, and doing that involved using typedefs and one-letter variable names already. By the time that was done, I didn’t change much. Apart from variable naming, the main obfuscation step was to change the base32 mapping table to consist only of numbers and symbols so that the embedded string would itself look like code. The code otherwise is pretty straight-forward.

My starting point, which base32-decoded and decrypted some placeholder text, looked like this:

#include <string.h>
#include <stdio.h>

char str[] = "43(:7&!#3&@>$%^|]:&6;<7*-$}9{;!*!$5{<;-=^8]#:5<@#%#&!5!@40207#9($3&)7<$1";

int b32dec(char *in, char *out, int len)
    int i;
    char alpha[] = "3:5[9&2^]{7}*<-8@=4(6#!>|)0+;1$%";
    for (i=0; i < len; i++)
        unsigned char bits = strchr(alpha, in[i])-alpha;
        int j = (i / 8) * 5;
        switch (i % 8) {
            case 0:
                out[j + 0] |= bits << 3;
            case 1:
                out[j + 0] |= bits >> 2;
                out[j + 1] |= bits << 6;
            case 2:
                out[j + 1] |= bits << 1;
            case 3:
                out[j + 1] |= bits >> 4;
                out[j + 2] |= bits << 4;
            case 4:
                out[j + 2] |= bits >> 1;
                out[j + 3] |= bits << 7;
            case 5:
                out[j + 3] |= bits << 2;
            case 6:
                out[j + 3] |= bits >> 3;
                out[j + 4] |= bits << 5;
            case 7:
                out[j + 4] |= bits;
    return i/8 * 5;

int keysched(unsigned char *x, unsigned char *key, int keylen)
    int i;
    unsigned char tmp, a = 0;

    for (i=0; i < 256; i++)
        x[i] = i;

    for (i=0; i < 256; i++)
        a += x[i] + key[i % keylen];
        tmp = x[i];
        x[i] = x[a];
        x[a] = tmp;

int crypt(unsigned char *x, unsigned char *y, int len)
    unsigned char a;
    unsigned char b = 0;
    unsigned char tmp;
    int i;

    for (i=0; i < len; i++)
        a = i+1;
        b += x[a];
        tmp = x[a];
        x[a] = x[b];
        x[b] = tmp;
        y[i] ^= x[(x[a] + x[b]) & 0xff];

int main()
    unsigned char x[256];
    unsigned char key[] = "abcd";
    unsigned char crypt_text[sizeof(str)] = {};
    int len;

    len = b32dec(str, crypt_text, strlen(str));
    keysched(x, key, strlen(key));
    crypt(x, crypt_text, len);
    printf("%sn", crypt_text);

Micro-optimizing for source code size is unusual, and in some ways backwards to optimizing for generated code. For example, this change saved a couple of characters, but in the opposite way frequently done:

-            *o++ = d[0] << 3 | d[1] >> 2;
+            *o++ = d[0] * 8 | d[1] / 4;

Similarly, I found that combining unrelated functions was useful in eliminating the character waste of function definitions. There are likely a good deal more space-savers to be found; I quit when I got it small enough.

I experimented with a few different formats for the code. It turns out that I'm not terribly good at drawing ASCII. I abandoned a baby bottle as unrecognizable and went with a footprint after seeing this motif on some birth announcements. I hand-drew it, scanned it in, loaded as a background in an html document and then put dollar signs in the right places. Yes, cheating, but I am no artist.

                              $$$      $$$$$$$$$
                        $$   $$$$$    $$$$$$$$$
                       $$$$   $$$     $$$$$$$$$
                       $$$$             $$$$$$
                  $$   $$
                 $$$$            $$$$$$
             $  $$$$$       $$$$$$$$$$$$$
           $$$$  $$$$    $$$$$$$$$$$$$$$$$
           $$$$       $$$$$$$$$$$$$$$$$$$$$
           $$$$     $$$$$$$$$$$$$$$$$$$$$$$
            $$    $$$$$$$$$$$$$$$$$$$$$$$$$

I wrote a python script to encrypt and encode the string to be embedded in the code. A second script tokenized the original C source and replaced dollar signs from the text template with code. The latter script is not perfect, but worked well enough that I could hand-edit the output in a few seconds to get something reasonable.

The gory details are available at github.

Finally, on delivery day, I discovered that WordPress mangled C code included in posts. So I simply took a screenshot of the output and posted that instead, with a link to the original text. A picture of the newborn was hidden as a hyperlink from the footprint image.

Overall, the entire process took a couple of evenings from concept to completion, and I'm quite happy with the way it turned out. And, of course, I'm ecstatic about the inspiration: our newly arrived baby boy. We are very fortunate to have both Alex and now Ian in our lives, and silly C programs and fake man pages cannot come close to capturing our joy.

New syscall proposal

CLONE3(2)                   System Programmer's Manual                 CLONE3(2)

       clone3 - create a child process

       #include <clone.h>

       int clone3(pid_t parent, pid_t co_parent,
                 int (*fn)(void *), void *child_stack,
                 int flags, void *arg, ... );

       clone3()  creates  a new process, in a manner similar to fork(2).

       Unlike fork(2), these calls allow the child process to share  parts  of
       its  execution  context  with  the  parent and co-parent processes,
       such as their home, food, and approximately half of their genetic

       The main use of clone3() is to implement offspring: sentient beings
       that run concurrently with the parents in a shared environment.

       When  the  child  process  is  created  with  clone3(),  it executes the
       function  application  fn(arg).   (This  differs  from  fork(2),  where
       execution  continues  in the child from the point of the fork(2) call.)
       The fn argument is a pointer to a function that is called by the  child
       process  at the beginning of its execution.  The arg argument is passed
       to the fn function.  Although the parent processes may supply a suitable
       fn function, the child process has a mind of its own and may deviate
       from its execution at will.

       Executing clone3 will randomly select half of the genes from each
       parent, resulting in a selection of phenotypes according to
       Mendelian inheritance.  By design this process increases the
       number of bits in the global entropy pool, making recovery of the
       random state impractical.

       The  child_stack  argument  specifies the location of the pile of
       diapers used by the child process.  After the child process leaves
       the EMBRYO state and until it acquires the CAP_POTTY capability,
       it may frequently dump core.  The calling process must therefore
       set up a large store of waste disposal units and pass a pointer to
       this store to clone3().  After a successful core dump it is necessary
       for a parent to handle the condition as soon as possible.

       The flags parameter may  be  bitwise-or'ed  with zero or more of the
       following constants, in order to specify  the behavior of the system

              Setting CLONE_XY indicates to the system that a child of the
              male gender is requested.  This is only a hint to the system
              and may not be honored.

              Setting CLONE_XX indicates to the system that a child of the
              female gender is requested.  This is only a hint to the system
              and may not be honored.

              If both CLONE_XY and CLONE_XX are set, results are undefined.

       On  completion,  the  thread  ID  of  the child process is returned in
       both parents' thread of execution.  During execution, the child process
       may signal an error condition by raising any of the following signals.

       HUNGRY  The stomach monitor process has detected an empty condition.
               The parent may correct the condition by supplying food.

       TIRED   The process has executed too many cycles and must enter

       GRUMPY  The process is raising signals at full volume for unknown
               reasons.  The parent process should attach pacifier(2) to
               the child process and attempt to cause it to enter sleep(3).

       HAPPY   The process is in a suitable state for taking pictures for

       Biology 101.

       Child process may spit-up on your new shirt; a burping cloth is

       The following psuedo-code illustrates the intended usage:

       #include <clone.h>

       int main(int argc, char *argv[])
            pid_t ange = /* ... */
            pid_t bob =  /* ... */
            pid_t alex;

            alex = clone3(ange, bob, fn, stack, CLONE_XY, NULL);
            sleep(86400 * 30 * 9);


       This yields the following:

       Alexander Yit-Keung Copeland
       Born 26 June, 2010
       Weight 7 lbs 12 oz

       fork(2),    futex(2),    getpid(2),    gettid(2),   set_thread_area(2),
       set_tid_address(2),  tkill(2),  unshare(2),  wait(2),  capabilities(7),

CLONE3(2)                         2010-06-26                          CLONE3(2)