I can nearly crack it

The GCHQ (that’s British for NSA) has been running a marketing gimmick to find new people to read your tweets. On this website, you will find an enigmatic hexdump, and a prompt for a keyword. Supposedly if you get it correct, then you get forwarded to their job site. I don’t care about the job aspect, but I do enjoy a good puzzle.

I’ll add my details and methodology in a follow-up post after the clock runs out, so as not to spoil anything for casual readers who want to take their own look. There are a few good clues in the hexdump if you have previously spent any time looking at real ones.

I did (almost) crack it: I figured out all there was to know about the hexdump. It became clear at that point that the hexes on the website aren’t everything you need for the puzzle, so I cheated to find the missing piece (somewhat obvious, in retrospect). A neat exercise, and I learned something about cryptography.

Now I’m on the second stage, which is a bit more straightforward in that the problem statement tells you what type of answer is expected.

I’m curious to know if this type of recruitment tool is actually useful in finding qualified applicants, or just in generating buzz. Certainly several companies I’ve worked at have had their share of recruiting woes, and dreaming up a set of screening puzzles has to be more fun than dealing with headhunters.

My Pal SCSI

My son Alex turned one a couple of weeks ago. (If you are reading this, Alex, happy birthday, and congratulations for learning to read at such an early age!) He picked up quite a stash of loot as a result.

Moore’s law means that the processing power of his toy box very likely exceeds that of my first computer by a large factor. As one example, he received a My Pal Scout talking stuffed animal as a gift. A toy that comes with a USB cable — this is progress! You can customize the toy to say and spell your child’s name, and pick different tunes for it to play.

My inner geek has been wondering what’s inside ever since, but of course I cannot take apart or otherwise ruin my kid’s toy in the name of science. I did, however, plug it in to my Linux box while he was napping. A quick dmesg showed the device implements USB storage, but always responds with ‘Medium Not Present’ when accessed. I guessed (incorrectly) that some extra magic might make the internal flash appear as a disk and then files are just copied to a FAT filesystem stored therein. The toy is relatively inexpensive and coming up with too much special sauce is likely to be prohibitively costly.

USB storage is a successful example of taking an existing protocol (SCSI command set) and wholesale wrapping it in a different wire protocol. Each USB storage transfer is initiated by the host sending a Command Block Wrapper (CBW) — a 31-byte USB packet starting with ‘USBC’, typically containing a SCSI command as a payload. Next, a block of data is transferred if this command represents a read or write. Finally, the device completes the transaction by sending a Command Status Wrapper (CSW), a 13-byte packet beginning with the string ‘USBS’.

One can get a feel for the flavor of the protocol by using usbmon. Much like an ethernet sniffer, usbmon provides a simple mechanism under Linux to capture USB traffic. A simple session might look like:

    # cat /sys/kernel/debug/usb/usbmon/4u > usbmon.txt

One might even potentially run usbmon on a host OS while some other OS is running as a guest in a virtual machine with USB pass-through.

The upshot of the layered approach to USB storage is that Linux creates a generic SCSI device (/dev/sgX) for any USB storage device. Using the generic device, one can directly send SCSI commands to the USB device, and the kernel will take care of wrapping it in USB commands. I believe something similar is possible in Windows land.

As it turns out, Scout is even simpler than I imagined. The internal flash has no controller or filesystem; instead it appears to be a raw NAND flash written a page at a time. It is a simple matter to read the flash using the Linux sg device. One merely opens the device file, and then issues a vendor-specific SCSI command on the file descriptor:

static int read_page(int fd, u32 addr)
{
    u8 cmdblk[] = {
        0xfd,               /* access flash */
        0x28,               /* read it (0x20 = write) */
        0, 0, 0, 0,
        0x06, 0, 0x08, 0,   /* no idea what the rest is */
        0, 0, 0, 0,
        0x47, 0x50
    };
    u8 response_buf[4096];
    u8 sense_buffer[32];

    cmdblk[2] = (addr >> 24) & 0xff;
    cmdblk[3] = (addr >> 16) & 0xff;
    cmdblk[4] = (addr >> 8) & 0xff;
    cmdblk[5] = addr & 0xff;

    sg_io_hdr_t io_hdr = {
        .interface_id = 'S',
        .cmd_len = sizeof(cmdblk),
        .mx_sb_len = sizeof(sense_buffer),
        .dxfer_direction = SG_DXFER_FROM_DEV,
        .dxfer_len = sizeof(response_buf),
        .dxferp = response_buf,
        .cmdp = cmdblk,
        .sbp = sense_buffer,
        .timeout = 20000
    };

    ioctl(fd, SG_IO, &io_hdr);
    /* do something with response_buf here */
    return 0;
}

Reading addresses 0x01000 through 0x10000, 4k at a time, seems to yield the customizable data on the device. The flash is tiny: this is just 64k, yet you can upload a digital audio file of your child’s name plus ten songs.

The data format is rather simple: there is a 30-byte header starting at address 0x1000, containing 16-bit, little-endian pointers for the customized files. Address 0x1008 holds a pointer to the spelling of your child’s name, address 0x100e holds a pointer to the audio file pronouncing your child’s name, and so on. Armed with the Leapfrog software and a packet sniffer, one can verify that these files do indeed match the individual binary files that the software downloads over HTTP when syncing the puppy.

I believe the digital audio files are some flavor of raw 8 kHz PCM, but I could not find the right combination of parameters to sox to make sense out of them. The song files are all apparently compressed with TTComp, some compression program from 1995. Running ttdecomp.exe from within dosemu did successfully decompress them. My guess is these files are some sort of sequencer format rather than sampled audio, given their tiny size.

This is, I think, as far as I wish to investigate the toy. Obvious exercises for the interested reader are to discern the individual file formats, and have the toy play Metallica. It’s pretty incredible how much technology can be cheaply packed into a child’s plaything today. But now I have my eye on dissecting that (non-electronic) toy inchworm — is there a spring in there or what?

Fast… for a modem

This question is for someone slightly more motivated: why is qemu-img so slow to convert from a raw disk image to a VMware (vmdk) image? Apparently it does some compression or null-block elimination, as the resulting file is about 30% smaller, but I was not expecting it to take an entire day to convert a 100 gig disk. To get the answer to a first approximation without actually reading the source, I did a quick run with blktrace and seekwatcher:

So we get a nice sequential read pattern, but excessive think time between the reads. Probably, reading bigger blocks would help, or something like posix_fadvise(POSIX_FADV_SEQUENTIAL). Iostat showed the disk was reading exactly 2k sectors (1MB) per second, when it should be able to do 50-100 times that.

I just decided to wait a day rather than learn and hack the source.

Wirth’s Law avenged

Occasionally, rewriting other people’s code can be its own reward:

88 times speedup is not too bad for a few days’ work.

Even C has tsearch

I’m just going on record to state that the two reasons I’ve seen on the net for lack of tree-based collections in the Python standard library — (1) that they have bad locality (gee, very unlike huge memory-hogging randomized hash tables, oh and just try scanning one of those…), and (2) that Python makes it so simple to write your own, just do that! (well, yes I can and have, but hey it’s 2011 already, rbtrees are like 40 years old!) — are inane.

On static in Java

In C, scoping rules aside, the following are pretty much the same:

static int x[] = { /* ... */ };
int foo()
{
    /* do something with x */
}

and

int bar()
{
    static int x[] = { /* ... */ };
    /* do something with x */
}

The static array x will go into the .data section of the executable. When the OS loads the executable into memory, it will ensure that the values for x are set in the appropriate place (e.g. by mapping its disk page to the virtual address in the ELF headers). In the second case, x is given some alias to avoid symbol conflicts.

There seems to be no way to do the second version in Java. Here’s how I tried:

static int xyzzy()
{
    final int x[] = { /* ... */ };
    /* do something with x */
}

This doesn’t do what you might think it does. While x cannot be changed, it is the reference to an array; the array elements themselves can be changed at will. Thus javac is unlikely to do anything smart here.

Indeed, in the above case, javac will initialize x with stack pushes every time xyzzy() is called. I had a real method like this, and making x a static member variable gave me an easy 2x speedup.

C++ error fail

Note to self: double-check the template arguments before asking the compiler to do so.

(WordPress hates to wrap, so I did it, badly.)

mpicxx -Wno-deprecated   -c -o pagerank.o pagerank.cc
In file included from /usr/include/boost/graph/distributed/adjacency_list
.hpp:3959,
                 from pagerank.cc:3:
/usr/include/boost/graph/distributed/adjlist/initialize.hpp: In member fu
nction ‘void boost::adjacency_list<OutEdgeListS, boost::distributedS<Pr
ocessGroup, InVertexListS, InDistribution>, DirectedS, VertexProperty, Ed
geProperty, GraphProperty, EdgeListS>::initialize(EdgeIterator, EdgeItera
tor, EdgePropertyIterator, typename boost::detail::parallel::adjacency_li
st_config<OutEdgeListS, ProcessGroup, InVertexListS, InDistribution, Dire
ctedS, VertexProperty, EdgeProperty, GraphProperty, EdgeListS>::inherited
::vertices_size_type, const typename boost::graph::distributed::select_di
stribution<InDistribution, VertexProperty, typename boost::detail::parall
el::adjacency_list_config<OutEdgeListS, ProcessGroup, InVertexListS, InDi
stribution, DirectedS, VertexProperty, EdgeProperty, GraphProperty, EdgeL
istS>::inherited::vertices_size_type, ProcessGroup, typename boost::graph
::internal_vertex_name<VertexProperty>::type>::type&, boost::vecS) [with
EdgeIterator = boost::graph::metis_reader::edge_iterator, EdgePropertyIte
rator = boost::graph::metis_reader::edge_weight_iterator, OutEdgeListS =
boost::vecS, ProcessGroup = boost::graph::distributed::mpi_process_group,
 InVertexListS = boost::vecS, InDistribution = boost::defaultS, DirectedS
 = boost::directedS, VertexProperty = boost::no_property, EdgeProperty =
boost::no_property, GraphProperty = boost::no_property, EdgeListS = boost
::listS]’:
/usr/include/boost/graph/distributed/adjacency_list.hpp:1644:   instantia
ted from ‘boost::adjacency_list<OutEdgeListS, boost::distributedS<Proce
ssGroup, InVertexListS, InDistribution>, DirectedS, VertexProperty, EdgeP
roperty, GraphProperty, EdgeListS>::adjacency_list(EdgeIterator, EdgeIter
ator, EdgePropertyIterator, typename boost::detail::parallel::adjacency_l
ist_config<OutEdgeListS, ProcessGroup, InVertexListS, InDistribution, Dir
ectedS, VertexProperty, EdgeProperty, GraphProperty, EdgeListS>::inherite
d::vertices_size_type, const ProcessGroup&, const typename boost::graph::
distributed::select_distribution<InDistribution, VertexProperty, typename
 boost::detail::parallel::adjacency_list_config<OutEdgeListS, ProcessGrou
p, InVertexListS, InDistribution, DirectedS, VertexProperty, EdgeProperty
, GraphProperty, EdgeListS>::inherited::vertices_size_type, ProcessGroup,
 typename boost::graph::internal_vertex_name<VertexProperty>::type>::type
&, const GraphProperty&) [with EdgeIterator = boost::graph::metis_reader:
:edge_iterator, EdgePropertyIterator = boost::graph::metis_reader::edge_w
eight_iterator, OutEdgeListS = boost::vecS, ProcessGroup = boost::graph::
distributed::mpi_process_group, InVertexListS = boost::vecS, InDistributi
on = boost::defaultS, DirectedS = boost::directedS, VertexProperty = boos
t::no_property, EdgeProperty = boost::no_property, GraphProperty = boost:
:no_property, EdgeListS = boost::listS]’
pagerank.cc:37:   instantiated from here
/usr/include/boost/graph/distributed/adjlist/initialize.hpp:56: error: no
 matching function for call to ‘add_edge(boost::detail::parallel::globa
l_descriptor<unsigned int>&, boost::detail::parallel::global_descriptor<u
nsigned int>&, const double&, boost::adjacency_list<boost::vecS, boost::d
istributedS<boost::graph::distributed::mpi_process_group, boost::vecS, bo
ost::defaultS>, boost::directedS, boost::no_property, boost::no_property,
 boost::no_property, boost::listS>&)’
/usr/include/boost/graph/distributed/adjacency_list.hpp:2973: note: candi
dates are: typename boost::adjacency_list<OutEdgeListS, boost::distribute
dS<ProcessGroup, InVertexListS, InDistribution>, DirectedS, VertexPropert
y, EdgeProperty, GraphProperty, EdgeListS>::lazy_add_edge_with_property b
oost::add_edge(typename boost::adjacency_list<OutEdgeListS, boost::distri
butedS<ProcessGroup, InVertexListS, InDistribution>, DirectedS, VertexPro
perty, EdgeProperty, GraphProperty, EdgeListS>::vertex_descriptor, typena
me boost::adjacency_list<OutEdgeListS, boost::distributedS<ProcessGroup,
InVertexListS, InDistribution>, DirectedS, VertexProperty, EdgeProperty,
GraphProperty, EdgeListS>::vertex_descriptor, const typename boost::adjac
ency_list<OutEdgeListS, boost::distributedS<ProcessGroup, InVertexListS,
InDistribution>, DirectedS, VertexProperty, EdgeProperty, GraphProperty,
EdgeListS>::edge_property_type&, boost::adjacency_list<OutEdgeListS, boos
t::distributedS<ProcessGroup, InVertexListS, InDistribution>, DirectedS,
VertexProperty, EdgeProperty, GraphProperty, EdgeListS>&) [with OutEdgeLi
stS = boost::vecS, ProcessGroup = boost::graph::distributed::mpi_process_
group, InVertexListS = boost::vecS, InDistribution = boost::defaultS, Dir
ectedS = boost::directedS, VertexProperty = boost::no_property, EdgePrope
rty = boost::no_property, GraphProperty = boost::no_property, EdgeListS =
 boost::listS]
make: *** [pagerank.o] Error 1


Well, obviously I called the wrong ctor, duh!

PageRank in Matlab

For a school project, I’ve been reading lots of papers on good old PageRank. We’re doing something similar to this venerable algorithm in a streaming context. For a basis of comparison, I needed a simple, reliable implementation of the classic algorithm, and Matlab (octave) happens to excel at this kind of thing. So here’s my version. Initialized appropriately, this gives the same results as the diagram on the appropriate Wikipedia page. You probably wouldn’t want to run this on the web graph though: it wants to be run in a cluster at that scale.


function[R] = pagerank(A, R, d, epsilon)
N=size(A)(1);
E=ones(N);
R_last = R;
delta=epsilon;
while (delta >= epsilon)
    R = (d * A + (1-d)/N * E) * R_last;
    dist=R-R_last;
    delta = sqrt(dot(dist,dist));
    R_last = R;
end

R is the initial ranking vector. Typically this is just 1/N for each entry, but it can be some other probability distribution to arrive at personalized rankings. A is the column-oriented adjacency matrix divided by out-degree (with a plain adjacency matrix T, K=diag(sum(T)); A=K^-1 * T). Nodes with out-degree zero should be marked adjacent to everyone. d is the dampening factor, and epsilon is the convergence threshold.

I’m pretty sure my Matlab coding skills are quite lackluster, but it works.

I’d like the shed in blue

In addition to the idea that the least interesting and most trivial patches get the most scrutiny, I think the bike-shedding effect is doubled if someone claims their implementation is extra clever. Then Linus unleashes his Thunderbolt of Clarity and sanity is restored.

To exploit this phenomenon, one seeking reviews should claim that one’s code is better than any Knuth himself could have dreamed up, uses the minimal number of clock cycles on any present or future architecture, and solves Clique in P. Then nerds will come at you like zombies.

Is there a patch for bureaucracy?

So I’ve now had experience fixing other peoples’ software in several different domains. Here’s how it plays out:

Open-source project:

Thanks! Applied.

Proprietary application:

Thank you for your submission. We have taken your request under consideration, and assigned it ticket #17231. In our review for the next release, scheduled to be released in Q4 next year, we will consider each resolved ticket for possible inclusion in our product.

Government project:
[paraphrasing an angry letter to division heads]

Please find attached a 9-page document describing why we cannot include any changes in our software, even 80-line changes that fix crashes such as the ones your agency has experienced. We suggest your agency maintain your own copy of the software with the change applied.

P.S. We doubt seriously it’s really our problem even though you gave us an explanation and a reasonable code fix. Please waste some more time trying to prove it to us.