Wrapping up

I just submitted my application for graduation here at Hopkins, so as my grad_school bit settles over the next couple of months, I must now contemplate what to do next. I haven’t given it a whole lot of thought, except that I will be glad to be out of academics for the rest of eternity. I’ll save an account of my experience here for another post, but it has been generally positive. Anyway, my dance card is far from empty, but while I’m in limbo between real work and finishing school, and deciding where our family will live, I might as well be promiscuous and see what’s out there.

I guess if I had to enumerate my occupational preferences, they would go something like this:

  • Unix kernel hacker or driver developer. Between school and the new parental status, time for my ath5k / karma / android hobby has dwindled to a trickle. I hate to see my x86 assembly skills go to waste. Looking at you, Ginormous Linux Company.
  • Systems work on problems of huge scale. This has been a recent theme of my studies, and there are some fascinating problems to tackle.
  • Systems work on small scales. Embedded work is just a lot more fun than desktops.
  • Any cool project with smart coworkers
  • Open source
  • Usual niceties like close to home, close to child care, health plan, big bag of cash…

Minus points for: UI work, remote object technologies, anything with angle brackets, function names with too many capital letters, centralized version control, marginally reinvented functional languages

PageRank in Matlab

For a school project, I’ve been reading lots of papers on good old PageRank. We’re doing something similar to this venerable algorithm in a streaming context. For a basis of comparison, I needed a simple, reliable implementation of the classic algorithm, and Matlab (octave) happens to excel at this kind of thing. So here’s my version. Initialized appropriately, this gives the same results as the diagram on the appropriate Wikipedia page. You probably wouldn’t want to run this on the web graph though: it wants to be run in a cluster at that scale.


function[R] = pagerank(A, R, d, epsilon)
N=size(A)(1);
E=ones(N);
R_last = R;
delta=epsilon;
while (delta >= epsilon)
    R = (d * A + (1-d)/N * E) * R_last;
    dist=R-R_last;
    delta = sqrt(dot(dist,dist));
    R_last = R;
end

R is the initial ranking vector. Typically this is just 1/N for each entry, but it can be some other probability distribution to arrive at personalized rankings. A is the column-oriented adjacency matrix divided by out-degree (with a plain adjacency matrix T, K=diag(sum(T)); A=K^-1 * T). Nodes with out-degree zero should be marked adjacent to everyone. d is the dampening factor, and epsilon is the convergence threshold.

I’m pretty sure my Matlab coding skills are quite lackluster, but it works.

I’d like the shed in blue

In addition to the idea that the least interesting and most trivial patches get the most scrutiny, I think the bike-shedding effect is doubled if someone claims their implementation is extra clever. Then Linus unleashes his Thunderbolt of Clarity and sanity is restored.

To exploit this phenomenon, one seeking reviews should claim that one’s code is better than any Knuth himself could have dreamed up, uses the minimal number of clock cycles on any present or future architecture, and solves Clique in P. Then nerds will come at you like zombies.

log2 in bash

Turns out bash supports enough arithmetic operators to implement lg() just as you might in some real language:


function log2 {
    local x=0
    for (( y=$1-1 ; $y > 0; y >>= 1 )) ; do
        let x=$x+1
    done
    echo $x
}

z=$(log2 64)


I had a somewhat valid reason to want to do this. The one thing that continually gets me when I try to do dumb things like this is when to use the raw variable name and when to use the dollar sign. Yes, the semantics are well-defined, but my perl habits run deep. Similarly, in ruby, all my variables are global.

Is there a patch for bureaucracy?

So I’ve now had experience fixing other peoples’ software in several different domains. Here’s how it plays out:

Open-source project:

Thanks! Applied.

Proprietary application:

Thank you for your submission. We have taken your request under consideration, and assigned it ticket #17231. In our review for the next release, scheduled to be released in Q4 next year, we will consider each resolved ticket for possible inclusion in our product.

Government project:
[paraphrasing an angry letter to division heads]

Please find attached a 9-page document describing why we cannot include any changes in our software, even 80-line changes that fix crashes such as the ones your agency has experienced. We suggest your agency maintain your own copy of the software with the change applied.

P.S. We doubt seriously it’s really our problem even though you gave us an explanation and a reasonable code fix. Please waste some more time trying to prove it to us.

JNIs and ABIs

For a recent task at work, I had a compelling reason to break out of the Java jail and do some “real programming” in C. Unfortunately, said C-language bits were to be run on the Windows platform. I had a stripped DLL (no source) for manipulating USB cameras and an API document to work with, and a plan to duct tape this all together with JNI.

I’m not about to touch MSVC if I can help it, so I grabbed mingw packages for my platform and wrote up a small wrapper DLL to do the necessary JNI bridge. Here’s my attempt to archive the working recipe for the Google bots.

Compiling this wrapper library was quite easy with mingw:

$ cat Makefile
# This makefile builds usbcamjni.dll to wrap usbcam.dll
# Use mingw32-make to build
CPPFLAGS=-I$(JAVA_HOME)/include -I$(JAVA_HOME)/include/linux
LIBS=-L . -l usbcam

usbcamjni.dll: usbcamjni.o
        $(CC) -shared -o $@ $< $(LIBS)

clean: .FORCE
        $(RM) usbcamjni.dll usbcamjni.o

.FORCE:

$ mingw32-make

It turns out on Windows there are a few different calling conventions that you can choose from at compile time. Garden-variety code uses __cdecl, which is how the world should operate. The symbols are named just as they are named in the C code with a leading underscore, and the caller is responsible for cleaning up his own stack mess.

In the Windows world, there is __stdcall, where the symbol names (by default) have trailing decorations for the size of parameters on the stack (e.g. _write@4) and the callee cleans up the stack. The benefit is somewhat smaller code, but you can't have variadic functions.

So I was testing my JNI wrapper and getting JVM crashes. Looking at the register dump showed that the program counter was not pointing to a normal text virtual address, which is symptomatic of stack confusion. Once I re-acquainted myself with the above arcana, I guessed correctly that I was using the wrong calling convention for the DLL: together the function epilogue and caller were popping off too many levels of stack. However, in this case the DLL was using undecorated symbol names (e.g. _write) but still expecting __stdcall semantics. As it happens, this is how Win32 API works.

The problem is that annotating the prototypes in header files with "__stdcall" to get correct calling conventions also makes the dependent object file expect the decorated function name. Linking against the DLL results in lots of unsatisfied symbol errors.

The solution took some digging and doesn't make a whole lot of sense to me, but the way out of this using mingw tools is to build an interface library. First, you create a def file like so:

LIBRARY MYLIB.DLL
EXPORTS
Symbol1@0
Symbol2@4
Symbol3@8

Note that symbols in this file include the decorations. Then you run dlltool from mingw with -k to remove the decorations from the generated library. As I understand it, this option actually creates aliases from the decorated to undecorated name so that subsequent linking can work. The manpage says "-A" is actually for this, but before I read it, I found a mailing list post advocating "-k", and, well, it worked, so I'm joining that particular cargo cult.

$ dlltool -k -d mylib.def -l mylib.a

Then when building the JNI library wrapper, link against mylib.a instead of the original DLL. Easyish!

Kernel updates

During baby naps I actually did some Rio Karma work this week. First, I found and fixed a recent bug in usb storage that makes the Karma not work at all in 2.6.35. Then I finally set up my omfs git tree on kernel.org, with a few patches I wrote almost two years ago, plus a recently found memory leak. I’ll try to get Linus to pull it in 2.6.36. Then, I made a new release of omfsprogs rewritten to use libomfs from the FUSE version of omfs (I also did this work two years ago). Now all is right in the Karma world again; time to ignore it for another 6 months.

Then, on to ath5k. One nagging issue that the driver has always had is that there’s pretty much a total lack of synchronization among the myriad threads. There’s an ad-hoc spin lock here and there, but for example the reset procedure, which reloads hundreds of registers, can be called concurrently with itself and other driver ops, and can be invoked while trying to transmit and receive packets. So one idea we’ve been hashing out on the ML is to serialize reset with a mutex by putting it into process context via a workqueue, and disabling tasklets while it runs. I hacked that together and it’s now available in wireless-testing. Hopefully this will fix some of those weird DMA-into-freed-skb errors that can potentially be caused by untimely modification of descriptor pointers.

Finally: tracing. For the last year or so, Linux has grown its own NIH answer to DTrace in the form of ftrace and tracepoints. The wireless subsystem already has a few tracepoints here and there, and I really like it better than debug printks because they have smaller overhead in the compiled-but-not-enabled case (a branch compared to an out-of-line function call). You can pick and choose individual tracepoints to sample, or combine them with a general tracing facility such as ftrace. I have some patches baking to add various tracepoints to ath5k [1, 2, 3].

One really neat idea that I appropriated from Johannes Berg’s work on iwlwifi is to store entire packets in the tracing ring buffer. Then you can have a trace-cmd plugin to write these out as tcpdump pcap files. This is really nice: you get a packet dump that works with your existing debug infrastructure, plus you have the tracing information so you know what the code is trying to do, at whatever granularity you chose when setting up the trace. Hopefully, this will be a good one-stop debug gathering tool, not just for developers, but for users, as well.

New syscall proposal

CLONE3(2)                   System Programmer's Manual                 CLONE3(2)



NAME
       clone3 - create a child process

SYNOPSIS
       #include <clone.h>

       int clone3(pid_t parent, pid_t co_parent,
                 int (*fn)(void *), void *child_stack,
                 int flags, void *arg, ... );

DESCRIPTION
       clone3()  creates  a new process, in a manner similar to fork(2).

       Unlike fork(2), these calls allow the child process to share  parts  of
       its  execution  context  with  the  parent and co-parent processes,
       such as their home, food, and approximately half of their genetic
       material.

       The main use of clone3() is to implement offspring: sentient beings
       that run concurrently with the parents in a shared environment.

       When  the  child  process  is  created  with  clone3(),  it executes the
       function  application  fn(arg).   (This  differs  from  fork(2),  where
       execution  continues  in the child from the point of the fork(2) call.)
       The fn argument is a pointer to a function that is called by the  child
       process  at the beginning of its execution.  The arg argument is passed
       to the fn function.  Although the parent processes may supply a suitable
       fn function, the child process has a mind of its own and may deviate
       from its execution at will.

       Executing clone3 will randomly select half of the genes from each
       parent, resulting in a selection of phenotypes according to
       Mendelian inheritance.  By design this process increases the
       number of bits in the global entropy pool, making recovery of the
       random state impractical.

       The  child_stack  argument  specifies the location of the pile of
       diapers used by the child process.  After the child process leaves
       the EMBRYO state and until it acquires the CAP_POTTY capability,
       it may frequently dump core.  The calling process must therefore
       set up a large store of waste disposal units and pass a pointer to
       this store to clone3().  After a successful core dump it is necessary
       for a parent to handle the condition as soon as possible.

       The flags parameter may  be  bitwise-or'ed  with zero or more of the
       following constants, in order to specify  the behavior of the system
       call:

        CLONE_XY
              Setting CLONE_XY indicates to the system that a child of the
              male gender is requested.  This is only a hint to the system
              and may not be honored.

        CLONE_XX
              Setting CLONE_XX indicates to the system that a child of the
              female gender is requested.  This is only a hint to the system
              and may not be honored.

              If both CLONE_XY and CLONE_XX are set, results are undefined.

RETURN VALUE
       On  completion,  the  thread  ID  of  the child process is returned in
       both parents' thread of execution.  During execution, the child process
       may signal an error condition by raising any of the following signals.

SIGNALS
       HUNGRY  The stomach monitor process has detected an empty condition.
               The parent may correct the condition by supplying food.

       TIRED   The process has executed too many cycles and must enter
               sleep(3).

       GRUMPY  The process is raising signals at full volume for unknown
               reasons.  The parent process should attach pacifier(2) to
               the child process and attempt to cause it to enter sleep(3).

       HAPPY   The process is in a suitable state for taking pictures for
               grandparents.

CONFORMING TO
       Biology 101.

BUGS
       Child process may spit-up on your new shirt; a burping cloth is
       recommended.

EXAMPLE
       The following psuedo-code illustrates the intended usage:

       #include <clone.h>

       int main(int argc, char *argv[])
       {
            pid_t ange = /* ... */
            pid_t bob =  /* ... */
            pid_t alex;

            alex = clone3(ange, bob, fn, stack, CLONE_XY, NULL);
            sleep(86400 * 30 * 9);

            take_snapshot(alex);
       }

       This yields the following:

       
       Alexander Yit-Keung Copeland
       Born 26 June, 2010
       Weight 7 lbs 12 oz

SEE ALSO
       fork(2),    futex(2),    getpid(2),    gettid(2),   set_thread_area(2),
       set_tid_address(2),  tkill(2),  unshare(2),  wait(2),  capabilities(7),
       pthreads(7)


CLONE3(2)                         2010-06-26                          CLONE3(2)

Blowtorch Cuisine

Ever mindful of fire safety, I’ve had a full propane canister (the small blowtorch size) rattling around in the back of my trunk for two and a half years. Luckily our car never decided to explode, but just in case, I’ve decided to finally use up the fuel.

Since I plan to never do any sort of plumbing again, the natural application for my blowtorch flame is on food, specifically crème brûlée. So I baked up some custard in my soup bowls, tossed some sugar on top, and burnt that bad boy.

burn

The problem is these torches really only work well upright, so you wind up having to bring the food to the flame rather than the other way around. I did at least have my fire extinguisher at the ready in case a wall decided to ignite. The smoke detector only went off once.

burnt

Our tasters were split on the outcome. Ange really liked it. I thought the custard was a bit too eggy, but that could be the recipe or personal taste. As I’m not a big CB eater, I don’t really have much to compare it to. It looked good, though.

Next up, fire grilled cheese?

Moved

Apologies, dear reader, for not having updates lately: I have been quite busy.

Angeline and I just “completed” moving from one side of the street to the other side (the lie-quotes indicating our plan to live out of boxes for the near future). If you have our old mailing address or home phone number, it’s probably a good idea to drop me an email to get the new data.

The new place seems nice so far except for the water being shut off the day we moved in, and the towing of my car from the space they told us to park in (evidently they were just kidding when they said that).

Here’s the view from our window:


Moving on up

Proof (as if you needed any) that Ange and I are inveterate nerds — 8 of the dozens of boxes crammed full of books:


Books