It’s faster because it’s async!

Some Javascript developers, when presented with an API, think, “I know, I’ll make it async!” Now their users have a giant mess of a state machine implemented with callbacks and recursion, in a GC runtime that doesn’t eliminate tail calls, and two orders of magnitude worse performance. Normally, I wouldn’t care, because you get what you ask for when you write something in Javascript, but this time it happened to a project that I occasionally maintain for $work. And so I was sad.

So, being far from an expert in the ways of JS, I looked for a way out of this mire and stumbled across task.js, which is what the kids are doing these days. It is a neat little hack, although it only works if your JS engine supports generators (most do not). Still, it seemed like a reasonable approach for my problem, so I decided to understand how task.js works by deriving my own.

I should note that if one’s JS engine doesn’t have generators, one can kind of emulate them, in the same sense that one can emulate goto with switch statements. Consider this simple example:


var square_gen = function() {
    var obj = {
        state: 0,

        next: function() {
            var ret = this.state * this.state;
            this.state++;
            return ret;
        },

        send: function(v) {
            this.state = v;
            return this.next();
        },
    };
    return obj;
};

var gen = square_gen();
for (var i=0; i < 4; i++) {
    console.log("The next square is: " + gen.next());
}
/*
Outputs:
The next square is: 0
The next square is: 1
The next square is: 4
The next square is: 9
*/

In other words, square_gen() returns an object that has a next() method which returns the next value, based on some internal state. The next() method could be a full-fledged state machine instead of a simple variable.
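
For instance, here is a sketch (my own invention, not from task.js) of a hand-rolled "generator" that yields 1, 2, and then 3 forever, with the state machine spelled out as a switch:


var count_gen = function() {
    return {
        pc: 0,

        /* switching on an explicit "program counter" is the same
         * trick as emulating goto with switch statements */
        next: function() {
            switch (this.pc) {
            case 0: this.pc = 1; return 1;
            case 1: this.pc = 2; return 2;
            default: return 3;
            }
        },
    };
};

var g = count_gen();
console.log(g.next(), g.next(), g.next(), g.next()); /* 1 2 3 3 */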

Certainly this is less tidy than the equivalent with language support:


var square_gen = function*() {
    var state = 0;
    while (1) {
        yield state * state;
        state++;
    }
};

(I'll assume the existence of generators in examples henceforth.)

The send() function is interesting -- it allows you to set the state from which the generator continues. (Standard ES6 generators spell this differently: the value passed to next(v) becomes the result of the pending yield.)


gen.send(16); /* => 256 */

The key idea is that generators will only do one iteration of a possibly large computation. When yield is reached, the function exits, and the computation is started up again after the yield statement when the caller performs another next() or a send().

Now, suppose you need to do something that takes a while, without waiting, like fetching a resource over the network. When the resource is available, you need to continue with that data. In the meantime you should try to do something else, like allow the UI to render itself.

The normal way one does this is by writing one's code in the (horrible) continuation-passing style, which is fancy talk for passing callbacks to all functions. The task.js way is better: write functions that can block (generators) and switch to other tasks when they do block. No, I'm not making this up -- it is normal for a program in JS land to include a half-baked cooperative multitasker.

You can turn callbacks into promises, as in "I promise to give you value X when Y is done." Promise objects are getting baked into various runtimes, but rolling your own is also easy:


function MyPromise() {
    this.todo = function(){};
}

MyPromise.prototype.then = function(f) {
    this.todo = f;
};

MyPromise.prototype.resolve = function(value) {
    this.todo(value);
};

That is: register a callback with then(), to be called when a value is known. The value is set by calling resolve(). (Note this toy version holds a single callback and assumes then() runs before resolve(); real promises queue callbacks and remember values resolved early.)

Now, suppose you have an async library function that takes a callback:


function example(call_when_done) {
    /* do something asynchronously, then call back with the
     * result (a made-up value here) */
    setTimeout(function() { call_when_done(42); }, 100);
}

Instead of calling it like this:


    example(function(value) { /* my continuation */ });

...you can give it a callback that resolves a promise, which, when resolved, calls your continuation:


    var p = new MyPromise();
    example(function(value) { p.resolve(value); });
    p.then(function(value) { /* my continuation */ });

This is just obfuscation so far. The usefulness comes when you rewrite your functions to return promises, so that:


    function my_fn(params, callback) { /* pyramid of doom */ }

becomes:


    function my_fn(params) { /* ... */ return promise; }

Now you can have generators yield promises, which suspends them until another next() or send() call. Then you can arrange to call send() with the promise's value when it is resolved, which gives you blocking functionality.


    /* An async function that returns a promise */
    function async() {
        var p = new MyPromise();
        example(function(v) { p.resolve(v); });
        return p;
    }

    /* A generator which blocks */
    var gen = function*() {

        /* does some async call and receives a promise */
        var p = async();

        /* blocks until async call is done */
        var v = yield p;

        /* now free to do something with v... */
    }();

    /*
     * Run one iteration of the generator.  If the
     * generator yields a promise, set up another
     * iteration when that promise is resolved.
     */
    function iterate(gen, state) {
        var p = gen.send(state);
        p.then(function(v) {
            iterate(gen, v);
        });
    }

    /* Start a task */
    var p = gen.next();
    p.then(function(v) {
        iterate(gen, v);
    });

    /*
     * You can do something else here, like start another
     * generator.  However, control needs to eventually
     * return to runtime main loop so that async() can
     * make progress.
     */

A lot of details are omitted here, like StopIteration and error handling, but that's the gist. Task.js generalizes a lot of this and has a decent interface, so, use that.
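
For the record, under standard ES6 generator semantics there is no send(); instead, next(v) passes a value in and returns a {value, done} pair. A driver for that world might look like the following -- a minimal sketch, not task.js's actual interface:


function run(gen_fn) {
    var gen = gen_fn();

    function step(result) {
        if (result.done)
            return;
        /* result.value is the yielded promise; resume the
         * generator with the promise's value once it resolves */
        result.value.then(function(v) {
            step(gen.next(v));
        });
    }

    step(gen.next());
}

/* e.g., spawn the blocking task from above */
run(function*() {
    var v = yield async();
    /* ... now free to do something with v ... */
});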

In summary, Javascript is terrible (only slightly less so without c-p-s), and I'm glad I only have to use it occasionally.

AWS precursor

Let’s say it’s 1975, and you have a mountain of data (2 megs) to process, and a heap of cash. Whom do you call to run your Cobol? Your local airline, of course!

Excerpt:

IBM 360/195 for only 50 cents a SECOND

Guaranteed Turnaround! 2meg; 2314’s – 3330’s – 3420’s
OS/MVT
HASP/RJE
MPSX-GPSS-PMS-SSP-CSMP
Ans Cobol, Fortran G, G1, H, Assembler F & H, PL/1 F and PL/1 Optimizing and Checkout Compilers.

Our typical customer is knowledgeable in OS; has good working knowledge of JCL, Utilities and the functions of the compilers/assemblers he uses.
[…]
Call or Write
UNITED AIRLINES

Courtesy of a yellowed copy of Computerworld my dad found among his things. I noticed Google has scanned a lot of old issues but I couldn’t find this one in their archive.

No, I don’t remember those days.

Grepping 300 gigs

It’s fun when a problem is simple, yet the program to solve it takes long enough to run that you can write a second, or third, much faster version while waiting on the first to complete. Today I had a 1.8 billion line, 300 GB sorted input file of the form “key<TAB>value”, and another file of 20k interesting keys. I wanted the lines matching those keys sent to a third file; the same key might appear multiple times in the input.

What doesn’t work: any version of grep -f, which I believe is something like O(N*M).

What could have worked: a one-off program to load the keys into a hash, and output the data in a single pass. Actually, I had written this one-off program before for a different task. In this case, the input file doesn’t need to be sorted, but the key set must be small and well-defined. It is O(N+M).
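
For what it's worth, that single-pass hash version can be as little as one line of awk (file names invented for illustration):


# load the interesting keys into a hash, then print input lines
# whose first tab-separated field is one of them -- O(N+M)
awk -F'\t' 'NR==FNR { keys[$1]; next } $1 in keys' \
    keys.txt input.txt > matches.txt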

What worked: a series of sorted grep invocations (sgrep, but not the one in ubuntu). That is O(M * lg N), which can be a win if M is small compared to N. See also comm(1) and join(1), but I don’t know off the top of my head whether they start with a binary search. I had fairly low selectivity in my input set, so the binary search helps immensely.
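
Lacking an sgrep, look(1) can pull the same binary-search trick on a sorted file; a rough sketch, assuming the input was sorted under LC_ALL=C so that the sort order matches look's byte-wise comparison:


# one binary search per key; the trailing tab keeps a key from
# matching longer keys that it happens to be a prefix of
while IFS= read -r key; do
    look "$key$(printf '\t')" input.txt
done < keys.txt > matches.txt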

What also could have worked: a streaming map-reduce or gnu parallel job (the input file was map-reduce output, but my cluster was busy at the time I was processing this, and I am lazy). That would still be O(N+M), but distributing across P machines in the cluster would cut it by a factor of around P. P would need to be large here to be competitive with the previous solution from a complexity standpoint.

Making up numbers, these take about 0, 5, 1, and 20 minutes to write, respectively.

Of course, algorithmic complexity isn’t the whole picture. The last option, or otherwise splitting the input/output across disks on a single machine, would have gone a long way to address this problem:

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda              10.50     0.00  361.00    9.75 83682.00  4992.00   478.35   139.42  174.20    6.43 6385.95   2.70 100.00
sdb               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdc               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sdd               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00

You can’t go any faster when blocked on I/O.

Note the similarities with join algorithms in databases — variations on nested loop, hash, and sort-merge joins are all present above.

Cooking with gnuplot

Over the winter holidays I was put in charge of cooking one (of several) of the family dinners. At my house, a Christmas dinner can mean only one thing: prime rib is on the menu. The local grocery store had a great deal on rib roasts, so I bought a whole one. All seven ribs, 25 pounds of it. When it came time to cook this beast, I did plenty of reading, and settled on this seriouseats recipe. I guessed at about six hours to slow-roast the behemoth. But after a few hours of roasting, I decided it would be nice to know whether it would finish in time for the guests, or whether we would have to invent some pre-meal activities to stall.

Linear regression to the rescue! I had a leave-in meat thermometer plugged into the slab of cow, the type with a cable that runs outside the oven so that you can read the temperature without opening the oven door. It was then a simple matter to record the temperature every fifteen minutes and plot it to see how it was going. My uninformed guess was that the temperature curve is really sigmoid-shaped, but linearity is probably close enough around the target range.

Gnuplot can do linear regression for you:

# model temperature as a linear function of time
f(x) = a*x + b
# fit a and b to the recorded data (column 1: time, column 2: temperature)
fit f(x) 'temp.dat' u 1:2 via a, b
set xrange [-5:160]
plot 'temp.dat' u 1:2 w linespoints, f(x)

This produces a graph like the image below, which shows that after 3 hours of cooking, the meat would be around 128 degrees (I started keeping track about three hours in).
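
Gnuplot can also print the prediction directly once fit has filled in a and b -- for instance, the minute mark at which the roast should hit a 128-degree target:


# invert f(x) = 128 to get the predicted time
print (128.0 - b) / a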

In the end, I turned up the oven a bit in the last hour to speed things along. The meat turned out great, although I didn’t have too much luck with the in-between rest that the recipe promotes: there were still plenty of juices all over the place at carving time. Next time, I believe I’ll just turn the oven up to 500 deg. F when the interior target temperature is reached, and then do a normal rest afterwards. Another lesson learned: a full rib roast is perhaps twice as much as needed for eight people, but I am not one to complain about having prime rib leftovers for a week.

Mavened.

So, I’m applying a one line patch to a Java package, and oh yes, it needs maven. Hooray!

$ apt-cache depends --recurse maven | grep Depends | 
     grep -v '<' | egrep 'lib.*java' | sort | uniq | wc -l
203
[... start build and watch maven download another pile ...]
$ find ~/.m2 -name '*pom' | wc -l
851

I find it hard to believe that there are that many useful libraries in the world.

Olio

I neglected to write up anything on the blog in November despite it being the penultimate of months, so here’s a meandering catch-up post to atone. My apologies for the gratuitous self-linking that is about to ensue.

On Halloween: our 2-year-old went as Kung Fu Panda (his favorite movie, and yes, shame on his parents for letting him watch movies) for Halloween this year. He was quite excited to learn that you can just go ask strangers for candy and they will give it to you. He has mastered enough language to say “Pumpkin Scary,” which he did at every opportunity upon seeing the skull I had carved on the pumpkin on the right. The other pumpkin is supposed to be McQueen from the Disney Cars property; Alex called it “Pumpkin Car.” They are now composting. So it goes.

On Thanksgiving: being a half-US, half-Canadian family, we get to celebrate both Thanksgivings: the fake one and the real one. And so we did. It was great spending time with the family and meeting up with some old friends in Atlanta, though we learned painful lessons about air travel with young children.

On Canada: I’ve now been in Canada for a little over a year. Among the observations I had detailed previously, I can now add these:

  • We do have a few days here in the summer that qualify as hot, but no, it does not get Georgia-in-August-hot.
  • Canadian Coca-cola is superior to non-KFP (kosher-for-Passover) USA Coke, and despite what the PR machine in Atlanta will tell you, you can easily tell the difference between sugar and HFCS when you’re used to one or the other.
  • Canadian TV is even more of a wasteland when there is no hockey.

On Bash Goto: per my last post, I considered how hard it would be to write an x86 emulator in bash. Conclusion: despite the potential good fun in simulating %eip using ‘nl’ and sed, I’ll leave this task to someone else. However, I did improve my actual implementation of this somewhat. One easy win is to put the jump labels inside comments so that an ordinary run of the script won’t barf. And so that is what I did.

On Work: while I’ve been a contractor in name for the last year, I have now taken on some other contracts and thereby made this status more official. It was a tough decision to go this route versus, say, working on a salaried basis with some large, hypothetical mobile chipmaker with a Canadian presence, but so far I am happy with the choice. Most recently, I have been doing some Linux mesh networking stuff with Cozybit. It may be a while before any of it finds its way upstream (there are NDAs involved) but I claim that it is cool stuff. In the meantime, I get to continue slaying big data dragons at LP/Xmarks.

On HBase: speaking of HBase, two things have recently come to my attention. First, there is Hannibal, a cluster monitoring tool which was inspired by my post about beating gnuplot over the head with perl. I had nothing to do with its implementation otherwise, but it looks pretty cool. Secondly, I recently had an enquiry about my cache-oblivious code from some HBase folks. I’m not working on that either, but I am hopeful that something comes of it since it would be great if these ideas (not my own) percolate out into mainstream practice.

goto in bash

If you are as old and nerdy as I, you may have spent your grade school days hacking in the BASIC computer language. One of the (mostly hated) features of the (mostly hated) language was that any statement required a line number; this provided both the ability to edit individual lines of the program without a screen editor, as well as de facto labels for the (mostly hated) GOTO and GOSUB commands. But you could also use line numbers to run your program starting from any random point: “RUN 250” might start in the middle of a program, typically after line 250 exited with some syntax error and was subsequently fixed.

Today, in bash, we have no such facility. Why on earth would anyone want it, with the presence of actual flow control constructs? Who knows, but asking Google about “bash goto” shows that I am not the first.

For my part, at $work, I have a particular script which takes several days to run, each part of which may take many hours, and, due to moon phases, may fail haphazardly. If a command fails, the state up to that point is preserved, so I just need to continue where that left off. Each major part of the job is already factored into individual scripts, so I could cut-and-paste commands from the failure point onward, but I’m lazy.

Thus, I present bash goto. It runs sed on itself to strip out any parts of the script that shouldn’t run, and then evals it all. Prepare to cringe.


#!/bin/bash
# include this boilerplate
function jumpto
{
    label=$1
    cmd=$(sed -n "/$label:/{:a;n;p;ba};" "$0" | grep -v ':$')
    eval "$cmd"
    exit
}

start=${1:-"start"}

jumpto "$start"

start:
# your script goes here...
x=100
jumpto foo

mid:
x=101
echo "This is not printed!"

foo:
x=${x:-10}
echo x is $x

results in:


$ ./test.sh
x is 100
$ ./test.sh foo
x is 10
$ ./test.sh mid
This is not printed!
x is 101

My quest to make bash look like assembly language draws ever nearer to completion.

Update 2019/05/21: A reader pointed out that executing one of the labels results in bash complaining “command not found” and suggested putting the labels in a comment, which works just fine without any other changes (but if you like, you can drop the “grep -v” in that case). You might also be interested in the takes of folks on reddit, or my own take of being discussed on reddit.
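
For illustration, the comment-label form looks like this (with the same jumpto boilerplate as above; the sed pattern matches “foo:” anywhere in a line, so the labels still work):


# start:
x=100
jumpto foo

# foo:
echo x is $x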

Update 2023/08/04: Greetings, and my apologies, to Hacker News readers all these years later. Hi!

Amped

Some progress has been made on my guitar chip amp. Readers may remember that at the time of my last post, I had tinkered with KiCad and created the basic layout for my PCB. Since then, I finalized the design and had it fabbed through the great OSH Park batch prototyping service. This is a great deal: it wound up being about $10 a board (you get three at $5/sq. in) and the boards are of excellent quality. It took about 3 weeks from the time of design submission for the finished boards to arrive in Canada from the States.


[Images: Practice amp PCB - top; Practice amp PCB - bottom]

Above are the top and bottom of one of the unpopulated boards. The solder mask is the OSH Park signature purple, and the through-holes are gold-plated. The bottom side silkscreen says “This silkscreen intentionally left blank” — a last-minute addition because the service’s design checks fail if any layers are missing. I should get more creative with that next time.


[Image: Speaker box practice amp]

Although I do plan to someday build a wooden speaker cabinet to house everything, for now I just shoved all the electronics in the box that the speaker came in, as it happens to be the right size. Plug the instrument cable in the front, flip on the illuminated switch, and one is ready to rock.

I had all the parts waiting for the boards, so when they arrived I quickly populated one and hooked it up, plugged in my Les Paul and strummed an E chord. The resulting sound was something short of musical: loud, ringing, distorted noises were heard instead of the clear tones of my prototype. I feared that my board design was flawed, and without a scope handy, it would be tough to track down. But on the plus side, I knew that I was about to learn something.

I spent an hour or so following all of the traces on the board and comparing them to the schematic, making sure the caps were all connected the right way around and so forth. I also compared the finished board to my protoboard, which happened to work fine. Everything was the same, except where I had used two 1 ohm resistors in series to invent a 2 ohm resistor (the datasheet called for a 2.2 which I didn’t have at prototyping time).

Then I looked at said resistors a little more closely. On the outside edge, I could just make out a faint yellow band where I previously thought there was none. And that brown band was just the tiniest bit in the purple spectrum. Oops! My 1 ohm resistors were actually 47 ohms, making the gain of my prototype a measly three instead of the 101 I thought it was. It turns out that a gain of 101 is way too high without a volume knob. Also, one of the 47-ohm resistors was in a stabilizing RC circuit, likely causing the ringing oscillations I had heard in my PCB version.


[Image: Practice amp, populated board]

I fixed the RC circuit when building board number two, but stuck with the 2.2 ohm resistor and accompanying large gain. I might use this version if I decide to put a volume knob in front of the amplifier, but I do find it’s a little too noisy overall for my tastes. I thus went back to board number one, desoldered the 47 ohm and 2.2 ohm resistors and replaced them with 1 ohm and 22 ohm resistors respectively, lowering the gain of that board to a modest eleven.
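
As a sanity check on those gain figures: they are consistent with the usual non-inverting gain of 1 + Rf/Rg, assuming a 220 ohm feedback resistor (an inference from the numbers above, not a value stated anywhere here):

gain = 1 + Rf/Rg
1 + 220/2.2 = 101   (the design)
1 + 220/94  ≈ 3.3   (the "1 ohm" prototype resistors, actually 2 x 47 ohm)
1 + 220/22  = 11    (the reworked board)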


[Images: Practice amp prototype; practice amp PCB guts]

Pictured above are the before and after of prototype and PCB versions. There are certainly a few things I would change if I ever make another revision of the PCB. For one, the annular rings on the heat sink and connector footprints were too small, so soldering them was difficult with so little of the pad exposed. I might also do away with or reduce the size of the ground plane; even with the thermal reliefs, soldering and desoldering the ground connections was no easy task. But on the whole, making the PCB was an interesting experience and I look forward to having another excuse to do so again.

I’ll put the design files up on github one of these days.

KiCad Level 0 Achieved

When we moved to Canada, I sold any guitar gear that wouldn’t fit in a backpack. This has left me without the ability to plug in my electric guitar for the last ten months, a situation that must be rectified (ha, ha). I just need a small practice amp, and those can be found on an IC these days. One need only do a little soldering, attach a speaker, repeat as necessary for whatever mistakes one makes, and ta-da: a small custom amp that one could have bought mass-produced from China for 1/50th of the price of building one’s own. But of course there is value in the making, or so we tell ourselves.

Anyway, I’ve built my circuit (more or less the same one as on the op-amp’s datasheet) on a breadboard and it sounds fine except for the complete lack of shielding. Since it’s now possible to get one-off PCBs without spending a fortune, and using a real board is more fun than protoboards, let’s do that!

Back in the dark ages, I used to work on PCB software that sold for $30k per license (I thought that was ridiculous then, too). While I seem to have stuffed some arcane knowledge about keepouts and autorouters somewhere in the back of my brain, most of the PCB black magic did not rub off on me. After a 15-year hiatus from that domain, I can firmly label myself a “PCB novice.”

Eagle Light (free as in beer) is the choice of many a hobbyist. I have poked at Eagle before, but never liked it enough to do anything substantive. These days, getting it to run at all means finding library versions that aren’t used in any current Linux distribution, building them, doing some LD_LIBRARY_PATH hacks, setting up a 32-bit runtime, and so on. Yes, I also did that this weekend. Eagle has a nice big standard library, but I feel learning it at all is a bit of a dead end when there are usable, truly free alternatives. And KiCad looks like it just may fit the bill. I apt-get installed it and was off to the races.

Overall, KiCad is a nice piece of FOSS software that can easily do whatever one would do in Eagle. Unfortunately, it still reminds me of what I hate about PCB software in general. I find designing a PCB to already be extremely fiddly and unfun; as a non-expert, I just want to build a schematic and then place and route things with sensible defaults. Instead, what I wound up doing was learning to use the symbol editor, and the module editor, gathering various facts about drill and pad sizes, and picking up KiCad’s various quirks such as how you have to constantly reload newly created libraries, and how the UI frequently makes no sense. Of course, UIs never make any sense in EDA tools, so you sometimes just have to go with it. Doing librarian work is ultra-boring, especially for tiny boards like this.

Still, in the course of a few hours, I built the basic schematic (including adding three new symbols), passed ERCs, created three missing footprints, and did an initial routing of the board. Not too bad.

In the last decade component libraries have grown 3-D models, which is kind of neat. On the one hand, it’s not terribly useful since the canned components are unlikely to closely match reality, and editing 3-D models of components is even less interesting to me than editing their footprint. But, it does make for decent screenshots like the one above.