It’s faster because it’s async!

Some Javascript developers, when presented with an API, think, “I know, I’ll make it async!” Now their users have a giant mess of a state machine implemented with callbacks and recursion, in a GC runtime that doesn’t eliminate tail calls, and two orders of magnitude worse performance. Normally, I wouldn’t care, because you get what you ask for when you write something in Javascript, but this time it happened to a project that I occasionally maintain for $work. And so I was sad.

So, far from expert in the ways of JS, I looked for a way out of this mire, and stumbled across task.js, which is what the kids are doing these days. It is a neat little hack, although it only works if your JS engine supports generators (most do not). Still, it seemed like a reasonable approach for my problem, so I decided to understand how task.js works by deriving my own.

I should note that if one’s JS engine doesn’t have generators, one can kind of emulate them, in the same sense that one can emulate goto with switch statements. Consider this simple example:


var square_gen = function() {
    var obj = {
        state: 0,

        next: function() {
            var ret = this.state * this.state;
            this.state++;
            return ret;
        },

        send: function(v) {
            this.state = v;
            return this.next();
        },
    };
    return obj;
};

var gen = square_gen();
for (var i=0; i < 4; i++) {
    console.log("The next square is: " + gen.next());
}
/*
Outputs:
The next square is: 0
The next square is: 1
The next square is: 4
The next square is: 9
*/

In other words, square_gen() returns an object that has a next() method which returns the next value, based on some internal state. The next() method could be a full-fledged state machine instead of a simple variable.

Certainly this is less tidy than the equivalent with language support:


var square_gen = function*() {
    var state = 0;
    while (1) {
        yield state * state;
        state++;
    }
};

(I'll assume the existence of generators in examples henceforth.)

The send() function is interesting -- it allows you to set the state from which the generator continues.


square_gen.send(16); /* => 256 */

The key idea is that generators will only do one iteration of a possibly large computation. When yield is reached, the function exits, and the computation is started up again after the yield statement when the caller performs another next() or a send().

Now, suppose you need to do something that takes a while, without waiting, like fetching a resource over the network. When the resource is available, you need to continue with that data. In the meantime you should try to do something else, like allow the UI to render itself.

The normal way one does this is by writing one's code in the (horrible) continuation-passing-style, which is fancy talk for passing callbacks to all functions. The task.js way is better: write functions that can block (generators) and switch to other tasks when they do block. No, I'm not making this up -- it is normal for a program in JS land to include a half-baked cooperative multitasker.

You can turn callbacks into promises, as in "I promise to give you value X when Y is done." Promise objects are getting baked into various runtimes, but rolling your own is also easy:


function MyPromise() {
    this.todo = function(){};
}

MyPromise.prototype.then = function(f) {
    this.todo = f;
};

MyPromise.prototype.resolve = function(value) {
    this.todo(value);
};

That is: register a callback with then(), to be called when a value is known. The value is set by calling resolve().

Now, suppose you have an async library function that takes a callback:


function example(call_when_done) {
    /* do something asynchronously */
    call_when_done(value);
}

Instead of calling it like this:


    example(function(value) { /* my continuation */ });

...you can give it a callback that resolves a promise, which, when resolved, calls your continuation:


    var p = new MyPromise();
    example(function(value) { p.resolve(value); });
    p.then(function(value) { /* my continuation */ });

This is just obfuscation so far. The usefulness comes when you rewrite your functions to return promises, so that:


    function my_fn(params, callback) { /* pyramid of doom */ }

becomes:


    function my_fn(params) { /* ... */ return promise; }

Now you can have generators yield promises, which will make them exit until another next() or send() call. And then you can arrange to call send() with the promise's value when it is resolved, which gives you blocking functionality.


    /* An async function that returns a promise */
    function async() {
        var p = new MyPromise();
        example(function(v) { p.resolve(v); });
        return p;
    }

    /* A generator which blocks */
    var gen = function*() {

        /* does some async call and receives a promise */
        var p = async();

        /* blocks until async call is done */
        var v = yield p;

        /* now free to do something with v... */
    }();

    /*
     * Run one iteration of the generator.  If the
     * generator yields a promise, set up another
     * iteration when that promise is resolved.
     */
    function iterate(gen, state) {
        var p = gen.send(state);
        p.then(function(v) {
            iterate(gen, v);
        });
    }

    /* Start a task */
    var p = gen.next();
    p.then(function(v) {
        iterate(gen, v);
    });

    /*
     * You can do something else here, like start another
     * generator.  However, control needs to eventually
     * return to runtime main loop so that async() can
     * make progress.
     */

A lot of details are omitted here like StopIteration and error handling, but that's the gist. Task.js generalizes a lot of this and has a decent interface, so, do that.

In summary, Javascript is terrible (only slightly less so without c-p-s), and I'm glad I only have to use it occasionally.