functional bitrate sim

My wmediumd rewrite is a bit further along thanks to getting a few hours to hack on it this weekend. It can now accurately simulate throughput between a pair of radios using legacy rates. For example, if we set the SNR between two devices to 20 dB, then they can communicate at a nominal 54 Mbps rate, yielding about 26 Mbps achieved in iperf:

[  3]  0.0-10.0 sec  31.2 MBytes  26.1 Mbits/sec

At 15 dB, we can send at nominal rates between 24 and 36 Mbps, which yields:

[  3]  0.0-10.1 sec  21.0 MBytes  17.5 Mbits/sec

Note that the achieved throughput is quite a bit lower than the nominal rate, as in real life; if aggregation were implemented, the two would be closer.

The basic architecture is pretty simple: frames are queued on a per-sender management or data queue depending on type, and delivery time is computed based on whether or not there is loss and the contention window parameters of the queues. A timerfd is used to schedule reporting of frame delivery back to the kernel at appropriate times. The delivery time does not take into account actual contention, although this could be done in principle by looking at all the queued frames for all stations.
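
To make that concrete, here is a rough sketch, in C, of how the pieces could fit together. It is illustrative only and not the actual wmediumd code; the names, contention parameters, and timings are all made up:

/* Illustrative sketch, not the actual wmediumd code: per-station queues,
 * a contention-based delivery time, and a timerfd delivery deadline. */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/timerfd.h>
#include <time.h>

struct queue {
    int cw_min, cw_max;     /* contention window parameters */
};

struct station {
    struct queue mgmt;      /* management frames */
    struct queue data;      /* data frames */
};

/* estimate when a frame needing airtime_us on the air, queued at 'now',
 * is delivered: DIFS plus a random backoff plus the transmission time */
uint64_t delivery_time(struct queue *q, uint64_t now, int airtime_us)
{
    int slot_us = 9, difs_us = 34;          /* OFDM slot and DIFS */
    int backoff = rand() % (q->cw_min + 1);

    return now + difs_us + backoff * slot_us + airtime_us;
}

/* arm a timerfd for the (absolute) delivery time; the main loop then
 * reports the frame back to the kernel when it fires */
void arm_timer(int tfd, uint64_t when_us)
{
    struct itimerspec its = {
        .it_value = {
            .tv_sec = when_us / 1000000,
            .tv_nsec = (when_us % 1000000) * 1000,
        },
    };
    timerfd_settime(tfd, TFD_TIMER_ABSTIME, &its, NULL);
}

int main()
{
    struct station sta = {
        .mgmt = { .cw_min = 3,  .cw_max = 7 },
        .data = { .cw_min = 15, .cw_max = 1023 },
    };
    struct timespec ts;
    uint64_t now, when;
    int tfd = timerfd_create(CLOCK_MONOTONIC, 0);

    clock_gettime(CLOCK_MONOTONIC, &ts);
    now = ts.tv_sec * 1000000ull + ts.tv_nsec / 1000;
    /* roughly a 1500-byte frame at 54 Mbps, preamble included */
    when = delivery_time(&sta.data, now, 250);
    arm_timer(tfd, when);
    printf("frame scheduled %llu us from now\n",
           (unsigned long long)(when - now));
    return 0;
}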

I haven’t really decided what to do about configuration. I stripped out the jamming and probability matrix configurations, as I feel like doing things on a signal-level basis is simpler. But at this point there’s no real way to specify signal levels either (other than hardcoding), and some scenarios probably want something dynamic (e.g. mobile stations).

Changes are in my wmediumd master branch. Unfortunately, I won’t have much time to work on this for the next two months, but patches for the many TODOs are welcome.

wmediumd speed test

Thanks to some inquiries on linux-wireless, I took a look at wmediumd recently. The code could use a bit of work, and there are some features I’ve been meaning to add since forever, so I started gutting it with an eye towards sprucing up the architecture and feature set (changes can be found here).

One of the questions from the mailing list was whether wmediumd adds a lot of overhead compared to mac80211_hwsim. It is of course doing more work, with additional memory copies, context switches, etc — but is it enough to make wmediumd unworkable?

So I did a quick TCP iperf test on my laptop with an open mesh, and got the following numbers.

hwsim without wmediumd:

    [  3]  0.0-10.0 sec  1.36 GBytes  1.16 Gbits/sec

hwsim with wmediumd:

    [  3]  0.0-10.0 sec  1.27 GBytes  1.09 Gbits/sec

It looks like wmediumd is doing fine. This is with monitors running; the non-monitor case does about twice that. Actually, I think this is a bit lower than it should be, but considering that the two cases are close to each other, and both a good deal faster than your typical wifi connection, it’s probably good enough for some level of bandwidth simulation.

wpas mesh

Continuing where I left off with my OpenWRT mesh nodes, after installing the OS, the next step is to get a mesh-enabled userspace on them.

One can use iw to create an open mesh and the authsae daemon for a secure mesh; OpenWRT already ships both, so installing those packages is really all that is required.
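
For reference, the open mesh case boils down to a couple of iw/ip commands (the interface name and mesh ID below are just placeholders):

ip link set wlan0 down
iw dev wlan0 set type mp
ip link set wlan0 up
iw dev wlan0 mesh join my-mesh-id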

However, I’m currently working on a patchset to add mesh support to wpa_supplicant, which could be useful for platforms where wpa_s is already present, and where running yet another daemon just for secure mesh is unpalatable. Here’s the recipe I’m using to keep the latest version on the device and use it for day-to-day activities.

Since OpenWRT can use git as a package source and already does so for hostapd, building a custom wpa_supplicant is mainly a matter of changing the git repository URL and build config. I made the following changes in the package/network/services/hostapd directory:

diff --git a/package/network/services/hostapd/Makefile b/package/network/services/host
index 6872742..5985339 100644
--- a/package/network/services/hostapd/Makefile
+++ b/package/network/services/hostapd/Makefile
@@ -10,10 +10,10 @@ include $(TOPDIR)/rules.mk
 PKG_NAME:=hostapd
 PKG_VERSION:=2014-06-03
 PKG_RELEASE:=1
-PKG_REV:=84df167554569af8c87f0a8ac1fb508192417d8e
+PKG_REV:=prepare-submit-v12
 
 PKG_SOURCE:=$(PKG_NAME)-$(PKG_VERSION).tar.bz2
-PKG_SOURCE_URL:=git://w1.fi/srv/git/hostap.git
+PKG_SOURCE_URL:=https://github.com/cozybit/wpa_supplicant.git
 PKG_SOURCE_SUBDIR:=$(PKG_NAME)-$(PKG_VERSION)
 PKG_SOURCE_VERSION:=$(PKG_REV)
 PKG_SOURCE_PROTO:=git
diff --git a/package/network/services/hostapd/files/wpa_supplicant-full.config b/packa
index bbfaa73..4d9e00e 100644
--- a/package/network/services/hostapd/files/wpa_supplicant-full.config
+++ b/package/network/services/hostapd/files/wpa_supplicant-full.config
@@ -407,3 +407,9 @@ CONFIG_NO_RANDOM_POOL=y
 NEED_80211_COMMON=y
 
 CONFIG_IBSS_RSN=y
+
+CONFIG_AP=y
+CONFIG_P2P=y
+CONFIG_TDLS=y
+CONFIG_SAE=y
+CONFIG_MESH=y

(Offhand, I don’t know if P2P and TDLS are really required, but as it matches my existing config, we’ll go with that.)

You’ll also need to enable CONFIG_WPA_SUPPLICANT_OPENSSL=y in the OpenWRT menuconfig in order for SAE to link properly.

Rebuilding from scratch looks like this:

rm dl/hostapd-*.tar.bz2
make package/hostapd/{download,prepare,clean,compile,install} V=s

Once everything is built, I have a simple script that copies over the bin/x86_64/packages/{hostapd*,wpa-s*} files and then runs opkg install on each of the nodes.

To start the mesh, I use the following script:

#!/bin/bash
# derive a mesh IP from the last octet of the address used to reach the internet
pubip=`ip route get 8.8.8.8 | awk 'NR==1 {print $NF}'`
last8=`echo $pubip | awk -F . '{print $4}'`
meship=10.10.1.$last8
iface=wlan0

cat<<__EOM > wpa_s.conf
network={
    ssid="your-meshid-here"
    mode=5
    frequency=2412
    key_mgmt=SAE
    psk="your-pw-here"
}
__EOM
ip addr flush $iface
ip link set $iface down
iw dev $iface set type mp
ip link set $iface up
ip addr add $meship/24 dev $iface

killall wpa_supplicant
wpa_supplicant -dd -i $iface -c wpa_s.conf >wpa_s.log 2>&1 &

In response to the previous blog post, Johannes Berg pointed out that running nfsroot and PXE booting these devices would be even easier than futzing with USB sticks and copying binaries back and forth. Unfortunately, the BIOS on these machines doesn’t appear to support netboot, and at least for now, I can’t be bothered to figure out how to do it from within grub. At any rate, I find this setup makes for a fairly painless compile / deploy / test cycle.

Zotac OpenWRT

ZOTAC! The Zotac NM10-ITX is a mini-ITX motherboard, which in my configuration has a 1.66 GHz Atom D510 on board, an 8 GB SSD, 2 GB of RAM, and a pair of 2×2 ath9k devices. I wound up with a few of these boxes as a mesh testbed due to my work with Cozybit. Until recently, they ran distro11s, which is basically Debian with some mesh/wireless utilities and some custom init scripts thrown on top. I was looking to re-image them anyway, and as far as I know distro11s is no longer actively maintained. The boxes are beefy enough to run unmodified Debian, but OpenWRT has various niceties for building all things wireless from source, and I might like to run the same setup on more constrained devices, so I went with that.

Building OpenWRT from git is a rather simple affair [1]. In my case, I made a few trips through make kernel_menuconfig to get the right config for a properly booting kernel (namely, CONFIG_ATA_PIIX, CONFIG_INPUT_EVDEV, CONFIG_USB_HID, CONFIG_HID_GENERIC, and CONFIG_R8169 were needed for functional disk, keyboard, and network on boot). I also customized the network setup via files/etc/config/network so that eth0 would come up with DHCP rather than a fixed IP at 192.168.1.1.

Once the image is built, one needs a way to copy it onto the drive. Enter Bootable USB Stick.

I recently procured a speedy, spacious USB stick in order to run a sizeable VM on my disk-space-poor laptop. As I already had an ext4 partition on the USB stick, making it into a bootable rescue/installer OS was mostly a matter of debootstrap [2]. I gave it a user account, put my ssh keys on it, and also configured it to start up with DHCP on the first interface. Among other things, that means linking /etc/udev/rules.d/70-persistent-net.rules to /dev/null so that, as the rescue OS is booted on multiple machines, the first NIC remains named eth0.

On top of the base install, I copied the OpenWRT combined disk image onto the stick. Imaging a new machine then involves booting off the USB drive, overwriting the main block device with the disk image, and then resizing the root partition to use the full drive:

#!/bin/bash
# write the image over the whole target disk
gzip -dc /home/bob/openwrt-x86_64-combined-ext4.img.gz > /dev/sda
# note where partition 2 currently starts
p2start=`sfdisk -d /dev/sda | grep sda2 | awk '{print $4}' | sed -e "s/,//"`
# delete partition 2 and recreate it at the same offset, extending to the end of the disk
cat<<__EOM__ | fdisk /dev/sda
d
2
n
p
2
$p2start

w
__EOM__
resize2fs /dev/sda2
# may customize root image here for each device by mounting it, etc.

You can build disk images that match your disk size and skip repartitioning and resize2fs, but in the case that the disk is large (like mine), zero-filling all the unused space is a big waste of time. I'm sure there is a smart way to use sparse images to overcome this, but fdisk/resize2fs is the simplest thing that works for me.

Since I have my local DHCP server set up to assign known addresses and DNS names to these machines based on their MAC address, I can do the installs without a console on each machine: plug in the stick, reboot, ssh into it when it comes up, run my imaging script, shut down and pull the stick. Easy!
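
For reference, that boils down to dnsmasq entries along these lines (the MAC addresses, names, and addresses here are made up):

dhcp-host=00:11:22:33:44:55,mesh-node1,192.168.1.201
dhcp-host=00:11:22:33:44:66,mesh-node2,192.168.1.202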

Tablet guts

In case you want to know what the inside of a Samsung Galaxy Tab 3.0 looks like, I propose the following experiment: unplug one, and let it sit unused and un-powered for two months. You will then find it in the state where disconnecting and reconnecting the battery is required to get it to charge again. I am not making this up. Oh, and opening it is pretty tricky even with the proper plastic tools to do so. Since I have two of them, that’s how I lost thirty minutes of my life on Saturday.

As for why I have two: this particular model has a Marvell SoC inside, and the wireless SD8787 peripheral can be used with the upstream cfg80211-based mwifiex driver. For cozybit, I helped write an alternative mac80211 driver that can run mesh. It was a little slower and more power hungry than mwifiex, but in addition to being mesh/ibss/AP capable, it had some nice-for-development features like SDIO tracing and an nl80211-testmode interface that could run firmware commands from userspace, upon which we built some test scaffolding. You can get the code for that driver today, but development is pretty much at a dead end because we needed to extend the firmware for host-based operation, and there will probably never be a redistributable firmware at this point.

I’d love to have an open firmware for the device, but as I have seen and touched the NDA encumbered firmware, it’s unlikely that I can have any hand in bringing that about.

++router

As a sometime wireless hacker, it’s a bit embarrassing to admit that I’ve had the factory firmware on my wifi router all this time, but when I first tried OpenWrt on it, ath9k was only a few months old and dropped connections all the time. Thus, I made do with the factory install but ran many of the essential services (dns, dhcp, tftp, etc) from my Linux workstation. And life continued apace.

After a recent network upgrade, I found I could no longer make my router understand ipv6, and so it was time to put the original firmware out to pasture. In the intervening years, ath9k grew up, so I gave OpenWrt another try. The install took about 20 minutes, most of which was spent configuring the firewall and converting my existing dnsmasq config into a uci-friendly format. Everything works great and my ipv6 is back. Nice job, all involved!

I suppose I could now eat even more dogfood by running a mesh interface on one of the radios. In the past, I’ve tinkered with mesh as a wireless distribution system, but I don’t have much of a use for that currently with every room in the new place being wired. Perhaps my backyard could use expanded coverage?

vim cheat codes

Or, how I read parts of the fine manual.

Yesterday, after spending way too much time trying to get find(1) to exclude vim swapfiles, I finally had it with them cluttering up my work directories. As is usually the case when vim triggers an itch, I thought, “there must be a setting for that,” and lo, there was.

set directory=~/.vimswap//

Create that directory, and all the swap files go there instead. The trailing double slashes mean the swap files get named in such a way as to avoid conflicts.

Here are some other things added to my vimrc over the last year or so.

Show whitespace issues as spelling mistakes:

match SpellBad /\s\+$\| \+\ze\t/

I used to always set my windows to 80 columns, but then I started using ‘set number’, and all of that went out the window, so to speak. So now I do this to show where wrapping needs to happen:

set colorcolumn=80

This hack is kind of neat: it shows +/- change markers in the gutter, based on git changes in your working copy:

https://github.com/airblade/vim-gitgutter

(To make it less intrusive, I added highlight clear SignColumn.)

Once upon a time, I had a really complicated macro to search up the directory hierarchy looking for tags files. It turns out vim already does that if you add a semicolon in there (:help file-searching):

set tags=./tags;

Phone bugs

If you want to experience what it is like to be the first person to ever do something, might I suggest turning on all the debug knobs on an Android vendor kernel? Some speculative fixes over here, but there is still at least one other RCU bug on boot, and this use-after-free in the usb controller driver:

[   34.520080] msm_hsic_host msm_hsic_host: remove, state 1
[   34.525329] usb usb1: USB disconnect, device number 1
[   34.529602] usb 1-1: USB disconnect, device number 2
[   34.637023] msm_hsic_host msm_hsic_host: USB bus 1 deregistered
[   34.668945] Unable to handle kernel paging request at virtual address aaaaaae6
[   34.675201] pgd = c0004000
[   34.678497] [aaaaaae6] *pgd=00000000
[   34.681762] Internal error: Oops: 5 [#1] PREEMPT SMP ARM
[   34.686737] Modules linked in: wcn36xx_msm(O) wcn36xx(O) mac80211(O) cfg80211(O) compat(O)
[   34.694976] CPU: 1    Tainted: G        W  O  (3.4.0-g4a73a1d-00005-g311eaee-dirty #2)
[   34.702972] PC is at __gpio_get_value+0x28/0x1cc
[   34.707489] LR is at do_restart+0x24/0xd8
[   34.711486] pc : []    lr : []    psr: 60000013
[   34.711486] sp : ebcd9ed8  ip : c032a858  fp : ebcd9ef4
[   34.722930] r10: 00000000  r9 : c04dd67c  r8 : 00000000
[   34.728210] r7 : c4d81f00  r6 : 6b6b6b6b  r5 : c10099cc  r4 : aaaaaaaa
[   34.734680] r3 : 09090904  r2 : c1672f80  r1 : 00000010  r0 : 6b6b6b6b
                                                                 ^^^^^^^^ whoops (0x6b6b6b6b is slab poison, i.e. freed memory)
[   34.741241] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
[   34.748504] Control: 10c5787d  Table: acfb006a  DAC: 00000015

I guess I should try to find out where to report bugs for these (predictably) not upstream drivers, but that seems like a total pain.

Footprint, Part 2

My recent posting of ASCII art was intentionally subtle, perhaps overly so. If you haven’t figured it out by now, it is a C program that announces the birth of our second child. When compiled and executed, it prints:

Ian Yit-Sing Copeland
Born 5 August, 2013 at 02:49
Weight 6 lbs 11 oz

In this post, I will explain how I created it.

Like many a C practitioner, I’ve always found International Obfuscated C Code Contest entries to be at once entertaining and mystifying. While I don’t consider this effort to be near the level of IOCCC entries, I thought it might be fun to try my own IOCCC-lite ASCII art program.

I knew I wanted the program to simply print an announcement string, in a non-obvious fashion. It also had to be easy to modify the program with the actual details on the day of the delivery. Indeed, I wrote this program several weeks in advance and modified the output in only a few minutes.

Thanks to the Can You Crack It challenge, I already had an RC4-like algorithm sitting on my disk. The plan then was simple: embed the key and ciphertext in the program, and just decrypt and print it. Of course, the ciphertext would be all binary, which is hard to encode in a compact manner in the source. Thus, I decided to store the ciphertext in base32 encoding.

I actually didn’t put much effort into obfuscating the code; the main issue was getting the source down to around 600 characters, and doing that already involved using typedefs and one-letter variable names. By the time that was done, I didn’t change much. Apart from variable naming, the main obfuscation step was to change the base32 mapping table to consist only of numbers and symbols, so that the embedded string would itself look like code. The code is otherwise pretty straightforward.

My starting point, which base32-decoded and decrypted some placeholder text, looked like this:

#include <string.h>
#include <stdio.h>

char str[] = "43(:7&!#3&@>$%^|]:&6;<7*-$}9{;!*!$5{<;-=^8]#:5<@#%#&!5!@40207#9($3&)7<$1";

int b32dec(char *in, char *out, int len)
{
    int i;
    char alpha[] = "3:5[9&2^]{7}*<-8@=4(6#!>|)0+;1$%";
    for (i=0; i < len; i++)
    {
        unsigned char bits = strchr(alpha, in[i])-alpha;
        int j = (i / 8) * 5;
        switch (i % 8) {
            case 0:
                out[j + 0] |= bits << 3;
                break;
            case 1:
                out[j + 0] |= bits >> 2;
                out[j + 1] |= bits << 6;
                break;
            case 2:
                out[j + 1] |= bits << 1;
                break;
            case 3:
                out[j + 1] |= bits >> 4;
                out[j + 2] |= bits << 4;
                break;
            case 4:
                out[j + 2] |= bits >> 1;
                out[j + 3] |= bits << 7;
                break;
            case 5:
                out[j + 3] |= bits << 2;
                break;
            case 6:
                out[j + 3] |= bits >> 3;
                out[j + 4] |= bits << 5;
                break;
            case 7:
                out[j + 4] |= bits;
                break;
        }
    }
    return i/8 * 5;
}


int keysched(unsigned char *x, unsigned char *key, int keylen)
{
    int i;
    unsigned char tmp, a = 0;

    for (i=0; i < 256; i++)
        x[i] = i;

    for (i=0; i < 256; i++)
    {
        a += x[i] + key[i % keylen];
        tmp = x[i];
        x[i] = x[a];
        x[a] = tmp;
    }
}

int crypt(unsigned char *x, unsigned char *y, int len)
{
    unsigned char a;
    unsigned char b = 0;
    unsigned char tmp;
    int i;

    for (i=0; i < len; i++)
    {
        a = i+1;
        b += x[a];
        tmp = x[a];
        x[a] = x[b];
        x[b] = tmp;
        y[i] ^= x[(x[a] + x[b]) & 0xff];
    }
}

int main()
{
    unsigned char x[256];
    unsigned char key[] = "abcd";
    unsigned char crypt_text[sizeof(str)] = {};
    int len;

    len = b32dec(str, crypt_text, strlen(str));
    keysched(x, key, strlen(key));
    crypt(x, crypt_text, len);
    printf("%s\n", crypt_text);
}

Micro-optimizing for source code size is unusual, and in some ways the reverse of optimizing generated code. For example, this change saved a couple of characters by doing the opposite of the usual strength reduction, replacing shifts with multiplication and division:

-            *o++ = d[0] << 3 | d[1] >> 2;
+            *o++ = d[0] * 8 | d[1] / 4;

Similarly, I found that combining unrelated functions was useful in eliminating the character waste of function definitions. There are likely a good deal more space-savers to be found; I quit when I got it small enough.

I experimented with a few different formats for the code. It turns out that I'm not terribly good at drawing ASCII art. I abandoned a baby bottle as unrecognizable and went with a footprint after seeing this motif on some birth announcements. I hand-drew it, scanned it in, loaded it as a background in an HTML document, and then put dollar signs in the right places. Yes, cheating, but I am no artist.

                                          $$$$
                                        $$$$$$$$
                              $$$      $$$$$$$$$
                        $$   $$$$$    $$$$$$$$$
                       $$$$   $$$     $$$$$$$$$
                       $$$$             $$$$$$
                  $$   $$
                 $$$$            $$$$$$
             $  $$$$$       $$$$$$$$$$$$$
           $$$$  $$$$    $$$$$$$$$$$$$$$$$
           $$$$       $$$$$$$$$$$$$$$$$$$$$
           $$$$     $$$$$$$$$$$$$$$$$$$$$$$
            $$    $$$$$$$$$$$$$$$$$$$$$$$$$
                $$$$$$$$$$$$$$$$$$$$$$$$$$$
               $$$$$$$$$$$$$$$$$$$$$$$$$$$
              $$$$$$$$$$$$$$$$$$$$$$$$$$$
              $$$$$$$$$$$$$$$$$$$$$$$$$
              $$$$$$$$$$$$$$$$$$$$$$$$
              $$$$$$$$$$$$$$$$$$$$$$
              $$$$$$$$$$$$$$$$$$$$$
              $$$$$$$$$$$$$$$$$$$$
              $$$$$$$$$$$$$$$$$$$
              $$$$$$$$$$$$$$$$$$
              $$$$$$$$$$$$$$$$$
              $$$$$$$$$$$$$$$$$
              $$$$$$$$$$$$$$$$$
              $$$$$$$$$$$$$$$$
              $$$$$$$$$$$$$$$$
              $$$$$$$$$$$$$$$$
              $$$$$$$$$$$$$$$
               $$$$$$$$$$$$$
               $$$$$$$$$$$$
                 $$$$$$$$$
                    $$$$

I wrote a python script to encrypt and encode the string to be embedded in the code. A second script tokenized the original C source and replaced dollar signs from the text template with code. The latter script is not perfect, but worked well enough that I could hand-edit the output in a few seconds to get something reasonable.
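
The encoding side is not shown above. As a rough illustration (written in C here rather than the Python actually used, and assuming the same alphabet), encoding is just the inverse of b32dec; since the RC4-like crypt() is symmetric, the same keysched() and crypt() produce the ciphertext before it is encoded:

#include <stdio.h>
#include <string.h>

/* illustrative only: a base32 encoder over the same alphabet as b32dec */
int b32enc(unsigned char *in, char *out, int len)
{
    char alpha[] = "3:5[9&2^]{7}*<-8@=4(6#!>|)0+;1$%";
    int i, n = 0;

    for (i = 0; i < len * 8; i += 5) {
        int byte = i / 8, off = i % 8;
        /* take a 16-bit window and pull out the 5 bits starting at 'off' */
        unsigned int w = in[byte] << 8;
        if (byte + 1 < len)
            w |= in[byte + 1];
        out[n++] = alpha[(w >> (11 - off)) & 0x1f];
    }
    out[n] = '\0';
    return n;
}

int main()
{
    char buf[64];
    unsigned char msg[] = "hello";  /* stand-in for the encrypted bytes */

    b32enc(msg, buf, strlen((char *)msg));
    printf("%s\n", buf);
    return 0;
}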

The gory details are available at github.

Finally, on delivery day, I discovered that WordPress mangled C code included in posts. So I simply took a screenshot of the output and posted that instead, with a link to the original text. A picture of the newborn was hidden as a hyperlink from the footprint image.

Overall, the entire process took a couple of evenings from concept to completion, and I'm quite happy with the way it turned out. And, of course, I'm ecstatic about the inspiration: our newly arrived baby boy. We are very fortunate to have both Alex and now Ian in our lives, and silly C programs and fake man pages cannot come close to capturing our joy.

It’s faster because it’s async!

Some Javascript developers, when presented with an API, think, “I know, I’ll make it async!” Now their users have a giant mess of a state machine implemented with callbacks and recursion, in a GC runtime that doesn’t eliminate tail calls, and two orders of magnitude worse performance. Normally, I wouldn’t care, because you get what you ask for when you write something in Javascript, but this time it happened to a project that I occasionally maintain for $work. And so I was sad.

So, far from an expert in the ways of JS, I looked for a way out of this mire, and stumbled across task.js, which is what the kids are doing these days. It is a neat little hack, although it only works if your JS engine supports generators (most do not). Still, it seemed like a reasonable approach for my problem, so I decided to understand how task.js works by deriving my own.

I should note that if one’s JS engine doesn’t have generators, one can kind of emulate them, in the same sense that one can emulate goto with switch statements. Consider this simple example:


var square_gen = function() {
    var obj = {
        state: 0,

        next: function() {
            var ret = this.state * this.state;
            this.state++;
            return ret;
        },

        send: function(v) {
            this.state = v;
            return this.next();
        },
    };
    return obj;
};

var gen = square_gen();
for (var i=0; i < 4; i++) {
    console.log("The next square is: " + gen.next());
}
/*
Outputs:
The next square is: 0
The next square is: 1
The next square is: 4
The next square is: 9
*/

In other words, square_gen() returns an object that has a next() method which returns the next value, based on some internal state. The next() method could implement a full-fledged state machine instead of just incrementing a simple variable.

Certainly this is less tidy than the equivalent with language support:


var square_gen = function*() {
    var state = 0;
    while (1) {
        yield state * state;
        state++;
    }
};

(I'll assume the existence of generators in examples henceforth.)

The send() function is interesting -- it allows you to set the state from which the generator continues.


gen.send(16); /* => 256 */

The key idea is that generators will only do one iteration of a possibly large computation. When yield is reached, the function exits, and the computation is started up again after the yield statement when the caller performs another next() or a send().

Now, suppose you need to do something that takes a while, without waiting, like fetching a resource over the network. When the resource is available, you need to continue with that data. In the meantime you should try to do something else, like allow the UI to render itself.

The normal way one does this is by writing one's code in the (horrible) continuation-passing style, which is fancy talk for passing callbacks to all functions. The task.js way is better: write functions that can block (generators) and switch to other tasks when they do block. No, I'm not making this up -- it is normal for a program in JS land to include a half-baked cooperative multitasker.

You can turn callbacks into promises, as in "I promise to give you value X when Y is done." Promise objects are getting baked into various runtimes, but rolling your own is also easy:


function MyPromise() {
    this.todo = function(){};
}

MyPromise.prototype.then = function(f) {
    this.todo = f;
};

MyPromise.prototype.resolve = function(value) {
    this.todo(value);
};

That is: register a callback with then(), to be called when a value is known. The value is set by calling resolve().

Now, suppose you have an async library function that takes a callback:


function example(call_when_done) {
    /* do something asynchronously */
    call_when_done(value);
}

Instead of calling it like this:


    example(function(value) { /* my continuation */ });

...you can give it a callback that resolves a promise, which, when resolved, calls your continuation:


    var p = new MyPromise();
    example(function(value) { p.resolve(value); });
    p.then(function(value) { /* my continuation */ });

This is just obfuscation so far. The usefulness comes when you rewrite your functions to return promises, so that:


    function my_fn(params, callback) { /* pyramid of doom */ }

becomes:


    function my_fn(params) { /* ... */ return promise; }

Now you can have generators yield promises, which will make them exit until another next() or send() call. And then you can arrange to call send() with the promise's value when it is resolved, which gives you blocking functionality.


    /* An async function that returns a promise */
    function async() {
        var p = new MyPromise();
        example(function(v) { p.resolve(v); });
        return p;
    }

    /* A generator which blocks */
    var gen = function*() {

        /* does some async call and receives a promise */
        var p = async();

        /* blocks until async call is done */
        var v = yield p;

        /* now free to do something with v... */
    }();

    /*
     * Run one iteration of the generator.  If the
     * generator yields a promise, set up another
     * iteration when that promise is resolved.
     */
    function iterate(gen, state) {
        var p = gen.send(state);
        p.then(function(v) {
            iterate(gen, v);
        });
    }

    /* Start a task */
    var p = gen.next();
    p.then(function(v) {
        iterate(gen, v);
    });

    /*
     * You can do something else here, like start another
     * generator.  However, control needs to eventually
     * return to runtime main loop so that async() can
     * make progress.
     */

A lot of details are omitted here, like StopIteration and error handling, but that's the gist. Task.js generalizes a lot of this and has a decent interface, so just use that.

In summary, Javascript is terrible (only slightly less so without c-p-s), and I'm glad I only have to use it occasionally.