New Directions in Commandline Editing

Update: I finally isolated this to gnome’s multiple input method support. Shift + Space is bound by default to switch input sources, and it is in whatever the next input method is that unusual things happen. Turn that stuff off in the keyboard shortcuts! Find below the original post about the visible symptoms.

Dear lazyweb,

Where in {bash, readline, gnome terminal, wayland, kernel input} does the following brain damage, newly landed in debian testing, originate?

  • pressing ctrl-u no longer erases to beginning of line, but instead starts unicode entry with a U+ prompt, much like shift-ctrl-u used to do, except the U is upper-cased now
  • hitting slash twice rapidly underlines the first slash (as if to indicate some kind of special entry prompt, but one I’m unfamiliar with and I cannot get it to do anything useful anyhow), and then just eats the second slash and the underline goes away
  • characters typed quickly (mostly spaces) get dropped

This stuff is so hard to google for and it is killing my touch typing. Email me if you know, and I’ll update this accordingly so that maybe google will know next time.

Encrypted mesh PSA

I’m a bit late writing this up, as lately wifi stuff has taken a back seat to the day job. But, I’ve now seen the following issue reported more than once, so hopefully this post will save the other two mesh users some grief.

If you are running an encrypted mesh on wpa_supplicant or (ye gods) authsae, take note:

  • Encrypted mesh on kernel 4.8 will not interoperate with kernel < 4.8
  • wpa_supplicant 2.6 will not interoperate with 2.5
  • authsae as of commit 706a2cf will not interoperate with that of commit 813ec0e [1]
  • wpa_supplicant 2.6 and authsae commit 706a2cf require additional configuration to work with kernel < 4.8 (see below).

What is behind all of that? A while ago I noticed and mentioned to another networking developer that the 802.11 standard calls for certain group-addressed management frames in mesh networks to be encrypted with the mesh group key (MGTK). However, the implementation in the kernel only integrity-protected these frames, using the integrity group key (IGTK) [2]. I don’t happen to know the history here: it may be that this was something that was changed between the draft and the standard, but anyway that was the state until Masashi Honma fixed it, with patch 46f6b06050b that landed in kernel 4.8.

Meanwhile, Jouni Malinen fixed a bunch of related issues in the wpa_supplicant implementation, namely that IGTK was not being generated properly and instead the MGTK was used for that. I made the same kind of fixes for authsae.

Yes, it’s a multiple(!) flag day and we suck for that.

Now, on to how to fix your mesh.

Old mesh daemon + old kernel everywhere should still work.
Old mesh daemon + new kernel everywhere should still work.
New mesh daemon + new kernel everywhere should still work.

New mesh daemon + pre-4.8 kernel: you will most likely see that peering works but no data goes across because HWMP frames are dropped at the peer. The solution here is to enable PMF in the mesh daemon:

  • add “pmf=2” or “ieee80211w=2” to the relevant section of wpa_supplicant.conf
  • add “pmf=1;” to the meshd section of authsae.cfg

If you for some reason find yourself in the position of needing to run a mix of old/new wpa_s and/or old/new kernel, here are my (completely untested) suggestions:

  • patch the kernel to accept PMF-protected frames as well as encrypted frames (this allows older kernels to work with newer ones)
  • patch the new mesh daemon (wpa_supplicant or authsae) to copy the MGTK into the IGTK instead of generating a separate key (this allows older mesh daemon to work with the newer one since older daemon will use MGTK for IGTK).

Both of these will reduce your security and don’t follow the spec, so there’s little chance such changes will go upstream.

[1] Prior to 706a2cf, authsae doesn’t even interoperate with wpa_supplicant due to not filling in all of the elements in peering frames. My advice would be to switch to wpa_supplicant, which has things like a maintainer and a test suite.
[2] On encryption vs integrity protection: the latter is similar to PGP-signing of emails: it makes tampering evident but does not hide the content of the message itself.

wherein I did a git push

As I wrote somewhere in the Metaverse, I’m running the wireless-testing tree now, which is about as minimal a contribution as one could make. It essentially means running a script once daily; cron could do it all, if not for the part where I enter my ssh passphrase to actually push out the updates. For the curious, here’s the script.

One of the killer features of git used here is git rerere, which remembers conflict resolutions and reapplies them when reencountered. A human (me) still needs to resolve the conflict once, but after that it’s pretty much on autopilot for the rest of that kernel series. Normally one should reinspect the automatic merge made by git-rerere, but at least for this testing tree, throwing caution to the wind is warranted and thus the script has some kludges around that.

Every pushed tag had a successful build test with whatever (not claimed to be exhaustive) kernel config I am using locally. I’ve considered some more extensive tests here, such as running the hostapd test suite (takes an hour with 8 VMs on my hardware) or boot testing on my mesh APs (requires a giant stack of out-of-tree patches), but, given the parentheticals, I’m keeping it simple for now.

Observant persons may have noted some missing dates in the tags. Generally, this means either nothing actually changed in the trees I’m pulling, or it is during the merge window when things are too in flux to be useful. Or maybe I forgot.

Weekend updates

More updates to speccy pushed to github this weekend. It gained the ability to toggle on-and-off line and scatter plots. Also, it now takes care of enabling spectral scan and then scanning channels (by exec()-ing iw, naughty me) so you can just tell it the device and it does the rest. Code is still ugly but gradually taking shape. There’s an animated gif for those that want to see what it currently looks like in operation.

I settled on (somewhat inefficiently) smoothing the line graph with a centered moving average using a 5-sample window length, which met the strict theoretical basis of “it looked kind of right.” I did not yet incorporate the time-based plot I was playing with.

It could certainly be faster. Profiling shows that the vast majority of time is spent drawing 9-pixel rectangles in Cairo. I suppose that even if Cairo was using an accelerated backend, which I don’t believe it is in my case, that this would be something more efficiently done by rendering to a client pixmap anyway. Lots of low-hanging fruit everywhere, but I’m resisting the urge to really optimize anything at this early stage.

Nicer serialAlso yesterday I finally cleaned up the somewhat messy serial port installation on one of my ath10k routers with the help of a drill. These cheap cables work well but do require a bit of attention with the glue gun — I find the plastic case around the USB port / converter board is always coming un-snapped and falling off otherwise.

time vs frequency

This is roughly what I had in mind for the time/frequency plot for my little spectrum visualizer. Time is on the y-axis and frequency on the x-axis. There are some interesting artifacts here, especially on the higher channels. Probably bugs in my implementation, but we’ll see as I develop it further.

Speccy with labels

I messed around with “speccy” a bit today to see what I could do with it to make it look slightly nicer. It really needed a grid so one can more easily see the frequencies and power levels. So now it looks like this:

(This is with iperf running on channel 6 to a nearby AP).

I have in mind a couple of other visualizations. An obvious one is to show the overall max or average power level rather than a scatter plot. This would be a cleaner display, although there could be co-channel interference sources you wouldn’t see that way. Here’s a first cut at that:

I am doing some averaging for multiple (~8) samples with the same subcarrier frequency, but clearly more smoothing is needed. Also I notice that I occasionally get some artifact where all the subcarriers have the exact same power level. Not sure what that is, but it wants filtering.

Another interesting visualization would be a scrolling chart of frequency vs time, with power level indicated by pixel intensity. Then you could get an idea of historical changes in the medium. (Oh hey, the neighbors are watching Netflix now.)

The code is still an ugly pile of hacks, which it will probably continue to be until I decide just what kind of processing needs doing on the raw samples. Sorry for that.

Zotac OpenWRT

ZOTAC! The Zotac NM10-ITX is a mini-ITX motherboard, which in my configuration has a 1.66 GHz Atom D510 on board, 8 GB SSD, 2 GB ram, and a pair of 2×2 ath9k devices. I wound up with a few of these boxes as a mesh testbed due to my work with Cozybit. Until recently, they ran distro11s, which is basically Debian with some mesh/wireless utilities and some custom init scripts thrown on top. I was looking to re-image them, and I believe distro11s is not actively maintained. The boxes are beefy enough to run unmodified Debian, but OpenWRT has various niceties when building all things wireless from source, and I might like to run the same setup on more constrained devices, so I went with that.

Building OpenWRT from git is a rather simple affair [1]. In my case, I made a few trips through make kernel_menuconfig to get the right config for a properly booting kernel (namely, CONFIG_ATA_PIIX, CONFIG_INPUT_EVDEV, CONFIG_USB_HID, CONFIG_HID_GENERIC, and CONFIG_R8169 were needed for functional disk, keyboard, and network on boot). I also customized the network setup via files/etc/config/network so that eth0 would come up with DHCP rather than a fixed IP at 192.168.1.1.

Once built, one needs a way to copy the OpenWRT image onto the drive. Enter Bootable USB Stick.

I recently procured a speedy, spacious USB stick in order to run a sizeable VM on my disk-space-poor laptop. As I already had an ext4 partition on the USB stick, making it into a bootable rescue/installer OS was mostly a matter of debootstrap [2]. I gave it a user account, put my ssh keys on it, and also configured it to start up with DHCP on the first interface. Among other things, that means linking /etc/udev/rules.d/70-persistent-net.rules to /dev/null so that, as the rescue OS is booted on multiple machines, the first NIC remains named eth0.

On top of the base install, I copied the OpenWRT combined disk image onto the stick. Imaging a new machine then involves booting off the USB drive, overwriting the main block device with the disk image, and then resizing the root partition to use the full drive:

#!/bin/bash
gzip -dc /home/bob/openwrt-x86_64-combined-ext4.img.gz > /dev/sda
p2start=`sfdisk -d /dev/sda | grep sda2 | awk '{print $4}' | sed -e "s/,//"`
cat<<__EOM__ | fdisk /dev/sda
d
2
n
p
2
$p2start

w
__EOM__
resize2fs /dev/sda2
# may customize root image here for each device by mounting it, etc.

You can build disk images that match your disk size and skip repartitioning and resize2fs, but in the case that the disk is large (like mine), zero-filling all the unused space is a big waste of time. I'm sure there is a smart way to use sparse images to overcome this, but fdisk/resize2fs is the simplest thing that works for me.

Since I have my local DHCP server set up to assign known addresses and DNS names to these machines based on their MAC address, I can do the installs without a console on each machine: plug in the stick, reboot, ssh into it when it comes up, run my imaging script, shutdown and pull the stick. Easy!

++router

As a sometime wireless hacker, it’s a bit embarrassing to admit that I’ve had the factory firmware on my wifi router all this time, but when I first tried OpenWrt on it, ath9k was only a few months old and dropped connections all the time. Thus, I made do with the factory install but ran many of the essential services (dns, dhcp, tftp, etc) from my Linux workstation. And life continued apace.

After a recent network upgrade, I found I could no longer make my router understand ipv6, and so it was time to put the original firmware to pasture. In the intervening years, ath9k grew up, so I took another try with OpenWrt. The install took about 20 minutes, most of which was configuring the firewall and copying my existing dnsmasq config into a uci-friendly format. Everything works great and my ipv6 is back. Nice job, all involved!

I suppose I could now eat even more dogfood by running a mesh interface on one of the radios. In the past, I’ve tinkered with mesh as a wireless distribution system, but I don’t have much of a use for that currently with every room in the new place being wired. Perhaps my backyard could use expanded coverage?

Fake Wireless Errors

When I did my previous work with mac0211_hwsim, I wrote the channel model in matlab and pre-generated a huge lookup table of frame error rates for different SNR values and transmission rates so that the simulator didn’t have to do any thinking for each packet. Obviously that’s a bit limiting and not in any way upstreamable in something like wmediumd.

So, I sat down today and rewrote it in C to see how bad the computation is. Actually, it’s not awful: I didn’t carefully benchmark it, but it sits at around 30 µsec per calculation, and there is probably a good deal of low hanging fruit such as making factorial() cache its computations, or fitting the output curves with cubics. I stuck the initial code on a wmediumd topic branch over here.

I verified the output matched my matlab code and charted it as above. Careful observers will note the 9 mbps rate is always worse than 12 mbps; this was a finding of D. Qiao and S. Choi in “Goodput enhancement of IEEE 802.11a wireless LAN via link adaptation,” in Proc. IEEE ICC01, from which most of my math was appropriated.

Simulating Wireless

A Study of 802.11 Bitrate Selection in Linux (January 2010).

I didn’t think too much of this paper when I wrote it as a term project in grad school. As an academic paper, it doesn’t really present anything novel. The equations underlying my wireless medium simulation, for example, are wholesale lifted from sources. In the few academic papers still being written on the subject, rate controllers that do not specifically look at collisions are old news (even though Minstrel tends to get loss differentiation implicitly through the magic of probability). Even at the time, looking at non-QoS 802.11 DCF and only 802.11a rates made the whole exercise a bit dated, and the world has definitely moved on in the intervening years. The paper did, however, find a few flaws (or perhaps over-exuberances) in Minstrel’s multi-rate-retry mechanism that may still be unfixed upstream, and many more flaws in PID (one I fixed upstream, but it is still not usable). I wanted to go back and redo the physical experiments before submitting patches to Minstrel, but life intervened.

However, I’ve recently been talking to the good folks at cozybit, who picked up where I left off by creating wmediumd (which does more or less the same thing but in a more polished fashion). There were still some things in my version that wmediumd lacks today, so I’m posting the paper to give it a slightly wider audience. I’d be interested to hear of any glaring flaws in the model or approach. Given time, I’d like to bring those missing features (namely, signal-level-based loss, and optional transmission time simulation) to wmediumd and repeat the experiments there.

As for the fixes to Minstrel, the basic theme is reducing the number of retries to avoid backoff, since at some point it is better to drop packets and send the next batch at a lower rate rather than retrying for tens of ms. This patch (untested) addresses one of the two points I mentioned in the paper. The other fix, to compute the backoff time per-slot, was an über-kludge in my experiments; I’ll have to see if there’s an upstreamable way to do that. Pretty much everyone (even for pre-11n devices) is using Minstrel-HT now, so it would be worthwhile to refresh and see if the issues were carried over there as well.