kernel – Insignificant Bits

October 31, 2016October 31, 2016

Encrypted mesh PSA

I’m a bit late writing this up, as lately wifi stuff has taken a back seat to the day job. But, I’ve now seen the following issue reported more than once, so hopefully this post will save the other two mesh users some grief.

If you are running an encrypted mesh on wpa_supplicant or (ye gods) authsae, take note:

Encrypted mesh on kernel 4.8 will not interoperate with kernel < 4.8
wpa_supplicant 2.6 will not interoperate with 2.5
authsae as of commit 706a2cf will not interoperate with that of commit 813ec0e [1]
wpa_supplicant 2.6 and authsae commit 706a2cf require additional configuration to work with kernel < 4.8 (see below).

What is behind all of that? A while ago I noticed and mentioned to another networking developer that the 802.11 standard calls for certain group-addressed management frames in mesh networks to be encrypted with the mesh group key (MGTK). However, the implementation in the kernel only integrity-protected these frames, using the integrity group key (IGTK) [2]. I don’t happen to know the history here: it may be that this was something that was changed between the draft and the standard, but anyway that was the state until Masashi Honma fixed it, with patch 46f6b06050b that landed in kernel 4.8.

Meanwhile, Jouni Malinen fixed a bunch of related issues in the wpa_supplicant implementation, namely that IGTK was not being generated properly and instead the MGTK was used for that. I made the same kind of fixes for authsae.

Yes, it’s a multiple(!) flag day and we suck for that.

Now, on to how to fix your mesh.

Old mesh daemon + old kernel everywhere should still work.
Old mesh daemon + new kernel everywhere should still work.
New mesh daemon + new kernel everywhere should still work.

New mesh daemon + pre-4.8 kernel: you will most likely see that peering works but no data goes across because HWMP frames are dropped at the peer. The solution here is to enable PMF in the mesh daemon:

add “pmf=2” or “ieee80211w=2” to the relevant section of wpa_supplicant.conf
add “pmf=1;” to the meshd section of authsae.cfg

If you for some reason find yourself in the position of needing to run a mix of old/new wpa_s and/or old/new kernel, here are my (completely untested) suggestions:

patch the kernel to accept PMF-protected frames as well as encrypted frames (this allows older kernels to work with newer ones)
patch the new mesh daemon (wpa_supplicant or authsae) to copy the MGTK into the IGTK instead of generating a separate key (this allows older mesh daemon to work with the newer one since older daemon will use MGTK for IGTK).

Both of these will reduce your security and don’t follow the spec, so there’s little chance such changes will go upstream.

[1] Prior to 706a2cf, authsae doesn’t even interoperate with wpa_supplicant due to not filling in all of the elements in peering frames. My advice would be to switch to wpa_supplicant, which has things like a maintainer and a test suite.
[2] On encryption vs integrity protection: the latter is similar to PGP-signing of emails: it makes tampering evident but does not hide the content of the message itself.

March 15, 2016March 15, 2016

memory Âµ-optimizing

While looking at some kmalloc statistics for my little corner of wireless, I found a new tool for the toolbox: perf kmem. To test it out, I ran a simulated connected 25-node mesh in mac80211_hwsim.

Here’s what perf kmem stat --caller had to say:

---------------------------------------------------------------------
 Callsite            | Total_alloc/Per | Total_req/Per  | Hit   | ...
---------------------------------------------------------------------
 mesh_table_alloc+3e |      1600/32    |       400/8    |    50 |
 mesh_rmc_init+23    |    204800/8192  |    102600/4104 |    25 |
 mesh_path_new.const |    315904/512   |    241864/392  |   617 |
[...]

The number of mesh nodes shows up in the ‘Hit’ column in various ways: we can pretty easily see that mesh_table_alloc is called twice per mesh device, mesh_rmc_init is called once, and mesh_path_new is called at least once for every peer (617 ~= 25 * 24). We might scan the hit column to see if there are any cases in our code that perform unexpectedly high numbers of allocations, signifying a bug or poor design.

The first two columns are also interesting because they show potentially wasteful allocations. Taking mesh_table_alloc as an example:

(gdb) l *mesh_table_alloc+0x3e
0x749fe is in mesh_table_alloc (net/mac80211/mesh_pathtbl.c:66).
61		newtbl = kmalloc(sizeof(struct mesh_table), GFP_ATOMIC);
62		if (!newtbl)
63			return NULL;
64
65		newtbl->known_gates = kzalloc(sizeof(struct hlist_head), GFP_ATOMIC);
66		if (!newtbl->known_gates) {
67			kfree(newtbl);
68			return NULL;
69		}
70		INIT_HLIST_HEAD(newtbl->known_gates);

We are allocating the size of an hlist_head, which is one pointer. This is kind of unusual: typically the list pointer is embedded directly into the structure that holds it (here, struct mesh_table). The reason this was originally done is that the table pointer itself could be switched out using RCU, and a level of indirection for the gates list ensured that the same head pointer is seen by iterators with both the old and new table. With my rhashtable rework, this indirection is no longer necessary.

To track allocations with 8-byte granularity would impose an unacceptable level of overhead, so these 8-byte allocations all get rounded up to 32 bytes (first column). Because we allocated 50 of them (25 devices, two tables each), in all we asked for 400 bytes, and actually got 1600, a waste of 1200 bytes. Forty-eight wasted bytes per device is generally nothing worth worrying about, considering we usually only have one or two devices — but as this kmalloc is no longer needed, we can just drop it and also shed a bit of error-handling code.

In the case of mesh_rmc_init, we see an opportunity for tuning. The requested allocation is the unfortunate size of 4104 bytes, just a bit larger than the page size, 4096. Allocations of a page (so-called order-0 allocations) are much easier to fulfill than the order-1 allocations that actually happen here. There are a few options here: we could remove or move the u32 in this struct that pushes it over the page size, or reduce the number of buckets in the hashtable, or switch the internal lists to hlists in order to get this to fit into 4k, and thus free up a page per device. The latter is the simplest approach, and the one I chose.

I have patches for these and a few other minor things in my local tree and will send them out soon.

January 29, 2016January 29, 2016

wherein I did a git push

As I wrote somewhere in the Metaverse, I’m running the wireless-testing tree now, which is about as minimal a contribution as one could make. It essentially means running a script once daily; cron could do it all, if not for the part where I enter my ssh passphrase to actually push out the updates. For the curious, here’s the script.

One of the killer features of git used here is git rerere, which remembers conflict resolutions and reapplies them when reencountered. A human (me) still needs to resolve the conflict once, but after that it’s pretty much on autopilot for the rest of that kernel series. Normally one should reinspect the automatic merge made by git-rerere, but at least for this testing tree, throwing caution to the wind is warranted and thus the script has some kludges around that.

Every pushed tag had a successful build test with whatever (not claimed to be exhaustive) kernel config I am using locally. I’ve considered some more extensive tests here, such as running the hostapd test suite (takes an hour with 8 VMs on my hardware) or boot testing on my mesh APs (requires a giant stack of out-of-tree patches), but, given the parentheticals, I’m keeping it simple for now.

Observant persons may have noted some missing dates in the tags. Generally, this means either nothing actually changed in the trees I’m pulling, or it is during the merge window when things are too in flux to be useful. Or maybe I forgot.

March 20, 2014

Phone bugs

If you want to experience what it is like to be the first person to ever do something, might I suggest turning on all the debug knobs on an Android vendor kernel? Some speculative fixes over here, but there is still at least one other RCU bug on boot, and this use-after-free in the usb controller driver:

[   34.520080] msm_hsic_host msm_hsic_host: remove, state 1
[   34.525329] usb usb1: USB disconnect, device number 1
[   34.529602] usb 1-1: USB disconnect, device number 2
[   34.637023] msm_hsic_host msm_hsic_host: USB bus 1 deregistered
[   34.668945] Unable to handle kernel paging request at virtual address aaaaaae6
[   34.675201] pgd = c0004000
[   34.678497] [aaaaaae6] *pgd=00000000
[   34.681762] Internal error: Oops: 5 [#1] PREEMPT SMP ARM
[   34.686737] Modules linked in: wcn36xx_msm(O) wcn36xx(O) mac80211(O) cfg80211(O) compat(O)
[   34.694976] CPU: 1    Tainted: G        W  O  (3.4.0-g4a73a1d-00005-g311eaee-dirty #2)
[   34.702972] PC is at __gpio_get_value+0x28/0x1cc
[   34.707489] LR is at do_restart+0x24/0xd8
[   34.711486] pc : []    lr : []    psr: 60000013
[   34.711486] sp : ebcd9ed8  ip : c032a858  fp : ebcd9ef4
[   34.722930] r10: 00000000  r9 : c04dd67c  r8 : 00000000
[   34.728210] r7 : c4d81f00  r6 : 6b6b6b6b  r5 : c10099cc  r4 : aaaaaaaa
[   34.734680] r3 : 09090904  r2 : c1672f80  r1 : 00000010  r0 : 6b6b6b6b
                                                                 ^^^^^^^^ whoops
[   34.741241] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
[   34.748504] Control: 10c5787d  Table: acfb006a  DAC: 00000015

I guess I should try to find out where to report bugs for these (predictably) not upstream drivers, but that seems like a total pain.

July 17, 2010

Kernel updates

During baby naps I actually did some Rio Karma work this week. First, I found and fixed a recent bug in usb storage that makes the Karma not work at all in 2.6.35. Then I finally set up my omfs git tree on kernel.org, with a few patches I wrote almost two years ago, plus a recently found memory leak. I’ll try to get Linus to pull it in 2.6.36. Then, I made a new release of omfsprogs rewritten to use libomfs from the FUSE version of omfs (I also did this work two years ago). Now all is right in the Karma world again; time to ignore it for another 6 months.

Then, on to ath5k. One nagging issue that the driver has always had is that there’s pretty much a total lack of synchronization among the myriad threads. There’s an ad-hoc spin lock here and there, but for example the reset procedure, which reloads hundreds of registers, can be called concurrently with itself and other driver ops, and can be invoked while trying to transmit and receive packets. So one idea we’ve been hashing out on the ML is to serialize reset with a mutex by putting it into process context via a workqueue, and disabling tasklets while it runs. I hacked that together and it’s now available in wireless-testing. Hopefully this will fix some of those weird DMA-into-freed-skb errors that can potentially be caused by untimely modification of descriptor pointers.

Finally: tracing. For the last year or so, Linux has grown its own NIH answer to DTrace in the form of ftrace and tracepoints. The wireless subsystem already has a few tracepoints here and there, and I really like it better than debug printks because they have smaller overhead in the compiled-but-not-enabled case (a branch compared to an out-of-line function call). You can pick and choose individual tracepoints to sample, or combine them with a general tracing facility such as ftrace. I have some patches baking to add various tracepoints to ath5k [1, 2, 3].

One really neat idea that I appropriated from Johannes Berg’s work on iwlwifi is to store entire packets in the tracing ring buffer. Then you can have a trace-cmd plugin to write these out as tcpdump pcap files. This is really nice: you get a packet dump that works with your existing debug infrastructure, plus you have the tracing information so you know what the code is trying to do, at whatever granularity you chose when setting up the trace. Hopefully, this will be a good one-stop debug gathering tool, not just for developers, but for users, as well.

November 1, 2009

Ath5k AP Mode

People seem to keep asking about this, despite there being a quality page on kernel.org on how to run an AP with any mac80211 driver. So for what it’s worth, here’s how my setup works. Note if you are seeing flakiness with certain clients, e.g. it works fine with a computer but not with your cell phone, it is likely there is some bug with the power saving handling. I’m currently working on a few such issues, so it may be fixed soon enough.

To get started, you need the following:

some sort of network uplink, like a wired ethernet port
a kernel with ap mode support for ath5k (2.6.31+) and netfilter (for NAT)
dnsmasq
hostapd

You have two basic options for interfacing the wireless network with your wired network. One is by using a bridge directly to the wired network. Another is to use NAT so the wireless network is on its own subnet. The former is more typical of an embedded device, but I prefer the latter on my home LAN, so that’s what I’ll describe.

Turn off any wireless daemon (such as NetworkManager) while you experiment with hostapd to ensure that nothing else has the device open.

The hostapd.conf is large but self-explanatory. One can get away with a rather small config file if using the defaults. This example sets up an AP on wlan0 with the SSID “hostapd_ath5k”. It has a wlan-facing IP address 192.168.10.1, it’s on channel 11, supports 802.11g, and has the WPA pre-shared key “my_password”. I have also set up EAP, it works but requires making a lot of certs and such, so Google is your friend here.

hostapd.conf:

interface=wlan0
driver=nl80211
ssid=hostapd_ath5k
hw_mode=g
channel=11
auth_algs=3
own_ip_addr=192.168.10.1
wpa=1
wpa_passphrase=my_password
wpa_key_mgmt=WPA-PSK
wpa_pairwise=CCMP

For DHCP and DNS forwarding, I use dnsmasq. This couldn’t be easier. Just put something like the following in /etc/dnsmasq.conf:

   dhcp-range=192.168.10.50,192.168.10.150,12h

This will hand out addresses in the 192.168.10.X subnet. Then I use a small script to enable IP masquerading before launching hostapd (note, it will flush all iptables rules, which may not be what you want, so use with caution).

run.sh:

#!/bin/sh

DEV=wlan0
GW_DEV=eth0

# set IPs
ifconfig $DEV down
ifconfig $DEV up
ifconfig $DEV 192.168.10.1

# setup ip masq
echo 1 > /proc/sys/net/ipv4/ip_forward
iptables -F
iptables -t nat -F
iptables -A FORWARD -i $DEV -j ACCEPT
iptables -A FORWARD -o $DEV -j ACCEPT
iptables -t nat -A POSTROUTING -o $GW_DEV -j MASQUERADE

./hostapd hostapd.conf

Running the script will launch hostapd, and if all goes well, you’ll see it show up in scans from other computers.

If anything goes wrong, make sure:

you’ve started dnsmasq
you have a valid backhaul connection
you aren’t using power save mode on client (iwconfig wlan0 power off — see previous comment about PS bugs)
you haven’t found a bug

August 17, 2009

wl1251 performance

After fixing the remaining ifup bug (as expected, it was easy), I have some initial numbers on the new kernel driver versus the stock vendor driver on the G1:

driver	avg ping ms	netperf mbit/s
tiwlan	65.231	7.53
wl1251	8.565	3.82

So, better on latency, worse on throughput. wl1251 is also quite a lot larger when taking all of cfg80211/mac80211 into account, though I didn’t spend any time trying to tweak the size in the build. Well, at least the code doesn’t make you want to poke your own eyes out.

August 17, 2009

Hang fixed

These sorts of bugs can ruin your weekend, just ask Ange who had to listen to me mope yesterday. Of course, I spotted it right away when looking at it freshly during this morning’s commute. Now, ifconfig wlan0 up; ifconfig wlan0 down; ifconfig wlan0 up still fails with wl1251, but it doesn’t hang and the rest looks tractable.

August 10, 2009

wl1251_sdio merged

The SDIO patches for TI 1251 (Android wifi chipset) are finally merged into wireless-testing, so they should be a lot easier to hack on now. That means the driver should make it into 2.6.32, though at a rather experimental stage. I did fix some crashes on ifup/ifdown since last posting, but there’s always more work to do. Current todo list includes better behavior for non-polling controllers (make the irq have a top-half), tracking down a device hang on reinitialization, pushing the msm_wifi.ko module, and on and on.

But I need to spend spare cycles on ath5k in the near term. John Linville recently remarked that he was sick of seeing bug reports that say “it works fine in madwifi,” and frankly, I agree. There’s little excuse for having a sub-standard driver given that we have had two fully open HALs for almost a year. Of course, that can be laid at my feet as much as anyone’s, so my plan is to install madwifi side-by-side with ath5k and do a lot of performance testing to see where we stand. ANI is the big missing feature; it will be useful to see how madwifi performs with and without it.

In other nerdy news, yesterday I scored a copy of Kernighan and Pike’s The Practice of Programming at the local used book store for $3. I’ve read the first five chapters so far. While I’ve been at this long enough to have already learned the book’s best practices (some the hard way), I really wish it was required reading at many of the places I’ve worked. You could do away with a lot of stupid coding standards documents by instead saying “read tpop, oh and please no studly caps.”

July 19, 2009

Now bionic

Kalle has posted a rebased version of my SDIO patches on top of the latest wireless-testing. I’ve done a little bit of testing with it — the driver starts up and loads fine, but once the interface goes down, it’s busted:

<6>[ 980.884302] wl1251_sdio mmc0:0001:1: firmware: requesting wl1251-fw.bin <6>[ 981.036448] wl1251_sdio mmc0:0001:1: firmware: requesting wl1251-nvs.bin <3>[ 981.045449] init: untracked pid 354 exited <3>[ 981.074408] init: untracked pid 355 exited <7>[ 981.331853] wl1251: 151 tx blocks at 0x3b788, 35 rx blocks at 0x3a780 <7>[ 981.339300] wl1251: firmware booted (Rev 4.0.4.3.5) <7>[ 984.702740] wlan0: direct probe to AP 00:1a:70:da:a9:cd (try 1) <7>[ 984.901621] wlan0: direct probe to AP 00:1a:70:da:a9:cd (try 2) <7>[ 985.101583] wlan0: direct probe to AP 00:1a:70:da:a9:cd (try 3) <7>[ 985.301617] wlan0: direct probe to AP 00:1a:70:da:a9:cd timed out <7>[ 1024.802837] wl1251: down <3>[ 1044.856640] init: untracked pid 358 exited <3>[ 1050.033886] mmc0: Data timeout <3>[ 1050.037701] wl1251: ERROR sdio write failed (-110) <3>[ 1051.045674] mmc0: Data timeout <3>[ 1051.049519] wl1251: ERROR sdio write failed (-110) <3>[ 1052.057673] mmc0: Data timeout <3>[ 1052.061518] wl1251: ERROR sdio write failed (-110)

Exploring this further turned into a rather long yak shaving session because I wanted to get iw on the phone to do more manual tests, and I wanted to link it dynamically against bionic, the Android’s fork of a BSD libc. Here were a few interesting discoveries found along the way:

$ ./iw iw: not found

Right away, I intuited this was a problem finding libnl.so, but I was setting LD_LIBRARY_PATH. So I looked at the bionic linker source to find:

/* TODO: Need to add support for initializing the so search path with * LD_LIBRARY_PATH env variable for non-setuid programs. */

So libnl.so had to go into the hard-coded library search path of /system/lib. That alone didn’t help, however. I eventually remembered that dynamically linked executables include the name of the dynamic linker in ELF headers, and on Android this is called “/system/bin/linker” instead of “/usr/lib/ld.so.1.” After fixing that (-Wl,-dynamic-linker,/system/bin/linker), the program just SEGVed at startup, so I added Android linker scripts and other random crap to the linker command line (much of it furnished by the very cool agcc wrapper script).

# ./iw Usage: ./iw [options] command Options: --debug enable netlink debugging --version show version (0.9.14-2-g5286851) Commands: help

Almost there, but no available commands? A glance at the iw source revealed that iw sticks all that stuff into an ELF section which mostly disappears when you link with –gc-sections. So with that exorcised from my linker command line, I finally have a functioning Android iw that is dynamically linked against bionic libc and my own cross-compiled libnl.

One wonders if the pain of bionic is worth its benefits, unless a benefit was curing ennui: “I’m bored, let’s write our own libc!” Anyway, I should be able to produce the various wireless tools natively compiled for Android in short order now. After that I’ll pop my Android TODO stack to the task, “make it easier to build custom images in my build environment.” I’d like to customize the filesystem images to have all of these utilities and relevant drivers in the normal places rather than hanging out in weird locations on the sdcard.