Fake SNRs

With the per-link signal levels that I added over the weekend, my wmediumd fork leveled up from “mere curiosity” to “potentially useful someday.” For mesh, this means you can use signal levels to influence how HWMP creates the mesh paths.

For example, I was able to validate this fix by setting up a virtual 4-node mesh with a bad path and a good path. With the patch reverted, the bad path was almost always selected due to its PREQs being received at the target first. [In actuality, this test will exhibit frequent path swapping because the order in which the PREQs are received is essentially random, a finding in “Experimental evaluation of two open source solutions for wireless mesh routing at layer two” by Garroppo et al. Wmediumd doesn’t show this yet because frames are mostly received and queued in-order. At the time of the patch, I validated it in an actual 15-node mesh.]

There are still a couple of things that would be nice to have here. Today, we decide whether a multicast frame is received based on the signal level from the transmitter to the receiver and the multicast rate. However, this means that with a low multicast rate, there is basically zero frame loss. In real life, loss happens much more frequently, so we cannot test the effects of lost path request frames in wmediumd, which is the subject of at least one pending HWMP patch. Another problem is that the current approach works only with static setups; we might be interested in what happens with mobile nodes, for example. For that we’d need to be able to change the signal level periodically; how to easily specify that is a bit of a question mark.

tmux + wmediumd

Wmediumd gained the ability to do a simple contention simulation a while ago. It turned out to be a small change to the existing code: just ensure that any new frames are scheduled after any other queued frames of equal or higher priority from any other station.
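The scheduling rule can be sketched roughly as follows. This is a minimal Python illustration, not the wmediumd code (which is C): the `Frame` and `Medium` types, the priority constants, and the microsecond fields are all hypothetical names made up for the example. It also assumes that a station's own queued frames are treated the same way as frames from other stations.

```python
from dataclasses import dataclass, field

# Hypothetical priority levels for this sketch: lower number = higher priority.
PRIO_MGMT = 0
PRIO_DATA = 1


@dataclass
class Frame:
    sender: str
    priority: int
    duration_us: int
    deliver_at_us: int = 0


@dataclass
class Medium:
    queued: list = field(default_factory=list)

    def schedule(self, frame, now_us):
        """Schedule frame after all queued frames of equal or higher priority."""
        start = now_us
        for other in self.queued:
            # Assumption: frames from every station count, including our own.
            if other.priority <= frame.priority:
                start = max(start, other.deliver_at_us)
        frame.deliver_at_us = start + frame.duration_us
        self.queued.append(frame)
        return frame.deliver_at_us


medium = Medium()
first = medium.schedule(Frame("sta1", PRIO_DATA, 100), now_us=0)
second = medium.schedule(Frame("sta2", PRIO_DATA, 50), now_us=0)
mgmt = medium.schedule(Frame("sta3", PRIO_MGMT, 20), now_us=0)
```

Here the second data frame waits for the first one even though it comes from a different station, while the management frame is not held back by lower-priority data frames. A fuller model would also push back the data frames that the management frame preempts; this sketch leaves that out.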

Assuming the simulation is accurate, we might use this to gather some information about different kinds of wireless network topologies. For example: what is the throughput and latency like for a mesh network, as a function of hops?

The one sticking point is that it’s a bit of a pain to set up a bunch of mesh nodes with hwsim with their own IPs and routing tables. I’ve previously scripted this with send-to-self routing, but it’s a bit ugly. So I looked into doing this with network namespaces and controlling it all with tmux. The result is this fairly minimal script to launch a number of mesh nodes in a linear topology. From there one can easily run ping and iperf to gather some data, as in this chart:

This image shows the result, and is in line with measurements that Javier Cardona had done on actual hardware. We can see that throughput is roughly inversely proportional to the number of nodes, while latency is directly proportional.

This may seem pretty bad at first, but it makes sense when you consider that a radio transceiver can either listen or talk at any given moment, not both. That is radio physics and has nothing to do with mesh specifically (which is not to say that mesh has no inefficiencies). Also, this level of performance is when all the nodes are in range of each other; in such a case you’d be unlikely to have so many hops because the nodes would instead just peer directly with each other. So we might design our networks to avoid many hops, reduce the number of nodes in a given interference area, use fancy phy algorithms to enhance spatial reuse, or use multiple channels.
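As a back-of-the-envelope model of that relationship, assume every hop shares one collision domain, so each frame occupies the medium once per hop. The function name and the input numbers below are hypothetical and are not measurements from the chart; they only illustrate the shape of the curve.

```python
def chain_estimate(single_hop_mbps, per_hop_latency_ms, hops):
    """Estimate end-to-end throughput and latency over a linear chain.

    Assumes all nodes share the medium, so only one hop transmits at a time.
    """
    if hops < 1:
        raise ValueError("hops must be at least 1")
    throughput_mbps = single_hop_mbps / hops    # airtime is split across hops
    latency_ms = per_hop_latency_ms * hops      # each hop adds its own delay
    return throughput_mbps, latency_ms


# Hypothetical inputs: 24 Mbps and 2 ms for a single hop.
for hops in (1, 2, 4):
    print(hops, chain_estimate(24.0, 2.0, hops))
```

With spatial reuse or multiple channels, distant hops could transmit at the same time and the division by the hop count would no longer apply.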

My plan with wmediumd is to use it in a bit more automated fashion to evaluate things like changes to HWMP — I think if we can identify topologies that people care about then it’s a bit stronger to say “this change always makes things better” if we can show repeatable before-and-after results from wmediumd.

functional bitrate sim

My wmediumd rewrite is a bit further along thanks to getting a few hours to hack on it this weekend. It can now accurately simulate throughput between a pair of radios using legacy rates. For example, if we set the SNR between two devices to 20 dB, then they can communicate at a nominal 54 Mbps rate, yielding about 26 Mbps achieved in iperf:

    [  3]  0.0-10.0 sec  31.2 MBytes  26.1 Mbits/sec

At 15 dB, we can send at nominal rates of 24 to 36 Mbps, which yields:

    [  3]  0.0-10.1 sec  21.0 MBytes  17.5 Mbits/sec

Note that achieved throughput is quite a bit lower than nominal, as in real life; if aggregation were implemented, the two would be closer.
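One simple way to picture the SNR-to-rate relationship is a table of minimum SNR per legacy OFDM rate. The sketch below is only that: the threshold values are hypothetical, chosen so that 20 dB supports 54 Mbps and 15 dB lands between 24 and 36 Mbps as in the runs above, and they are not the values used in wmediumd. The function names are likewise made up for the example.

```python
# Hypothetical (rate in Mbps, minimum SNR in dB) pairs; illustrative only.
MIN_SNR_DB = [
    (6, 5.0), (9, 6.0), (12, 8.0), (18, 10.0),
    (24, 13.0), (36, 16.0), (48, 19.0), (54, 20.0),
]


def best_rate_mbps(snr_db):
    """Return the fastest rate whose threshold is met, or None if none is."""
    usable = [rate for rate, min_snr in MIN_SNR_DB if snr_db >= min_snr]
    return max(usable) if usable else None


def frame_received(snr_db, rate_mbps):
    """Hard-threshold delivery decision for a frame sent at rate_mbps."""
    return snr_db >= dict(MIN_SNR_DB)[rate_mbps]


print(best_rate_mbps(20.0))        # fastest rate usable at 20 dB
print(frame_received(15.0, 36))    # 36 Mbps needs more than 15 dB here
```

A hard threshold like this never loses a frame sent at a low rate on a decent link. A probabilistic error model near each threshold would be one way to get a mix of neighboring rates; that is not modeled here.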

The basic architecture is pretty simple: frames are queued on a per-sender management or data queue depending on type, and delivery time is computed based on whether or not there is loss and the contention window parameters of the queues. A timerfd is used to schedule reporting of frame delivery back to the kernel at appropriate times. The delivery time does not take into account actual contention, although this could be done in principle by looking at all the queued frames for all stations.
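The delivery-time computation might look something like the sketch below. It is written in Python for brevity and is an assumption-laden illustration rather than the actual code: the timing constants are 802.11a-style values that the post does not specify, the `attempts_lost` parameter stands in for whatever loss decision is made per attempt, and ACK airtime is ignored. It only computes the timestamp; arming a timerfd for that time is left out.

```python
import random

# Assumed 802.11a-style timing in microseconds (not taken from the post).
SLOT_US = 9
SIFS_US = 16
DIFS_US = SIFS_US + 2 * SLOT_US


def delivery_time_us(now_us, frame_us, cw_min, cw_max, attempts_lost, rng=random):
    """Return the time at which delivery is reported back to the kernel.

    attempts_lost is the number of failed attempts before the final one.
    Each attempt costs DIFS, a random backoff drawn from the current
    contention window, and the frame's own airtime.
    """
    cw = cw_min
    t = now_us
    for _ in range(attempts_lost + 1):
        backoff_slots = rng.randint(0, cw)
        t += DIFS_US + backoff_slots * SLOT_US + frame_us
        cw = min(2 * cw + 1, cw_max)  # window doubles after each lost attempt
    return t


# Hypothetical data-queue parameters: CWmin 15, CWmax 1023, 100 us frame.
print(delivery_time_us(0, 100, 15, 1023, attempts_lost=1))
```

Because the backoff depends only on the frame's own queue parameters, this matches the limitation described above: frames queued by other stations do not affect the result.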

I haven’t really decided what to do about configuration. I stripped out the jamming and probability matrix configurations, as I feel like doing things on a signal level basis is simpler. But at this point there’s no real way to specify signal levels either (other than hardcoding), and some scenarios probably want something dynamic (e.g. mobile stations).

Changes are in my wmediumd master branch. Unfortunately, I won’t have much time to work on this for the next two months, but patches for the many TODOs are welcome.

wmediumd speed test

Thanks to some inquiries on linux-wireless, I took a look at wmediumd recently. The code could use a bit of work, and there are some features I’ve been meaning to add since forever, so I started gutting it with an eye towards sprucing up the architecture and feature set (changes can be found here).

One of the questions from the mailing list was whether wmediumd adds a lot of overhead compared to mac80211_hwsim. It is of course doing more work, with additional memory copies, context switches, etc — but is it enough to make wmediumd unworkable?

So I did a quick TCP iperf test on my laptop with an open mesh, and got the following numbers.

hwsim without wmediumd:

    [  3]  0.0-10.0 sec  1.36 GBytes  1.16 Gbits/sec

hwsim with wmediumd:

    [  3]  0.0-10.0 sec  1.27 GBytes  1.09 Gbits/sec

It looks like wmediumd is doing fine: throughput is only about 6% lower than with hwsim alone. This is with monitors running; the non-monitor case does about twice that. Actually, I think this is a bit lower than it should be, but considering both cases are close, and a good deal faster than your typical wifi connection, it’s probably good enough for some level of bandwidth simulation.