Upstream Last

Building a booting kernel for Android using their own code still eludes me. I guess it would help if I knew anything at all about the ARM platform, as I have no idea what I am doing. Still, the code dump fails the “anyone should be able to build it” test rather strongly, judging from posts on the android mailing lists.

Build problems aside, the more egregious failure of the Android open source project is the lack of the “upstream first” mentality. Most Linux software providers follow this strategy to a large extent, which is simply: any patch that goes in my distribution should go to the mainline kernel first. The benefits are obvious: no need to maintain a fork forever, and everyone gets the goods as soon as possible. Even if the patch only serves the developer’s own interests, upstreaming it often has the benefit of vetting the interfaces and producing more universal APIs (the wakelocks discussion that followed the code dump is an example of this done backwards).

Google, however, sadly took the other road: get the product out the door, then worry about pushing bits upstream when we get a chance. The problem is “when we get a chance” never, ever happens, and there’s no motivation to push changes upstream when your product has shipped and it’s now forever in deep maintenance mode. That job of upstreaming is left to the community.

So far there’s no real indication (apart from the userland stuff) that you actually can build the G1 kernel and userland from the code dump. Google engineers are unclear whether everything on the phone is in the codebase (clearly some proprietary bits are not). Meanwhile they have their own fork of qemu, their own fork of the kernel, their own fork of webkit, and so on.

I suppose we should not fault Google too hard: after all, some code is better than no code at all. And if they want to maintain forks of .* forever, that is clearly their prerogative. However, had they worked more closely with the community at the start, the Android platform would be stronger today. They might already have a decent wireless driver, for example.

Fit to be TIed

The Android G1 phone uses the TI 1251 for Wifi.   This is pretty useful for when you need an internet connection for the laptop and there are no alternatives: you can bridge the wifi and the 3G connection and then connect to the wifi device using ad-hoc mode.  However, I normally keep the wireless off because it sucks down battery like mad.

But Kalle Valo from Nokia is working on a driver for the chip called wl12xx, and he is putting a lot of effort into making sure it uses power conservatively (presumably some future version of the Nokia internet tablets will be using this device).  This driver is SPI, whereas the G1 uses the same chip in SDIO mode.  So I started writing the SDIO port for the driver — hopefully it will be ready when wl12xx is included in mainline.

The TI reference code that actually runs on the G1 is, sorry to say, total crap.  It registers an interrupt handler that printk()s a static string.  It has unbalanced sdio_claim_host() calls.  And from a code style standpoint, well, there is none.  I suspect Valo’s driver will be a big step up if and when it can run on the Android.

Oh, and Google’s open source tree for the Android hasn’t built all weekend.  Great job!

Serial offender

There comes a time in every budding kernel developer’s life that he has to debug a mysterious lockup, and nothing will do but a serial console. Well, for my future recollections, here’s how to set it up:

  1. Get out your handy pl2303-based usb to serial adapter, because chances are good your laptop doesn’t have a serial port
  2. Build your kernel with CONFIG_USB_SERIAL_CONSOLE=y
  3. Add to your kernel command line: console=ttyUSB0,115200 console=tty0
  4. Hook your computer up to the other computer via a null-modem cable (man, these are pricey these days, $30 for something no one still uses?)
  5. On the second computer, set up minicom to use its serial port, say ttyS0, at 115200 baud, 8N1, and turn off all the modem init strings
  6. Don’t bother futzing with getty, you only need it if you want to also allow logins over serial. For logging, it’s unnecessary


Now, start minicom on computer 2 and reboot your computer under test. If all goes well, you’ll capture a panic on the serial console. If all goes poorly (my case), you’ll have a lockup with no oops. The usual thing to try in this case is adding “nmi_watchdog=1” to the command line, which will use the non-maskable interrupt to break into any frozen code. Also, if you have CONFIG_DETECT_SOFTLOCKUP set, hopefully after 60 seconds or so you’ll get a soft lockup warning.

In my case, I still have a hard lock with no output. Ho hum.

Hacking, the good kind

I could write about the election here, but citizen905 already summed it up pretty well. So instead, here’s what I’ve been breaking in the Linux kernel lately:

  • My final patch count for 2.6.27 was 14, I think. Enough, anyway, that I can stop counting and just deal with all the work I’ve created for myself.
  • I added myself to MAINTAINERS for ath5k, which felt like a pretty ridiculous notoriety grab, but Nick asked me to do so twice, so there.
  • I have some fixes for ath5k for 2.6.28, nothing major but an oops should be fixed, and a WARN_ON removed. The oops fix, incidentally, had an obvious bug despite 3 sign-offs. I suck.
  • Also committed but to-be-reverted for suckiness is a patch to remove beaconing in STA mode. Turns out ath9k, from which I stole this idea, was just busted. The new plan is to use the beacon miss interrupt; until then, your wireless card has to wake up the CPU about 100 times a second.
  • For 2.6.29, I have added hardware encryption to ath5k and hopefully will get some time to hack on the suspend/resume support for mac80211. Then I have some omfs patches I’ve been sitting on for months.

SYSRQ on MacBook

Lately I’ve really needed SysRq in situations where /proc/sysrq-trigger just doesn’t do the job, and my MacBook is missing lots of crusty old XT-era keys. Finally, I know how to do this!

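/* Remap F11 on the MacBook keyboard to SysRq; error handling kept minimal */
#include <fcntl.h>
#include <linux/input.h>    /* EVIOCSKEYCODE, KEY_SYSRQ */
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

#define USAGE_CODE 0x070044 /* USB HID usage for F11 */

int main(void)
{
    unsigned int codes[2] = { USAGE_CODE, KEY_SYSRQ };
    int fd = open("/dev/input/by-id/usb-Apple_Computer_Apple_"
                  "Internal_Keyboard_._Trackpad-event-kbd", O_RDONLY | O_NONBLOCK);

    if (fd < 0 || ioctl(fd, EVIOCSKEYCODE, codes) < 0) {
        perror("sysrq remap");
        return 1;
    }
    close(fd);
    return 0;
}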

Awesome. Supposedly, a tool called keyfuzz is also efficacious.

OSS, I has it

I just sat in on a conference call as a representative (by default, since no one else called in) of the Linux ath5k community, with Atheros, makers of my MacBook’s wireless ethernet card. Atheros have really done a 180 for supporting the community, first by releasing ath9k, then by releasing the source to their previously-closed HAL last week. Thanks to that, 6 patches have already gone out fixing various problems. BTW, conference calls are just as pointless in the OSS community as they are in real life. But at least I did learn that it is pronounced “uh-THERE-ose”, not “ATH-er-ose.”

Buy laptops with Atheros wireless cards!

Oops

I am finally getting the hang of debugging kernel crashes. None too soon, as I got my first OOPS report from the -rc kernel with OMFS, from a gentleman who is intentionally corrupting his FS (“fuzzing” in the infosec lingo). After a frustrating weekend in which I had inadvertently fixed the bug but didn’t realize it because I was testing the wrong module, I can now claim success. One down, several more to go.

Detective work after the jump if you care for the nerdy stuff.
Oops report:


BUG: unable to handle kernel paging request at c978e004
IP: [<c032298e>] omfs_readdir+0x18e/0x32f
Oops: 0000 [#1] PREEMPT DEBUG_PAGEALLOC
[...]
EIP: 0060:[<c032298e>] EFLAGS: 00010287 CPU: 0
EIP is at omfs_readdir+0x18e/0x32f
EAX: c978d000 EBX: 00000000 ECX: cbfcfaf8 EDX: cb2cf100
ESI: 00001000 EDI: 00000800 EBP: cb2d3f68 ESP: cb2d3f0c
DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
[...]
[<c018a820>] ? filldir64+0x0/0xcd
[<c018a9f2>] ? vfs_readdir+0x56/0x82
[<c018a820>] ? filldir64+0x0/0xcd
[<c018aa7c>] ? sys_getdents64+0x5e/0xa0
[<c01038bd>] ? sysenter_do_call+0x12/0x31
=======================
Code: 00 89 f0 89 f3 0f ac f8 14 81 e3 ff ff 0f 00 48 8d
14 c5 b8 01 00 00 89 45 cc 89 55 f0 e9 8c 01 00 00 8b 4d c8 8b 75 f0 8b
41 18 <8b> 54 30 04 8b 04 30 31 f6 89 5d dc 89 d1 8b 55 b8 0f c8 0f c9

First step is to look at the faulting instruction. Running the “Code:” part through ~/linux/scripts/decodecode yields the disassembly:


8b 4d c8             	mov    -0x38(%ebp),%ecx
8b 75 f0             	mov    -0x10(%ebp),%esi
8b 41 18             	mov    0x18(%ecx),%eax
8b 54 30 04          	mov    0x4(%eax,%esi,1),%edx <=== here
8b 04 30             	mov    (%eax,%esi,1),%eax
31 f6                	xor    %esi,%esi

So the instruction is dereferencing the address eax + esi*1 + 4. From the register dump, EAX=c978d000. That looks like a pointer. ESI is 00001000, which is probably the index into an array. 0x1000 happens to be PAGE_SIZE, so the access lands at c978d000 + 1000 + 4 = c978e004, just past the end of the page. That is exactly the address in the “kernel paging request” line at the top of the oops.

Next, let’s look at the C code. There are two ways:


$ gdb omfs.ko
(gdb) l *(omfs_readdir+0x18e)

Or (and I find this a little more obvious since it has mixed C and assembly):


$ objdump -S omfs.ko > foo.S
# now look for instruction opcodes in foo.S: "8b 54 30 04"

From the output of the above commands, it’s apparent that the +4 index in the instruction comes from be64_to_cpu() converting a 64-bit big-endian number to little-endian. And we do that when reading directory pointers in omfs_readdir, specifically:


fsblock = be64_to_cpu(*((__be64 *) &bh->b_data[offset]));

EAX is bh->b_data so ESI must be offset. I happen to know it should never be above 2048, but it is 4096 in the register dump. Since the range is ultimately controlled by the directory inode size, I immediately suspected that that size got corrupted. For some reason I chased a bunch of other dead ends until I finally did look at the disk image and saw that the directory size was all wrong. Rule one of debugging: go with your gut.

Oh well. I guess all that assembly coding from years ago was useful after all.

Meh

I’m playing with date conversions today, and again I’m struck by how much the Java Calendar should be held up as an example of the over-engineered API. Has anyone ever used anything besides the Gregorian calendar? They were so proud of it when it hit 1.1.

I should have two patches hitting kernel 2.6.26, one entirely cosmetic and one that fixes a real bug on Atheros wireless cards. Akpm did pick up the OMFS patchset so hopefully that will go in during the .27 timeframe, though the jury is still out on whether it hits mainline.

In other news, take that, Skype!

mmio trace for fun and no-profit

I’m not sure what got me interested in assembly language as a nerdy high-schooler. It could have been the growing interest in computer graphics, at a time when you had to use assembly to get decent performance out of the machine. I remember learning tricks from much smarter people than myself, such as how to set a 256×256 pixel video mode so that you could address any pixel without ever needing to multiply. Not that you would use a multiply anyway since you can always use shifts and adds.

I suspect, however, that part of what spurred me on was an interest in reverse-engineering, specifically to crack games. You would start BattleChess, say, and it would ask you for some move from some historical chess game in a big book before letting you play. This was to ensure that if you copied the game, you at least also photocopied your friend’s manual. [For the record, while I cracked a few games, I never released any such cracks. More from being l4m3 than any teenage sense of ethics. I did, however, release instructions for patching the video game “Home Alone” so that you could never die. I may have been the only person on the planet to play that one all the way through.]

In order to crack a game, first you would load up SoftICE, a killer software debugger that would let you debug almost anything. It would let you set breakpoints on interrupts so step one was to put a breakpoint on ‘int 10h’. Interrupt 0x10 was a call into the video BIOS for setting up the video card. Back in the DOS days, it was the first thing every graphical program (and therefore game) did, because you started out in text mode and had to go into VGA mode. The BIOS would know how to load all the registers for that particular card; you just had to say “put me in 320×200 mode, now, thanks!” and that was via the assembly command “int 10h”.

After SoftICE caught the interrupt, the screen would switch back from VGA mode into a text mode listing of assembly instructions and raw machine code hex values. I don’t think I knew what all these meant at first, but I understood ‘call’ and ‘jmp’, and everything else I quickly learned from a text file describing the x86 instruction set architecture. So cracking then became a matter of just single stepping through all of these instructions, and waiting for BattleChess to get to the part where it asks you the question. Then, you start paying attention, looking for the assembly equivalent of “Is the answer right? If so, goto game! Else goto nasty message!” Some of these were harder than others, but generally it looked something like:

mov ax, ds:[43ac]
mov cx, ds:[3401]
cmp ax, cx
jnz 0027

You could step through it, type the wrong answer, then watch it jump to instruction number 27 which then printed out the nasty message. Then you could try again, and this time modify the code while you’re in the debugger. For example, change the jnz 0027 to a few no-ops (instructions that do nothing). Type in the wrong answer, and now it keeps on going to the game. Bingo, the game is cracked! SoftICE was such a nice debugger that I even used it for C debugging until I finally moved off of that whole DOS thing.

Anyway, I’ll never be a Jon Johansen, but that interest in reversing stuff has stuck with me. I think that figuring out how the Karma worked, which involved many hours of poring over hex dumps trying to come up with the pattern, was much more fun than coding the driver. Particularly when the “ah-hah!” moments would strike. Incidentally, coding the driver was/is still fun, probably more so than actually using the device.

This little walk down memory lane was inspired by the recent entrance of mmio-trace into my zone of consciousness. I’m not sure who used this first, but the Nouveau project has been using the utility for some time to reverse-engineer the NVidia video cards so that Linux can gain decent open source drivers. My laptop luckily has an Intel based video card, but it does have an Atheros wireless chip which also has a binary blob. There’s currently an effort to produce a reverse-engineered driver for it as well. In my case, the new driver almost worked, but I needed to come up with some other details, so I tried out mmio-trace.

PCI cards are configured by register writes, and these days the driver generally has to program the card itself to get it going; the BIOS can’t do that work for us any more. So the OS maps a region of memory that the driver can just read and write, and those accesses all get converted magically into accesses to the PCI device’s register file. Thus MMIO = “Memory Mapped IO”, and mmio-trace does what it sounds like. Using it is rather painless: you first ‘hook’ the module you want to capture, modprobe mmiotrace, then load the hooked module. All accesses to the mapped registers get captured into a debugfs file that you then run through a user-space filter. The end result is a nice report of all of the reads and writes to the registers of the PCI device. Very neat!

Oh, the driver maintainer already knew about the issue and confirmed my proposed change, so hopefully by 2.6.26 ath5k will support my MacBook.