hashed

Despite earning my degree in computer engineering, I haven’t done anything useful with assembly language since I was a strapping young idealistic lad convinced that compilers lie along the road to inefficiency. Much has changed. Heck, I write Java code for a living now — pretty much the opposite of efficient. I have to have a gig of ram just to run that sucky ant program.

While hacking my MP3 player, I discovered that the filesystem uses hashing to quickly lookup file names, which brought up the question of which hash function it uses. While I suppose one could reverse the hash function knowing a very large set of inputs and outputs, I decided it would probably be much more expedient to just put my atrophied x86 asm knowledge to work for me.

This turned out to be a lot easier than I thought. It only took about 20 minutes and I never had to step through code in a debugger.

Step 1: Disassemble Windows program for loading files onto the device, including the data segment. Look for the offset of a useful printf format string (“hash is %d, expected %d”).
Step 2: Search disassembly for loading said offset in a call to printf. Not surprisingly, this is right after the computation of the hash.
Step 3: Examine nearby calls for things like shifts and mods (common hashing operations).
Step 4: Relearn the stupidities of the x86 ISA (ecx is a loop counter, eax and edx figure in mysteriously for divides, etc).
Step 5: Convinced that a nearby call is it, reimplement in C and test.

Booyeah.

2 Replies to “hashed”

Comments are closed.