Despite earning my degree in computer engineering, I haven’t done anything useful with assembly language since I was a strapping young idealistic lad convinced that compilers lie along the road to inefficiency. Much has changed. Heck, I write Java code for a living now — pretty much the opposite of efficient. I have to have a gig of ram just to run that sucky ant program.
While hacking my MP3 player, I discovered that the filesystem uses hashing to quickly lookup file names, which brought up the question of which hash function it uses. While I suppose one could reverse the hash function knowing a very large set of inputs and outputs, I decided it would probably be much more expedient to just put my atrophied x86 asm knowledge to work for me.
This turned out to be a lot easier than I thought. It only took about 20 minutes and I never had to step through code in a debugger.
Step 1: Disassemble Windows program for loading files onto the device, including the data segment. Look for the offset of a useful printf format string (“hash is %d, expected %d”).
Step 2: Search disassembly for loading said offset in a call to printf. Not surprisingly, this is right after the computation of the hash.
Step 3: Examine nearby calls for things like shifts and mods (common hashing operations).
Step 4: Relearn the stupidities of the x86 ISA (ecx is a loop counter, eax and edx figure in mysteriously for divides, etc).
Step 5: Convinced that a nearby call is it, reimplement in C and test.
Booyeah.
Re: YOU ARE IN VIOLATION OF THE DMCA
Heh. I look forward to EFF turning down my case.