On S-Expressions

I suppose I’m old enough now, and the rate of change in computing is fast enough, that I can regale young software developers with I-remember-when stories. You know, the ones that start off like “all of my punchcards were rubber-banded together, but I had neglected to number them…” I’ll spare you the tale of how we used to screen-scrape the mainframes at LargeAirline.com for today; instead you get to read my ramblings on languages-older-than-Go.

Today’s topic is inspired by my reading on LWN about the fact that Emacs is still around, which led me to David Wheeler’s argument that parentheses are why no one uses Lisp and that we can do better. Perhaps so, but I found his plugging of SKILL interesting, since once, long ago, when I was but a fresh young hacker, I spent a rather significant bit of time writing software in this language.

Since you have likely never heard of it, SKILL is a Lisp implementation that is (or was?) embedded in the Cadence suite of Electronic Design Automation products. As I recall, the biggest warts were:

  • the fact that you could use C-like prefixing — f(x y) was equivalent to (f x y) — but other constructs, like using parentheses for grouping, behaved as in Lisp (i.e. errored).
  • the runtime library used C-like functions, such as printf, fread, etc
  • the developer’s manual encouraged using imperative style programming like loops; indeed using recursion instead would usually lead to stack overflows

I typically wrote in the Lisp style and just ignored the C-like features as much as possible. The experience soured me on functional languages forever, which isn’t really fair, because SKILL was particularly bad.

But I’ve often wondered why it was so horrible, and how it came to be that Cadence, a large successful company, selected it as its primary extension language. Of course, almost all EDA software is horrible so the bar is pretty low, but I wondered if this was a case of Cadence not realizing how bad it was, or if it was just historical baggage. Or did a bored programmer just decide one day, “hey, let’s write our own crappy Lisp!” (cf. much of Android).

Luckily, there’s a paper on it to shed some light: “SKILL: A Lisp-based extension language” by E. Petrus, 1993. I snagged a version from behind the ACM paywall, so I won’t post it, but I’ll quote a few interesting bits here.

SKILL was originally conceived as a base language for developing simulation and test description languages… Lisp was chosen as a base language because it lent itself well to an interpreter based implementation and Lisp is a natural choice for code generation and as a meta language. The first implementation was an interpreter based on the Franz Lisp developed at Cadence Design Systems in the mid-1980s.

Fair enough. Mostly historical baggage, partly not-invented-here.

The popular syntax for SKILL is a C-like alternative to Lisp. When SKILL was first introduced, the connection with Lisp was not promoted in electronic engineering circles.

Too bad. Maybe if the same engineers had compared it to an allegedly decent Lisp, they might have demanded better.

Because of the large installed SKILL base, it has been difficult to move SKILL [in the direction of Common Lisp].

Oops.

For the most part, SKILL developers come from a EE background and lack general familiarity with functional languages. As a result, company wide code inspections revealed the need for a high level code analysis tool.

Even Cadence internal staff couldn’t use SKILL.

The paper goes on to describe why extension languages are useful as a component of software-reuse. While I won’t disagree, I find the characterizations of EDA software more an indictment on the state of software development in the industry circa mid-90s. The products I used from Mentor Graphics, Cadence, and so on were just horrendous. Each application was typically a 40 meg, crashy, statically linked executable, in which a lot of functionality was shared between a few other executables that mostly did the same thing but had subtly different missing features (probably in the name of their byzantine licensing strategies).

  • ECAD software is too large and diverse to be contained in a single executable
  • Need “fire-walls” between products. Separating major products into separate executables reduces the chance of one set of code from corrupting data in another.
  • Change control is easier when the software is built separately.
  • Independent release of products. Products linked together in the same executable must be shipped together. In separate executables, products can be shipped whenever necessary.

(Shared libraries anyone?)

I think I wouldn’t mind SKILL if it had been a decent Lisp, but unfortunately Cadence got stuck with maintaining their own one-off fork and the software ecosystem that grew up around it. At the EDA firm at which I worked, we were lucky to be a late-comer, which allowed us to use a reasonable extension language: perl. I haven’t kept up with the industry, but I hope others have followed suit.

Pork

Inspired by the ‘tinga’ (Mexican roast pork tostadas) recipe in this month’s Cook’s Illustrated, I bought a $12 picnic shoulder at the grocery store on our last trip out. The recipe recommends a boneless boston butt instead, which is probably a good idea given all of the tendons in the lower cut. But my grocer only had the shoulder and it’s cheap so what the heck.

I spent last Saturday morning carving all of the meat off of the bone, at which point I realized just how much pig we’re going to be eating for the next few weeks. As Ange and I try to subscribe to the ‘use everything but the squeal’ philosophy, I portioned the slab for various future meals: two pounds of meat for the aforementioned tinga, a couple of pounds cut into thin slabs for char siu, about another pound or so of trimmings for barbecue or pork tacos, and the bones went into the freezer for congee.

Which left a big hunk of skin. I tried making cracklings out of this, but the porcine gods were not having it: it was a big sticky mess. I ultimately gave up after a one sizzling piece of skin and lard hopped out of the pan only to land on my face about a centimeter away from my right eye. Even fried pig skin isn’t worth blindness.

The tostadas were pretty awesome though.

Search Trees Revisted

I have been investigating search trees to see if binary search trees’ memory layout can, in practice, make a large difference in performance. Now that I’ve fixed the bugs in my test application, I can say: yes! The theory here is in so-called “cache-oblivious” algorithms. The idea is that you can use divide-and-conquer algorithms and data structures to exploit locality and ensure that at some point you will be working with the granularity of your memory transfer size (such as the size of a cache line or disk block).

How well does it work? Take a look:

This machine has a 2 meg L2 cache and 32 K L1 d-cache. The size of nodes in this test was 32 bytes, so the L1 cache is blown at 2**10 items, and the L2 cache at 2**16.

With the exception of “base,” the trees are all laid out in an array; “vEB” is the cache-oblivious version in which the tree uses a van Emde Boas layout: the top half of the tree is stored in the array, followed by each of the subtrees in left-to-right order, recursively. Unfortunately, constructing the tree dynamically is rather complicated.

So far I’m just exploring ground that’s already been covered, but I do think this is an interesting class of data structures that hasn’t trickled out into the real world yet.

A tale of two search trees

As a warm up for a research topic I’m doing this semester, I am looking at static search tree performance. Consider the following tree fragment where nodes are numbered according to BFS order:

            1
      2            3
   4     5      6      7
  8 9  10 11  12 13  14 15

The nodes are located in memory wherever malloc() put them. An interesting question is whether a different in-memory layout has better search performance. One possible layout is in an array with the following order: 1,2,3,4,8,9,5,10,11,6,12,13,7,14,15 (this pattern is not original to me).

So far, the answer, surprisingly to me, is: not with this layout. With a benchmark of 10 million searches of a randomly-inserted (otherwise unbalanced) tree, I get:

Original Tree: 77s
Reordered Tree: 87s

The tree searches are exactly the same — only the memory layout changed. I can reduce these numbers quite a bit by using -O2 and reducing the size of tree nodes, but there’s still a consistent net loss to the array ordering.

I suspect my cache-emptying routines don’t, but if not, it will be interesting to see where I made the stupid mistake / mental lapse.

Hummus

HummusAnge and I have been on a real Hummus kick lately for some reason. A container of Sabra is usually gone within a few minutes of being opened around here. So I decided to save a few trips to the store and try making it at home. Turns out, this is really easy: dump a can of chick peas, 1/4 c each of olive oil, water, and tahini (finding this is the hard part), a garlic clove and 1-2 tbsp of vinegar into the food processor, and press “On.” A little time spent chilling in the fridge and it’s good to go. A couple of roasted habaneros wouldn’t be a bad addition, either.

I also made my own pita chips from pita bread. While tasty and a lot cheaper than pre-made chips, I think that is far too much work when you’re hungry.

Android rebuilt

While stuck inside for days due to snow, I once again downloaded the Android source trees from the repos. I plan to update my Android wifi page with better, step-by-step instructions to go from nothing to a working set of wireless utilities on the phone. In the meantime, I’ve re-familiarized myself with the current Android build environment (for various values of current — this uses donut since as far as I know, eclair doesn’t have a definition for the Dream (G1/ADP1) phone yet).

I do have to say that things in this area have improved a lot. While it still took some google-fu to figure out which branch to check out to get the correct software (tag donut-plus-aosp), buildling the entire image from scratch was rather straightforward (. build/envsetup.sh; lunch aosp_dream_us-eng; make). My main complaint is that the build system is far too slow to use for day-to-day work. Is Google still not eating their dogfood here?

So far I have libnl, iw, and packetspammer now integrated properly via Android.mk makefiles, and a custom kernel plus the support modules for wl1251. Unfortunately, space is really tight on the system partition, so the TI driver had to go to make room for everything. I’m still waiting for someone to port the Android runtime to emdebian or maemo with /usr on the sdcard, but perhaps I’ll mess around with bind mounts until then.

It seems wl1251 SDIO doesn’t like the new power-saving code, so that is something else I’ll look into soon. For now, one can disable that in the driver or possibly via “iwconfig power off”.

wl1251: cmd set ps mode
wl1251: cmd configure
mmc0: Data timeout
wl1251: ERROR sdio write failed (-110)
...
wl1251: ERROR elp wakeup timeout

Storage Menagerie

I recently found myself needing to become familiar with the disk layouts of a few filesystems for a future research project. At the same time, I wanted to experience the fail whale that is github.com (my other git trees are hosted on repo.or.cz, which is spare and of tenuous funding, but works just fine). So over the weekend I threw together this little git tree.

Not much there, just ext2 and yaffs2 so far. I chose the latter because I wanted to look at a log-structured filesystem, while simultaneously producing something useful for Android work. I plan to add btrfs shortly and maybe some other log FS. This is just junk code: if you really want to use these filesystems in FUSE, there are much better implementations. Also there’s a great hack out there to run the kernel FS implementations in userspace via UML.

Still, it’s fun reinventing the wheel, and it’s a nice way to get a handle on this stuff.

New GPG Key

After 10 years, I finally received an encrypted email from the ether. Which reminded me that:

  1. The key goes to bobc@ieee.org, which I no longer own
  2. There are probably numerous security holes with the old versions of gpg and/or Debian openssl used to generate the keypair
  3. The key is over 10 years old so NSA have cracked it long ago

Thus, it has been revoked. Let it be widely disseminated that my new key fingerprint is:

037A AD54 51B3 09AD 2362 93EF 7154 040D C976 F35E

19110

Welcome belatedly to the tens. I suppose I should summarize the previous year as is my typical MO in January.

We kicked off 2k9 freezing our appendages watching the inauguration in DC. Still worth it.

Angeline settled in to her new private practice, wowing patients and colleagues alike with her acumen and meticulousness. At the same time, in a series of poor decisions, I gained a renewed respect for Clark Kent jobs as I went part time with mine to attend grad school.

We cheered the Caps on in another playoff run from our seats in aisle 117, row G. I took ice skating lessons. My cred as honorary Canadian waxes.

We spent some time in Germany, increasing our vocabulary slightly beyond words learned from Castle Wolfenstein.

I’m close to making good pizza at home. Ange is mastering Cirque du Soleil yoga. She has also picked up lots of trivia on hockey great Bobby Orr, and learned the true meaning of “imaret” through daily application to the NYT crossword puzzles.

My goal of 100 kernel patches in 2009 was reached (123). This year, my goal is to be a better maintainer and stop sitting on patches, even while having much less time to contribute due to school.

’10 promises to be a big year as we continue that whole nucleotide ordering experiment.

JBoss Hash

Here’s how to figure out the hashes of method names used by JBoss’ RMI invoker from the shell. The first two numbers in the method signature form the length of the string. Maybe there’s a better way to do the backrefs in sed, but you do need to swap them around.

$ echo $[`printf "0031ping(Ljava/lang/String;)V" | 
    sha1sum | 
    sed -e "s/(..)(..)(..)(..)(..)(..)(..)(..).*/0x87654321/g"`]

Don’t ask me why I figured that out.