Life with Lunchhooks


Wednesday 13 February 2008

How Quickly We Forget

I'm fairly amused to see the claims that people make about the iPhone, in particular the claims about how limited it is as a platform (i.e., how little we can expect from it given how limited its CPU, RAM and storage capabilities are). For example, recently Craig Hockenberry wrote about how difficult the iPhone will be to develop for and said this about its RAM constraints:

There are some very tight limits on memory usage. You’re given approximately 64 MB of space to work with [...]

Is 64MB tight? When we compare an iPhone to the desktop machines of today, it's true that it looks a little pokey—my laptop has 4GB in it, and my desktop machine has even more—but that isn't the question here. The question is whether it ought to be enough for the kind of applications people will want to run on the iPhone and, in the context of Craig's article, whether ordinary developers ought to be able to write applications that run on the iPhone without breaking too much sweat, and whether the familiar and easy-to-use development tools developers have become used to can be reasonably expected to target the iPhone.

I have to defer to Craig's actual experience developing for the iPhone when it comes to describing the situation as it currently is, but there is no reason to suppose that it has to be that way. I'd argue that Cocoa and OS X have a long history, and in that history many of the same tools and libraries we're still using today targeted a much more resource-limited platform.

It's easy to be spoiled by the vast amounts of memory that desktop machines have today, but 64MB isn't peanuts. If we go back to the origins of OS X, NEXTSTEP, we find that it ran with much tighter resource constraints. The base model of the very successful NeXTstation originally had 8MB of RAM and a 105MB hard disk—yes, it actually had less disk space than the iPhone has RAM. True, to install the developer tools you probably wanted the 400MB disk option, and with only 8MB it was fairly quick to start swapping, but if you maxed the machine out—to a “whopping” 32MB of RAM—you could run quite a lot without needing to swap. It's true that NEXTSTEP could swap if it needed to, but applications that needed double the physical RAM of the machine were rare indeed.

As a quick test, I booted up my OpenSTEP 4.2 virtual machine in VMware, where the whole virtual machine only has 64MB of RAM, and started a few applications: the wonderful spreadsheet Quantrix (Lighthouse Design's clone of Lotus Improv); Diagram, which The Omni Group later cloned as OmniGraffle; and Preview, opening a large PostScript file. Here is the output from ps:

openstep> ps ugxc
USER       PID  %CPU %MEM VSIZE RSIZE TT STAT  TIME COMMAND
clawpaws   184   0.0 14.1 16.8M 9.01M ?  SW    0:05 WindowServer
clawpaws   186   0.0  2.9 3.58M 1.88M ?  SW    0:00 pbs
clawpaws   189   0.0  1.1 2.56M  704K ?  SW    0:00 appkitServer
clawpaws   190   0.0  3.7 5.79M 2.34M ?  SW    0:00 WM
clawpaws   191   0.0  3.0 6.06M 1.92M ?  SW    0:00 Preferences
clawpaws   206   0.0  4.3 7.83M 2.77M ?  SW    0:00 Diagram
clawpaws   208   0.0  1.8 7.17M 1.18M p1 SW    0:00 tcsh
clawpaws   236   0.0  6.4 8.35M 4.09M ?  SW    0:00 Quantrix
clawpaws   249   0.0  1.1 2.06M  752K p1 T     0:00 ftp
clawpaws   251   0.0  6.0 6.70M 3.86M ?  SW    0:00 Preview
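To put a number on that listing, here is a small Python sketch that adds up the RSIZE column. It assumes that "M" and "K" in the ps output mean mebibytes and kibibytes, and it ignores the fact that shared pages may be counted in more than one process, so treat the total as a rough upper bound rather than a precise measurement. The function names are mine, not anything from NEXTSTEP.

```python
# Resident sizes (RSIZE column) copied from the ps listing above.
RSIZE = {
    "WindowServer": "9.01M",
    "pbs": "1.88M",
    "appkitServer": "704K",
    "WM": "2.34M",
    "Preferences": "1.92M",
    "Diagram": "2.77M",
    "tcsh": "1.18M",
    "Quantrix": "4.09M",
    "ftp": "752K",
    "Preview": "3.86M",
}


def to_mib(size: str) -> float:
    """Convert a ps-style size such as '9.01M' or '704K' to mebibytes.

    Assumption: 'M' means 2**20 bytes and 'K' means 2**10 bytes.
    """
    value, unit = float(size[:-1]), size[-1]
    if unit == "M":
        return value
    if unit == "K":
        return value / 1024.0
    raise ValueError(f"unexpected unit in {size!r}")


def total_resident_mib(sizes=RSIZE) -> float:
    """Sum the resident sizes of all listed processes, in mebibytes."""
    return sum(to_mib(s) for s in sizes.values())


if __name__ == "__main__":
    total = total_resident_mib()
    print(f"resident total: {total:.1f} MB of 64 MB ({total / 64:.0%})")
```

By this count the window server, the workspace and three real applications together sit at roughly 28.5 MB resident, comfortably under half of the 64MB in question.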

Perhaps you think it isn't fair to compare NeXTSTEP or OPENSTEP to what we have today in OS X, but if so you probably haven't seen or used either of them. It may be technology from more than a decade ago, but it's no Windows 95—Unix is old, too, and many of its basics haven't changed much over the years. Objective-C, Interface Builder and friends were there from day one. Today's Cocoa libraries look and feel very much like their counterparts in OPENSTEP.

Craig also writes:

Guess what? This nightmare will become a reality as soon as you start building your iPhone application. There are no NIBs. None.

I don’t think this is one of those “let’s skip it for version 1.0” design decisions. The process of unarchiving the objects in the NIB takes CPU cycles and memory: both things that are in limited supply on the phone.

I hope you can see from the above how little water this argument holds. The iPhone has plenty of CPU power for this task: NEXTSTEP used nibs and ran on a 25 MHz 68040, a far slower chip than the iPhone's 400 MHz ARM.

But there is something else wrong with this argument, too, namely the idea that it is somehow cheaper to create objects programmatically rather than by decoding an object serialization representation such as a nib file. That's a time/space performance claim that needs to be substantiated with evidence. Very often, human intuition about what is fast is wrong, because modern machines are complex beasts and things like caches and memory access behavior can make a big difference. The way to know is to run tests and see if the performance difference is actually noticeable. But to provide some counter-intuition to anyone who thinks it's obvious that pure code ought to be faster, here's one: compact code interpreting a compact data representation may fit in level one cache, whereas the longhand code to do the same task may not.

So, I don't buy it. I'm not saying that the iPhone SDK will have nibs, but I am saying that I've yet to see a good reason why it couldn't.

Tuesday 21 August 2007

Unspecified Attributes

On the Internet, as they say, no one knows you're a cat; but apparently no one notices even if you drop some heavy hints. The title of my blog, Life with Lunchhooks, and the name of the domain are supposed to suggest the idea that I am a creature with paws and claws (such as a domestic cat), but I'm not really sure how many of the people who've actually read this site really get that idea.

If you were, or now are, trying to imagine me as a domestic cat, paws poised over the keys, your picture of me will nevertheless be somewhat fuzzy, since I have not given you my breed, color, national origin, age, gender, orientation or socioeconomic status. Cat or not, one thing I can tell you for sure is that if you're imagining a 20-something straight white American male college kid, you're not just barking up the wrong tree, you're on the wrong side of a different planet, barking up at a street lamp.

What you imagine doesn't matter that much so long as you keep your assumptions to yourself.  But if you let them loose, unverified, you run the risk of embarrassing both of us.  If you are the kind of person who must fill in those unspecified details (FWIW, I'm mostly not), one way to train yourself to avoid making these kinds of embarrassing faux pas is to imagine the unusual for unspecified things.  That way, you're more likely to remember that you made up that detail yourself out of whole cloth, and you'll also be living a much richer inner life.

(Of course, with the “barking up the wrong tree” metaphor, maybe I'm giving things away about how I imagine you.)

Monday 16 July 2007

Storing iPhone apps locally with data URLs

Some people think that you need net access to run web-based applications on your iPhone. Not so. The URL below provides a simple tip calculator (sadly this crappy blogging system doesn't let me do a direct link, which sucks, but you can copy and paste and add it to your bookmarks on your computer then sync with your iPhone, and/or make your own page with a direct link). By using a data: URL, the entire page content is all in the URL. If you save a bookmark for this URL, you can access this little JavaScript-based app even in airplane mode.

data:text/html;charset=utf-8;base64,PGh0bWw+CjxoZWFkPgo8bWV0YSBuYW1lPSJ2aWV3cG9ydCIgY29udGVudD0id2lkdGggPSAyNDAiIC8+Cjx0aXRsZT5UaXAgQ2FsY3VsYXRvcjwvdGl0bGU+Cgo8c2NyaXB0PgoKZnVuY3Rpb24gdGlwKGFtb3VudCkgewogICAgcmV0dXJuIHRpcDsKfQoKdmFyIG91dHB1dCA9IG51bGw7CnZhciBwZXJjZW50ID0gMTguNSAvIDEwMDsKdmFyIHJ0aXBfZmFjdG9yID0gMC4yNTsKdmFyIHJ0b3RhbF9mYWN0b3IgPSAxLjAwOwoKZnVuY3Rpb24gd3JpdGVPdXQobGluZSkgewogICBpZiAob3V0cHV0KSB7CiAgICAgIG91dHB1dC5hcHBlbmRDaGlsZChkb2N1bWVudC5jcmVhdGVFbGVtZW50KCJiciIpKTsKICAgfSBlbHNlIHsKICAgICAgb3V0cHV0ID0gZG9jdW1lbnQuZ2V0RWxlbWVudEJ5SWQoIm91dHB1dEFyZWEiKTsKICAgfQogICBvdXRwdXQuYXBwZW5kQ2hpbGQoZG9jdW1lbnQuY3JlYXRlVGV4dE5vZGUobGluZSkpOwp9CgpmdW5jdGlvbiB1cGRhdGUoKQp7CiAgICB2YXIgYW1vdW50ID0gTnVtYmVyKGV2YWwoaW5Gb3JtLm51bS52YWx1ZSkpOwogICAgaWYgKGFtb3VudCA9PSBOYU4pIHsKICAgICAgICB3cml0ZU91dCgnSHVoPycpOwogICAgfQogICAgdmFyIHRpcCAgPSBhbW91bnQgKiBwZXJjZW50OwogICAgdmFyIHJ0aXAgPSBNYXRoLnJvdW5kKHRpcCAvIHJ0aXBfZmFjdG9yKSAqIHJ0aXBfZmFjdG9yOwogICAgd3JpdGVPdXQoJyQnICsgYW1vdW50LnRvRml4ZWQoMikgKyAnICsgJCcgKyBydGlwLnRvRml4ZWQoMikgKyAnID0gJCcgKyAoYW1vdW50K3J0aXApLnRvRml4ZWQoMikpOwogICAgdmFyIHJ0b3RhbCA9IE1hdGgucm91bmQoKGFtb3VudCArIHRpcCkgLyBydG90YWxfZmFjdG9yICsgMC4yNSkgKiBydG90YWxfZmFjdG9yOwogICAgd3JpdGVPdXQoJyQnICsgYW1vdW50LnRvRml4ZWQoMikgKyAnICsgJCcgKyAocnRvdGFsLWFtb3VudCkudG9GaXhlZCgyKSArICcgPSAkJyArIHJ0b3RhbC50b0ZpeGVkKDIpKTsKICAgIAp9CgpmdW5jdGlvbiB6YXAoZmllbGQpIHsKICAgIGlmICghIGZpZWxkLnphcHBlZCApIHsKICAgICAgICBmaWVsZC56YXBwZWQgPSB0cnVlOwogICAgICAgIGZpZWxkLnZhbHVlICA9ICIiOwogICAgfQp9Cgo8L3NjcmlwdD4KCjwvaGVhZD4KPGJvZHk+Cgo8aDE+VGlwIENhbGN1bGF0b3I8L2gxPgoKPHA+CkVudGVyIGFmdGVyLXRheCB0b3RhbDoKPC9wPgo8Zm9ybSBuYW1lPSJpbkZvcm0iIG9uU3VibWl0PSJ1cGRhdGUoKTsgcmV0dXJuIGZhbHNlOyI+CjxpbnB1dCB0eXBlPSJ0ZXh0IiBuYW1lPSJudW0iIG9uRm9jdXM9InphcCh0aGlzKSIgdmFsdWU9IjI0LjM3IiAvPgo8aW5wdXQgbmFtZT0ic3VibWl0IiB0eXBlPSJidXR0b24iIG9uQ2xpY2s9InVwZGF0ZSgpOyIgdmFsdWU9IlRpcCIgLz4KPC9mb3JtPgoKPHAgaWQ9Im91dHB1dEFyZWEiPgo8L3A+Cgo8L2JvZHk+CjwvaHRtbD4=

By putting images inline using data: URLs, you can create pretty rich pages and store them locally. I created a 363,488 byte URL for my home page (complete with images) and it loaded just fine on my iPhone.

Here's a quick Perl one-liner to turn HTML into a data: URL.

perl -0777 -e 'use MIME::Base64; $text = <>; $text = encode_base64($text); $text =~ s/\s+//g; print "data:text/html;charset=utf-8;base64,$text\n";'

By making these links programmatically, you even have an ugly hack to do persistent storage on the iPhone. Just encapsulate your app and its state in its URL.
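For anyone who would rather build these links from a script than from a one-liner, here is roughly the same transformation sketched in Python, with a decoder added so you can get the page back out of a bookmark. The function names are my own invention; only the data:text/html;charset=utf-8;base64, prefix comes from the URL above.

```python
import base64

# Same prefix as the tip-calculator URL above.
PREFIX = "data:text/html;charset=utf-8;base64,"


def html_to_data_url(html: str) -> str:
    """Encode an HTML document as a base64 data: URL."""
    # b64encode emits no line breaks, so there is no whitespace to strip
    # (the Perl version strips the newlines that encode_base64 inserts).
    encoded = base64.b64encode(html.encode("utf-8")).decode("ascii")
    return PREFIX + encoded


def data_url_to_html(url: str) -> str:
    """Recover the HTML from a URL produced by html_to_data_url."""
    if not url.startswith(PREFIX):
        raise ValueError("not a base64 text/html data: URL")
    return base64.b64decode(url[len(PREFIX):]).decode("utf-8")


if __name__ == "__main__":
    page = "<html>\n<body><p>Hello from airplane mode</p></body>\n</html>\n"
    url = html_to_data_url(page)
    print(url)
    assert data_url_to_html(url) == page
```

The decoder is what makes the "state in the URL" hack workable: a page can rebuild itself with new state baked in, hand you the new link, and nothing else needs to be stored anywhere.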

Sunday 22 April 2007

Good Hash Functions

I happened to want to create a hash table with integer keys and went looking for a suitable function. As usual, Google is your friend. And as usual, once you start researching things on the 'net, hours can go by.

Thomas Wang has a good discussion of various integer hash functions, but that also led me elsewhere to discussions of good hash functions in general.

In the past, I've found that many of the hash functions that are claimed as being better than Knuth's classic string hash function don't actually prove to be any better by most metrics, and some seem to be much worse.

For example, one popular hash on the street these days seems to be Paul Hsieh's SuperFastHash. It does run quickly, and on the whole its statistical properties seem to shake out reasonably well. But when you look at the actual integers it returns, in my tests using /usr/share/dict/web2 on my Mac, there seem to be far more collisions than you'd statistically expect. Statistically, you'd expect about six collisions in the 32-bit space. Knuth's hash function has only five, and they're very dissimilar words, namely:

227010540:  autovivisection grovelings
890239928:  dialypetalous mumpishness
2851341963: anisostemonous umbellifer
3508170762: ctenodactyl fuliginousness
3909438781: prerogativity puzzleheaded

The SuperFastHash function, on the other hand, has 59 collisions, an order of magnitude more. Here are a representative few:

432696082: Cotinga Cotonam
535511585: miscoin misfond
631000912: amidine aminity
668950620: untossed unworked
738886349: hennin penman
749072160: revisible rewirable

Notice that the words that hash the same seem somehow similar. That's just weird.
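In case the "about six" figure seems like it came out of thin air, it is just the birthday approximation: with n keys hashed uniformly into 2^32 slots, the expected number of colliding pairs is about n(n-1)/2 divided by 2^32. Here is a sketch in Python. The word count is an assumption on my part (roughly what /usr/share/dict/web2 holds on a Mac); plug in the output of wc -l for your own copy.

```python
def expected_collisions(n_keys: int, hash_bits: int = 32) -> float:
    """Expected number of colliding pairs for an ideal uniform hash.

    Birthday approximation: every one of the n*(n-1)/2 pairs collides
    with probability 1 / 2**hash_bits.
    """
    pairs = n_keys * (n_keys - 1) / 2
    return pairs / 2 ** hash_bits


# Assumption: approximate number of words in /usr/share/dict/web2.
WORD_COUNT = 235_886

if __name__ == "__main__":
    print(f"{expected_collisions(WORD_COUNT):.1f} expected collisions")
```

With that word count the estimate comes out at roughly 6.5, so five is right in line with chance and 59 is not.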

In addition to his own hash function, Paul Hsieh also has some other useful code on his site, including a hash test program comparing several different hash implementations for speed, and a portable implementation of stdint.h.

The FNV (a.k.a. Fowler/Noll/Vo) hash is another hash function that seems popular these days. It seems broadly similar to Knuth's hash function, but does a better job of distributing hashes for short strings across the full 32-bit space for hashes. For example, Knuth's hash hashes bat and cat to 137867 and 139236 respectively, but FNV hashes them to 950299920 and 1587996537. Like Knuth's hash function, FNV places single-letter words in adjacent spots (although there is an alternative version, FNV-1a, that avoids this problem). Here are the collisions in 32-bit space for FNV:

374764810:  diabolically koilanaglyphic
1055878936: deuteropathic vertebrosacral
1290893597: parer vila
1408982841: basiotribe narcotinic
1713658462: averral climatical
3129894270: Scorpididae transposer
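To make the comparison concrete, here is a Python sketch of the functions under discussion. The values quoted above for bat and cat under Knuth's hash are consistent with a plain multiply-by-37 accumulator starting from zero, so that is what mul37_hash implements; that this is exactly the function used in the tests is an inference from those two numbers. The FNV functions use the standard published 32-bit offset basis and prime. The sketch does not try to reproduce the FNV values quoted for bat and cat, since the variant and input framing behind them aren't specified; it is only meant to show the single-letter adjacency and how FNV-1a avoids it.

```python
MASK32 = 0xFFFFFFFF

# Standard 32-bit FNV parameters.
FNV_OFFSET_BASIS = 2166136261
FNV_PRIME = 16777619


def mul37_hash(word: str) -> int:
    """Multiply-by-37 string hash (inferred from the bat/cat values above)."""
    h = 0
    for byte in word.encode("ascii"):
        h = (h * 37 + byte) & MASK32
    return h


def fnv1(data: bytes) -> int:
    """FNV-1: multiply by the prime, then XOR in the next byte."""
    h = FNV_OFFSET_BASIS
    for byte in data:
        h = (h * FNV_PRIME) & MASK32
        h ^= byte
    return h


def fnv1a(data: bytes) -> int:
    """FNV-1a: XOR in the next byte, then multiply by the prime."""
    h = FNV_OFFSET_BASIS
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME) & MASK32
    return h


if __name__ == "__main__":
    print(mul37_hash("bat"), mul37_hash("cat"))  # 137867 139236
    # Single letters: the last step of FNV-1 is an XOR, so only the low
    # bits differ; FNV-1a multiplies last, which scatters them.
    for name, fn in (("FNV-1", fnv1), ("FNV-1a", fnv1a)):
        print(name, fn(b"a"), fn(b"b"))
```

The order of the XOR and the multiply is the whole difference between the two FNV variants, and it is why the single-letter clustering shows up in one and not the other.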

Bob Jenkins published an article in Dr. Dobb's Journal in 1997, providing a good hash function of his own, and has continued to tweak his code since. His page on hashing has lots of good stuff, including links to his code. His hash function is no slouch, and is the only one I looked at that maps single characters to radically different positions. Below are his 32-bit collisions, again with about the distribution you'd expect:

728135544: chorda fingerbreadth
733592810: stockily virginally
893264706: combaron unlimited
1456871225: gaspingly secularistic
1486736111: unbodied Yankee
2683815022: blackpoll Paharia
2947362466: Borinqueno unskewed
3298503807: distributress granulator

Thanks to Paul Hsieh's test program, here are some performance numbers for these different implementations (as benchmarked on my aging PowerBook G4):

FNVHash         :  3.9300s
knuthHash       :  2.9700s
BobJenkins      :  2.4600s
SuperFastHash   :  2.2800s

Running on some other architectures, I find that FNV and Knuth are really about the same (the difference between the two seems to be a G4 artifact). On the whole, although it may look like there's a big difference between the algorithms, in my experience the hash function (or even the whole hash table implementation!) isn't really the bottleneck. In other words, if you make your hash function twice as fast, usually no one will notice.

Paul Hsieh's SuperFastHash may be a tiny bit faster than Bob Jenkins's hash, but I think not enough to really stand out, and its strange collisions worry me. Bob Jenkins's hash function is probably the best and the one to use if you want an industrial-strength hash, but it is massive and complex. FNV may be slower, but it's short and sweet, just two mystery constants to remember. But if I have to write it myself, from memory, I'm still going to go with Knuth. Usually, Knuth's slightly odd pattern really won't matter.

For more, see Wikipedia's coverage of hash tables, which also has pretty good coverage of hash functions.

Friday 16 February 2007

Getting Backtraces with Standard ML

I still have a pretty good soft spot for Standard ML. Haskell may be sexier, but whenever I want to get something serious done, I find myself turning to SML.

One of my occasional claims for why I stick with ML is that you can actually debug SML programs (cf. Haskell, where laziness makes debugging "interesting" -- great if you want a research project).

But in practice, debugging in SML/NJ can actually be a pain. If you get an exception from one of the library functions, you may end up with an unhelpful error message like this one:

    uncaught exception Domain [domain error]
      raised at: Basis/Implementation/real64.sml:88.32-88.46

Without any sort of backtrace, you get no clue about where/how the exception was raised.

But it turns out that there is a feature in SML/NJ that lets you get a backtrace. It's just that it's barely documented at all!

It turns out that if you type:

    CM.make "$smlnj-tdp/back-trace.cm";
    SMLofNJ.Internals.TDP.mode := true;

when you first start SML, and then compile your code, any exception you hit will come with a backtrace.

Now, you'll see something more like:

    CALL   art.sml:52.7-52.55: Art.toIntensity[2]
              (from: art.sml:89.38-89.57: Art.emitGray[2].iz)
    CALL   art.sml:79.27-93.33: Art.emitGray[2]
              (from: art.sml:13.26-13.29: Art.for[2])
    GOTO   art.sml:10.7-13.45: Art.for[2]
              (from: art.sml:78.22-93.34: Art.emitGray[2])
    CALL   art.sml:77.19-93.34: Art.emitGray[2]
              (from: art.sml:13.26-13.29: Art.for[2])
    CALL   art.sml:10.7-13.45: Art.for[2]
              (from: art.sml:75.14-93.35: Art.emitGray[2])
    CALL   art.sml:64.7-98.7: Art.emitGray[2]
              (from: ???)
    CALL   art.sml:249.7-307.9: Art.doMix[2]
              (from: ???)
    
    uncaught exception Domain [domain error]
      raised at: Basis/Implementation/real64.sml:88.32-88.46

Cool. You've got to wonder though, why people would write a cool and useful feature like this and not clearly tell people about it.

Monday 12 February 2007

Use Google to * Yourself

In bored moments, I sometimes wish that Google had a "Just show me some random interesting thing" button in addition to its "I'm feeling lucky" button. It doesn't, but if you're after something relatively random and occasionally worthy of a chuckle, Google is still your friend.

Google's search facilities allow you to include wildcards in your searches. You can't just search for "*", but you can make seed phrases. It's often quite strange what the top hit is. Here are a couple, and their top hits as of today,

And so on...

Certainly a silly time waster, but hopefully you won't waste a whole decade.

Sunday 11 February 2007

Universal Binaries without XCode

OS X inherited fat binary technology from NextStep. Back in the NextStep days, the incantation was easy: you'd just add -arch i386 -arch ppc to all your compilation/linking/library commands and you'd be all set.

With OS X, Apple made it "even easier": just a check box in XCode. And for projects that still use things like Makefiles, they give you detailed instructions for Building an Open Source Universal Binary. Great, right? Not so much, because those instructions essentially tell you how to make XCode manage the whole build, which is, frankly, nuts.

If you try to go old school and pass -arch i386 -arch ppc to gcc, all seems fine until you try to link, at which point it dies horribly. Turns out that the "standard" developer libraries are thin, not fat. So, to link your program, you need to pass -syslibroot /Developer/SDKs/MacOSX10.4u.sdk to the linker to have it find some libraries with the proper amount of universal goodness. For a C++ project, the relevant incantation is

   g++ -Wl,-syslibroot,/Developer/SDKs/MacOSX10.4u.sdk -arch i386 -arch ppc

Why didn't they just say that? Maybe they were too embarrassed...

Understanding mdfind

I love Apple's Spotlight in theory, but in practice I hate the GUI implementation. On my 1.33 GHz G4 laptop, having it try to search while I'm still trying to type the first word is both painful and usually useless. And, to make matters worse, when using Spotlight from the Finder, it seems to crash the Finder about 50% of the time.

So, half the time I end up just using locate or find. But I keep thinking that I should really be using mdfind. But the manual page is less than helpful. It says

The query can be a string or a query expression.

which isn't very helpful because it doesn't tell you what valid query expressions look like.

Some people tell you to just do a search in the Finder, save it as a saved search/smart folder, and then go peek in ~/Library/Saved Searches. There you'll find evil XML (or binary) plist files, which are unreadable by normal people. But here's a useful trick to render them in human readable form (i.e., the old-style ASCII plist format). If the file ends with a .plist extension (e.g., ~/blah/foobar.plist), you can use defaults read ~/blah/foobar (note the lack of the .plist — the defaults command insists on adding it). If it doesn't end with .plist, you can make a temporary symlink that does.
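If the symlink dance gets tiresome, another way to peek inside is Python's standard plistlib module, which reads both the XML and the binary plist flavors. This is only a sketch: the function name is mine, and I'm not assuming anything about which keys a saved search contains, so it just pretty-prints whatever is there.

```python
import plistlib
import pprint
import sys


def load_saved_search(path: str) -> dict:
    """Parse a plist file (XML or binary) into ordinary Python objects."""
    with open(path, "rb") as fp:
        # plistlib.load detects the XML and binary formats on its own.
        return plistlib.load(fp)


if __name__ == "__main__":
    if len(sys.argv) != 2:
        sys.exit("usage: python show_plist.py <path-to-plist-file>")
    pprint.pprint(load_saved_search(sys.argv[1]))
```

Since it takes any path, the file doesn't need to end in .plist, so no temporary symlink is required.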

So, you can look at saved searches, but learning from examples only goes so far. And, you can use mdls on existing known files to find potential attributes to use in your search.

But what you really need is to know the query expression syntax, and the metadata attributes. Why they can't just tell you about these references on the mdfind man page I don't know.

This documentation is a good start, except that it is still fairly sparse. From what I can tell, the inRange operator doesn't work, or at least doesn't work on dates. This works to find the files I've changed in the last day:

   mdfind '(kMDItemFSContentChangeDate >= $time.today(-1)) && (kMDItemFSContentChangeDate < $time.now) && (kMDItemContentTypeTree = "public.content")'

but this one doesn't

   mdfind '(inRange(kMDItemFSContentChangeDate,$time.today(-1),$time.now)) && (kMDItemContentTypeTree = "public.content")'

The kMDItemContentTypeTree = "public.content" part is to weed out updated cache files and the like, although apparently Makefiles don't qualify as content (none of the importers recognize them, I guess), so they get weeded out too. sigh

Still, I'm closer than ever to weaning myself off locate and find. Maybe.

Fun with Lego Mindstorms NXT

C got me a Lego Mindstorms NXT for my birthday. A great birthday present is something you're pleased to have been given, but would never have bought for yourself, either because it's hard to justify or because you would never have thought of it. I think this qualified on the former count: I knew if I got one, it'd be a terrible time sink. But I forgot Bertrand Russell's quote that "The time you enjoy wasting is not wasted time". I've certainly enjoyed almost all the time I've wasted on NXT fun (even browsing the technical docs).

It seems to be developing quite a community. There are several blogs, including

There are also a ton of ways to program the thing. I've mostly used the provided NXT-G graphical programming environment and NXC.

It's also pretty amazing what people have managed to do with NXT-G. For example, I'd never have attempted something as complex as a radar display in what struck me as a fairly primitive and awkward language. I guess it's more capable than I thought, but I still think it's an insanely awkward way to express anything remotely complex.

I've built all the roaming robot designs that come with the set, but I like the basic TriBot best for versatility. I need to try doing some other designs too. LegoEdWest has build instructions for various straightforward variations on the original theme, such as Brian Davis's JennToo robot.

For coolness-factor, here's a Segway clone built using just the light sensor (and another cooler one), and a PackBot clone (which requires various extra parts to build).

And if you want more fun than what comes in the box, there are some really interesting hardware pieces on the horizon, including compass, acceleration, and gyro sensors, as well as input and output multiplexors. See

There sure is a lot going on here.

Saturday 10 February 2007

Cars, Priores/Priuses/Prii

So, C was idly looking at new car websites last night. We're both always frustrated by the fact that most cars are available in such bland colors. I mean, if you're going to spend hours researching your new car, spend more than $20,000 on it, and have to wait weeks or months for it to arrive, you ought to be able to get it in a range of colors at least as interesting as a $69 iPod.

Myself, I keep hoping that the rumored 2008 Prius will be totally overwhelmingly cool and come in some actual colors. But I won't hold my breath.

And the official plural of prius? Apparently it's Prius. Wow, how bland, what a surprise.

SIGCSE booked

So, I'm headed to SIGCSE. Registering on the last day of early registration is pretty dumb since all the conference hotels are pretty-much full, and flights are about $550. On the other hand, if I'd waited even longer, I would have felt even dumber.

There are lots of interesting things on the program.

I'll have to try to find out more about the vicinity (Covington, Kentucky).

Here goes nothing...

So, apparently my domain now has blogging support. What does this mean in practice...? I don't know, but I've had this domain for years, doing pretty-much nothing. Maybe I should try to actually use it for something.

So, as an experiment, I'm going to try seeing if I can use it as a (rather public) way to record things I think are interesting.