23 Jan 2006

spotlight limitations

yet another metadata related post. if you're bored of this topic, then i suggest you stop reading here.

ok, so i'm continually running into the same limitations that people have been running into since mac os x 10.4 came out back in april 2005. the most recent limitation is related to the way spotlight importers work. for those who don't know, spotlight importers are system level plugins that are called whenever spotlight thinks it needs to update the metadata for a file.

the way it works is that you register for a file type you would like to handle. for instance, registering for the UTI com.adobe.pdf would allow you to generate metadata for PDF files. file types are defined in a class hierarchy graph where com.adobe.pdf is a subtype of public.data (and, via public.composite-content, of public.content). so being naive, i registered to handle all files that have type public.data, meaning that any file with data (i.e. everything) will use my importer.

limitation 1. it doesn't actually work like that. spotlight calls one and only one importer per file, and it works from the most specific importer down to the most generic one. so in my case, my importer won't get run for any PDF files, because a more specific PDF importer already exists. the obvious workaround is then to register for all the types i know and exhaustively list them.
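here's a toy python model of that lookup, just to make the behaviour concrete. it is only a sketch of what i described above, not how mds is actually implemented: the conformance table is hand-written and simplified, and the names pick_importer, MyTagger.mdimporter, PDF.mdimporter and com.example.notes are all made up for illustration.

```python
from collections import deque

# simplified, hand-written conformance table (an assumption, not apple's real data)
CONFORMS_TO = {
    "com.adobe.pdf": ["public.data"],
    "com.example.notes": ["public.data"],  # hypothetical type nobody else handles
    "public.data": ["public.item"],
    "public.item": [],
}

# which importer registered for which UTI (importer names are hypothetical)
IMPORTERS = {
    "com.adobe.pdf": "PDF.mdimporter",
    "public.data": "MyTagger.mdimporter",
}


def pick_importer(uti, importers, conforms_to):
    """Return the one importer chosen for `uti`, searching specific to generic."""
    queue = deque([uti])
    seen = set()
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.add(current)
        # the first (most specific) match wins; nothing more generic gets a turn
        if current in importers:
            return importers[current]
        queue.extend(conforms_to.get(current, []))
    return None


if __name__ == "__main__":
    print(pick_importer("com.adobe.pdf", IMPORTERS, CONFORMS_TO))
    print(pick_importer("com.example.notes", IMPORTERS, CONFORMS_TO))
```

in this model the generic public.data importer only ever sees files that no more specific importer has claimed, which is exactly why registering for public.data doesn't give you "everything".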

limitation 2. if there are two importers claiming the same file type (UTI), then spotlight only runs one of them. so in the case of the creative-commons spotlight importer for mp3 files, installing it killed the mac os x default audio importer for mp3s. hence, if you installed that, you'd lose all your ID3 tags in the spotlight search.
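a second toy sketch in python, this time for the clash between two importers on the same UTI. again this is only a model under assumptions: which importer wins is not something i know (here i just take the last one registered), the extractor functions return canned values, and the function names and the example metadata are made up. it contrasts the "one winner" behaviour with the stacking i'd like to have.

```python
def default_audio_extract(path):
    # stands in for the system audio importer reading ID3 tags (canned values)
    return {"kMDItemTitle": "some song", "kMDItemAuthors": ["some artist"]}


def cc_license_extract(path):
    # stands in for a third-party importer that only knows about licensing
    return {"kMDItemRights": "creative commons by-sa"}


# registration order is an assumption; both claim public.mp3
CLAIMS = [
    {"name": "system audio", "uti": "public.mp3", "extract": default_audio_extract},
    {"name": "cc importer", "uti": "public.mp3", "extract": cc_license_extract},
]


def run_single(claims, uti, path):
    """Model of the current behaviour: exactly one claimant runs."""
    matching = [c for c in claims if c["uti"] == uti]
    if not matching:
        return {}
    # assumption: the last registered importer wins
    return matching[-1]["extract"](path)


def run_stacked(claims, uti, path):
    """Model of the wished-for behaviour: every claimant runs, results merge."""
    merged = {}
    for claim in claims:
        if claim["uti"] == uti:
            merged.update(claim["extract"](path))
    return merged


if __name__ == "__main__":
    print(run_single(CLAIMS, "public.mp3", "song.mp3"))
    print(run_stacked(CLAIMS, "public.mp3", "song.mp3"))
```

with run_single the title and artist never make it into the result, which is the "lost ID3 tags" problem; with run_stacked both sets of attributes survive.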

i'm not the first one to run into these limitations. a couple of people have already done so. one of them has been working on a similar problem of tagging all files on the file system, and he is calling it SpotMeta. he seems much further along than i am. he's even gone as far as to add the missing features that apple should have added, such as allowing stacking of spotlight importers. i am guessing he does this by running a separate server to update metadata in parallel to the spotlight server (mds), running those so-called spotlight importers himself, and injecting their results into the server using the undocumented [NSMetadataItem setAttribute:forKey:] call.

so the conclusion is that tagging can't be done quite as cleanly as one thought. even with all the power that spotlight has, there are just some small niggling limitations. i suppose that is expected for a 1.0 try. i hope that mac os x 10.5 will really unlock the potential of spotlight by allowing some of these things that we all expect.

in the meantime, i can probably get by abusing the finder comment field which seems to be stored both in the extended attributes of the file system and in the metadata server. [UPDATE: whoops - i jumped the gun a little, finder only stores creator types in that space, not the spotlight comments.] i'll probably find some other way to get around this sooner or later, it's just that it kills me to know that apple is going to fix this in 12 months' time -- the choice is either to wait 12 months or figure out a hack now. grr.

btw, here's a quick peek at some cool interface stuff i've been doing with a docked desktop HUD (kinda like those new ones in iphoto 6):

Violator Sneak

click on the image for a short video (13K) on how it moves. the video is a bit jerky, unfortunately.
