28 Jan 2006

opera mini browser for your mobile rocks!

if you've browsed the web before on your mobile, you'll know how slow and frustrating it is. that is because no good ui designers wrote software for phones. finally, opera has applied some of their know-how to pioneer the mobile web browser market.

the opera mini browser really does rock. i've been using it to browse all sorts of things like bbc news and god forbid, liquidx.net and it works great. i was thinking about adding a wml version of my page just for fun, but it looks like this browser is good enough to use verbatim on my site. i believe it funnels the traffic through some sort of rewriter that shrinks all images.

[image: opera mini screenshot of google]

if you go to the bbc website, all the images are shrunk down so they look good enough on your phone, but are small enough that they won't bankrupt you. i think the real test would be to use it to browse fark.com or flickr. and reportedly javascript works too, but no, i don't want script.aculo.us stuff on my phone!

anyway, the point is, if you've been skeptical about web browsers on your phone, you should give opera mini a try!


28 Jan 2006

important things to note about lighttpd

here are some things you need to know about lighttpd, based on my experience with 1.4.8.

1. it has an incredibly small memory footprint. 2M! it has great facilities to load balance things across multiple servers.
2. its configuration is very plain and simple. no bloody apache "UnbreakMe Yes" options everywhere.
3. webdav support is not quite there yet: it doesn't work with mac os x tiger's finder. actually, it is disastrous to use finder with lighttpd. if you turn on writable, finder will endlessly request properties that the server doesn't respond to correctly.
4. if you use fastcgi to self spawn a cgi handler, it is very likely to fail and turn into an infinite loop disaster in the logs! i believe this is a known bug and fixed in 1.4.9.

otherwise, it's just been great! hopefully they keep this up, i'm glad i've started making the switch. once i convince all the others on the colo to switch over to lighttpd, the world will be a better place! hahaha


28 Jan 2006

status update: mostly linux geekery

been quite busy recently, with my weird sleeping patterns, distractions and changes in my life -- there's not been too much time.

upgrade to lighttpd

the server that hosts this blog and all my other webby stuff has been up and down. that's because apache has finally decided to gobble up all the memory and turn the server into a memory swapping party. turns out that enabling mod_python and serving 4 django projects off apache wasn't such a good idea. each apache process was taking nearly 17M RSS.

i started migrating all my django projects, which include my blog, my own personal feedfilter and two private projects, plus a trac installation, to lighttpd. i chose lighttpd because i've heard a lot of good things about it, and it is also written by the same guy who wrote modlogan, a web log analyzer that i used for many years, all the way back in my CSE days (that's 4-5 years ago)!

turns out it was a good choice, lighttpd is not as mature as apache, but it does simple things very well. the server is not multiprocess, so it runs very light and fast, weighing in at 2M RSS total (compare that to nearly 100M apache was consuming with mod_python, mod_php with 5 processes). the flip side is that all your services have to run outside of the server. with 4 django projects and 1 trac project, you're looking at 5 python servers, each with 2 processes, clocking in just under 60M.

nice thing with lighttpd is it is so easy to understand where things are going, not like apache. i know that the request comes in, goes thru a bit of rewriting and out to the python servers using fastcgi. ideally i'd like to get rid of apache altogether, but right now it is still the front end to the servers and it proxies requests to lighttpd. also there are just some rewrite rules that don't work well under lighttpd, plus mod_security is an indispensable spam filter. other than that, there's no reason to continue to stick with apache.

things i learnt while tuning the server

1. mod_python adds about 4-6M to the RSS of an apache process.
2. mod_php adds around 6-7M to the RSS of an apache process.
3. mysql 4.0 leaks memory for some reason, restarting it once in a while helps.
4. postgresql is as stable and lean as a rock. i'm very impressed with it.
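5. using django in fastcgi means around 8-10M RSS per process, depending on how many databases you use or how complicated your web app is.
6. if you use mod_proxy with ProxyPass, MAKE SURE YOU USE ProxyPassReverse too. otherwise redirects sent by the backend will point at the backend's own address instead of the public one.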
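
to make point 6 a bit more concrete, here's a rough python sketch of the kind of rewriting ProxyPassReverse does on a redirect coming back from the backend. this is only an illustration, not apache's actual code: the function name, hostnames and port are all made up.

```python
def rewrite_location(location, backend_base, public_base):
    """Map a backend redirect URL back onto the public URL space.

    Hypothetical helper that approximates what ProxyPassReverse does to
    the Location header; it is not apache's implementation.
    """
    if location.startswith(backend_base):
        return public_base + location[len(backend_base):]
    # redirects pointing anywhere else are left untouched
    return location


# assumed setup: apache on example.org proxies /blog/ to a backend on port 8080
BACKEND = "http://127.0.0.1:8080/"
PUBLIC = "http://example.org/blog/"

# the backend redirects using its own internal address
redirect = "http://127.0.0.1:8080/2006/01/"

fixed = rewrite_location(redirect, BACKEND, PUBLIC)
print(fixed)
```

without that rewrite, the browser would be told to go to 127.0.0.1:8080 directly, which it can't reach from outside.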

final note

the server is much more responsive now, plus the fine people at bytemark.co.uk, who host our server, have just doubled the bandwidth (luckily, because i ran over last month!) and increased ram and disk space. damn, server tuning is addictive.

oh yeah, i have my own fastcgi launcher scripts that launch django and trac projects, inspired by the django fastcgi lighttpd howto. if anyone wants them, i might put them online.


24 Jan 2006

woodbridge fags, for men.

i found this photo in my cameraphone. i took it on christmas eve in a pub in woodbridge.

[image: Woodbridge Fags]

it's da man's fag. this picture was most likely taken under the influence.


23 Jan 2006

spotlight limitations

yet another metadata related post. if you're bored of this topic, then i suggest you stop reading here.

ok, so i'm continually running into the same limitations that people have been running into since mac os x 10.4 came out back in mid-2005. the most recent limitation is related to the way spotlight importers work. for those who don't know, spotlight importers are system level plugins that are called whenever spotlight thinks it needs to update the metadata for a file.

the way it works is that you register for a file type you would like to handle, for instance, the UTI com.adobe.pdf would allow you to generate metadata for PDF files. file types are defined in a class hierarchy graph where com.adobe.pdf is actually a subtype of public.content which in turn is a subclass of public.data. so being naive, i registered to handle all files that have type public.data, meaning that any file with data (i.e. everything) will use my importer.

limitation 1. it doesn't actually work like that. firstly, spotlight calls one and only one importer, working from the most specific importer down to the most generic one. so in my case, my importer won't get run for any PDF files. the obvious workaround is then to register for all the types i know and exhaustively list them.

limitation 2. if there are two importers claiming the same file type (UTI), then spotlight only runs one of them. so in the case of the creative-commons spotlight importer for mp3 files, installing it killed the mac os x default audio importer for mp3s. hence, if you installed that, you'd lose all your ID3 tags in the spotlight search.
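
here's a toy python model of the two limitations, just to show the behaviour i'm describing. it is a sketch under assumptions: the type hierarchy is simplified to the chain mentioned above (the public.mp3 and public.audio entries are my own addition for the example), the importer names are hypothetical, and how spotlight really resolves importers internally is a guess.

```python
# toy UTI hierarchy: each type maps to its parent (None for the root).
# simplified for illustration; the real graph is larger.
PARENT = {
    "com.adobe.pdf": "public.content",
    "public.mp3": "public.audio",
    "public.audio": "public.content",
    "public.content": "public.data",
    "public.data": None,
}


def pick_importer(uti, importers):
    """Walk from the most specific type upwards; the first importer found wins."""
    while uti is not None:
        if uti in importers:
            return importers[uti]
        uti = PARENT.get(uti)
    return None


# hypothetical importer names; the dict allows only one importer per UTI
importers = {
    "com.adobe.pdf": "PDF.mdimporter",
    "public.data": "MyTagger.mdimporter",
}

# limitation 1: the generic importer never sees PDFs
pdf_choice = pick_importer("com.adobe.pdf", importers)
mp3_choice = pick_importer("public.mp3", importers)

# limitation 2: a second importer for the same UTI displaces the first
importers["public.mp3"] = "DefaultAudio.mdimporter"
importers["public.mp3"] = "CreativeCommons.mdimporter"
mp3_after = pick_importer("public.mp3", importers)

print(pdf_choice, mp3_choice, mp3_after)
```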

i'm not the first one to run into these limitations. a couple of people have already done so. one of them has been working on a similar problem of tagging all files on the file system, he is calling it SpotMeta. he seems much further along than i am. he's even gone as far as to add the missing features that apple should have added, such as allowing stacking of spotlight importers. i am guessing he does this by running a separate server to update metadata in parallel to the spotlight server (mds), running those so called spotlight importers and injecting their results into the server using the undocumented [NSMetadataItem setAttribute:forKey:] call.

so the conclusion is that tagging can't be done quite as cleanly as one thought. even with all the power that spotlight has, there are just some small niggling limitations. i suppose that is expected for a 1.0 try, i hope that mac os x 10.5 will really unlock the potential of spotlight by allowing some of these things that we all expect.

in the meantime, i can probably get by abusing the finder comment field which seems to be stored both in the extended attributes of the file system and in the metadata server. [UPDATE: whoops - i jumped the gun a little, finder only stores creator types in that space, not the spotlight comments.] i'll probably find some other way to get around this sooner or later, just that it kills me to know that apple is going to fix this in 12 months time -- the choice is either to wait 12 months or figure out a hack now. grr.

btw, here's a quick peek at some cool interface stuff i've been doing with a docked desktop HUD (kinda like those new ones in iphoto 6):

[image: Violator Sneak]

click on the image for a short video (13K) on how it moves. the video is a bit jerky, unfortunately.


17 Jan 2006

pros and cons of using preview to keyword files

i still haven't given up on my idea to tag files on my mac.

i've been noticing 10.4's preview.app's ability to keyword particular types of files such as PDF and JPEG. with JPEG, what it does is pretty standard: preview adds IPTC tags to the actual file! yes, it's actually doing the correct thing, unlike iphoto. to try it, fire up preview and open an image. in my case, it's a photo taken at the apple store in london on new year's eve :)

[sidenote: no, they don't let you countdown for new years at the apple store .. :P]

[image: Preview Tag]

so you get there by pressing Cmd+I. when you save the file and look in finder, the file has the proper spotlight keywords attribute -- as expected.

[image: Preview Getinfo]

so notice the keyword field is there, and also the email address of one of the computers in the apple store, if you were so inclined to give those internet leechers some email :) if you drag that into iphoto, iphoto actually recognises those keywords. i'm not sure whether this is an iphoto 6 development or what, but it does get imported.

checking the JPEG out with libiptcdata, it shows that the keywords are indeed in there embedded using IPTC:

[image: Iptc Info]

[note: i couldn't get the keyword information out using exiv2's command line tool, even though it claims to extract IPTC information.]

so seems like preview is doing the right thing, which is nice. so somewhere inside preview's code, it actually deals with IPTC code.

however, things aren't so rosy for PDFs. i noticed the same keyword panel for PDFs, but do not use it for PDFs! if you use it and then save the file, you will lose your table of contents and maybe some other useful data that is in the formatting. why?

because preview is using PDFKit to set the document attributes, via [PDFDocument documentAttributes]. not only does it strip the table of contents off the PDF file, if it had any, it also grows the file size by 20-50% depending on how the file was made. it seems like apple's own PDF generation is pretty lossy.

i haven't found anyone tackling this issue of preserving the PDF file contents but altering the metadata. is it that difficult to do? there are some ways to get the metadata out, such as using libextractor (i only realised because there is a security vulnerability for exactly this feature!) but no one seems to have written a libexif or libiptcdata type library to edit the data inside but preserve the general structure of the PDF.

to go back to the first paragraph. this is all just messing around. i've rediscovered the joys of extended attributes after some prodding. xattr is part of the 10.4 BSD layer, and that means we can store all sorts of goodies in there with a nice API. my new approach to tagging files will be to store the tags in the extended attributes space (allows for up to 4K of data -- plenty if you're only storing text tags and some author, title information) and then write a spotlight importer to simply extract those things and put them into the spotlight database. simple. so simple that i think someone might have done it already. if not, i'm going to take a swipe at getting it working. this definitely beats dealing with mac os x's crazy mds (metadata server).
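
a minimal python sketch of the storage side of that idea, assuming the tags are serialised as JSON (my choice for the example, nothing requires it) and kept under one attribute. the attribute name is hypothetical, and a plain dict stands in for the file system here so the example runs anywhere; on a real machine the two marked lines would be the xattr set/get calls for your platform.

```python
import json

MAX_XATTR_BYTES = 4096          # the roughly 4K limit mentioned above
TAG_ATTR = "net.example.tags"   # hypothetical attribute name


def encode_tags(tags):
    """Serialise tags to bytes small enough to fit in one extended attribute."""
    data = json.dumps(sorted(set(tags)), ensure_ascii=False).encode("utf-8")
    if len(data) > MAX_XATTR_BYTES:
        raise ValueError("tags need %d bytes, limit is %d" % (len(data), MAX_XATTR_BYTES))
    return data


def decode_tags(data):
    """Inverse of encode_tags; this is the part an importer would call."""
    return json.loads(data.decode("utf-8"))


# stand-in for the file system: maps (path, attribute name) to bytes
fake_xattrs = {}


def set_tags(path, tags):
    fake_xattrs[(path, TAG_ATTR)] = encode_tags(tags)   # real code: xattr write


def get_tags(path):
    data = fake_xattrs.get((path, TAG_ATTR))            # real code: xattr read
    return decode_tags(data) if data is not None else []


set_tags("/tmp/photo.jpg", ["london", "apple store", "london"])
tags = get_tags("/tmp/photo.jpg")
print(tags)
```

duplicates are dropped and the tags are sorted before saving, so the stored bytes don't depend on the order the tags were typed in.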

for those who want to know more about xattr, check out the ars technica description or the extremely simple xattr python module by Bob Ippolito.


17 Jan 2006

creation time of bundled mac applications

i just noticed something funny today: the creation time of bundled applications on mac os x is 1st april 1976, which is also the date apple computer was founded. steve jobs mentioned in his keynote that this year is the 30th anniversary of apple computer.


there's your pointless fact of the day. back to work now.


16 Jan 2006

ilife 06 not for G3s

did anybody else notice that ilife 06 doesn't support G3s?


14 Jan 2006

hyperwrt on linksys wrt54g

where i live, we've got a very hackable linksys wrt54g. recently i came across a page on utorrent.com about problems that might occur with wrt54g's default firmware when using their software.

that wasn't too interesting, but it had links to installation instructions for hyperwrt and dd-wrt. i tried hyperwrt+tofu because they had a nicer homepage and the instructions were clearer. turns out after 10 minutes and 2 reboots, i can now telnet into my wrt54g and also have a full source tree where i can fiddle around. it is a little different from openwrt because i believe openwrt is built from the ground up whereas this is just patches against the official linksys source.

# uname -a
Linux teh 2.4.20 #1 Sun Jan 8 09:19:12 PST 2006 mips unknown
# cat /proc/cpuinfo
system type : Broadcom BCM947XX
processor : 0
cpu model : BCM3302 V0.7
BogoMIPS : 199.47

very cool! now i have to think of what i should run on it.


12 Jan 2006

album art widget 2.5 - chinese support update

what started off as a quick exercise to throw out a universal binary update (in light of the actual release of intel macs) turned into a bug fix release, specifically for the yesasia portion of the album art fetching. the new version contains only three changes:

1. universal binary/widgetplugin. so in theory should work on intel macs, but i don't have one to test it on (*hint hint*)

2. improves searching for chinese artists on yesasia by fixing my stupid assumption that the chinese characters appear before any english name in the artist name. now it's much smarter about that.

3. yesasia searching is now working again.
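
for the curious, the idea behind change 2 can be sketched in python like this. this is not the widget's actual code, just an illustration of picking out the chinese characters wherever they sit in the artist name; the function names and the sample artist names are my own.

```python
def is_cjk(ch):
    """True for characters in the main CJK Unified Ideographs block."""
    return "\u4e00" <= ch <= "\u9fff"


def chinese_part(artist):
    """Return the chinese characters of an artist name, wherever they appear."""
    return "".join(ch for ch in artist if is_cjk(ch))


def english_part(artist):
    """Return whatever is left, with brackets and extra spaces tidied up."""
    rest = "".join(" " if is_cjk(ch) else ch for ch in artist)
    return " ".join(rest.replace("(", " ").replace(")", " ").split())


# chinese first, chinese last, and chinese in brackets
names = ["王菲 Faye Wong", "Faye Wong 王菲", "Jay Chou (周杰倫)"]
results = [(chinese_part(n), english_part(n)) for n in names]

for chinese, english in results:
    print(chinese, "/", english)
```

only the main ideograph block is checked here, so rarer characters outside it would be missed.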

i should use this opportunity to thank PP for helping me beta test every release before it hits the server. and every time she uncovers some critical bug or crappy design decision i made that turns out to be immensely useful! ;)

again, download the new version from the album art widget page.
