18 Mar 2004

orkut analysis

last night i hacked up a python script to leech the network of friends i have on orkut. using 2 levels of friends, i.e. counting my friends and their friends, i'm apparently linked to about 294 unique people. so that makes me wonder how orkut can claim that i'm linked to 140000 people. either they're not all unique, or on average each of those 294 people has about 476 unique friends of their own (140000 / 294). either way, that number on orkut is not to be trusted.
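a minimal sketch of the two-level count, assuming the friend lists have already been scraped into a plain dict. the function name and the toy data are made up for illustration; this is not the actual script.

```python
from collections import deque

def people_within(graph, start, max_depth=2):
    """Return the set of unique people reachable from start in at most max_depth hops."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, depth = queue.popleft()
        if depth == max_depth:
            continue  # don't expand friends beyond the requested level
        for friend in graph.get(person, ()):
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, depth + 1))
    seen.discard(start)  # don't count myself
    return seen

# hypothetical toy network: person -> list of friends
toy = {
    "me": ["ann", "bob"],
    "ann": ["me", "bob", "cat"],
    "bob": ["me", "ann", "dan"],
    "cat": ["ann", "eve"],
}

reach = people_within(toy, "me")
print(len(reach), "unique people within 2 levels")

# the sanity check on orkut's number: friends each person would need on average
print(round(140000 / 294))
```

keeping a `seen` set is what makes the count "unique": bob shows up both as my friend and as ann's friend, but is only counted once.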

anyway, so using graphviz (pixelglow's graphviz for macosx), i plotted my network of friends and it's pretty interesting. within my network, i'm not the one with the most friends. in the picture, there's a weird centre where i think many people are massively linked to each other. i'm on the outskirts of the graph. if you plot a hierarchical graph, you can see how there are branches of friends who are related either geographically or socially. for instance, there's a huge group of gentoo developers in the middle, a bunch of japanese people in another cluster, and some cambridge people in a weak cluster.
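for the curious, generating a dot file from that kind of data can be sketched like this. it assumes the same dict-of-friend-lists shape as a hypothetical input, and the function name is made up; the real scripts may look nothing like it.

```python
def to_dot(graph, name="friends"):
    """Build an undirected graphviz dot string from a dict of person -> friends."""
    edges = set()
    for person, friends in graph.items():
        for friend in friends:
            if person != friend:
                # sort each pair so a--b and b--a collapse into one edge
                edges.add(tuple(sorted((person, friend))))
    lines = [f"graph {name} {{"]
    for a, b in sorted(edges):
        # names are quoted; names containing double quotes would need escaping
        lines.append(f'    "{a}" -- "{b}";')
    lines.append("}")
    return "\n".join(lines)

# hypothetical toy data
toy = {"me": ["ann"], "ann": ["me", "bob"]}
print(to_dot(toy))
```

saving the output as `friends.dot` and running `dot -Tpdf friends.dot -o friends.pdf` should give the hierarchical layout, while `neato` with the same flags gives a spring-style layout.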

the graph takes about a minute to render on my powerbook 1GHz, depending on options. here is a much better PDF graph which you can zoom in on. this is just a toy thing i'm playing with as i'm really interested in visualising social networks. note that orkut's policy is that you shouldn't be using bots on the site, but i'm doing it for my personal research and not to leech everyone's information, even tho that is pretty easy.

and if someone who admins orkut sees this, there's a malformed HTML table in the FriendsList.aspx code.

i'll think about releasing the code once i get a chance to optimise it a bit, right now it's just a bunch of hacked up python scripts generating graphviz dot files. i'm thinking of the next thing i can do with this, maybe draw everyone's faces on graphviz (i wonder if that is possible), or index community linkages. doing this makes me think of some cool ways to visualise the data. like if orkut included GPS or lat/long coordinates, we could plot this on a world map. another would be to see which people are linked to you via communities, instead of just friends. another would be to colour the links depending on how you rank people, or whether they've written a testimonial. weighted links is the term i'm looking for. make communities a "friend" so that you can be linked to people via a community. there's heaps of stuff you can do on a social networking site, it's quite fun.
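a rough sketch of what weighted links and communities-as-"friends" could look like in dot output. the weights, the threshold of 3, the names and the data shapes are all invented for illustration; `penwidth`, `color`, `shape` and `style` are standard graphviz attributes.

```python
def weighted_dot(friend_links, memberships):
    """friend_links: {(a, b): weight}, memberships: {community: [members]}."""
    lines = ["graph network {"]
    # communities become nodes of their own, drawn as boxes
    for community in sorted(memberships):
        lines.append(f'    "{community}" [shape=box];')
    # friend links: thicker and red when the (made up) weight is high
    for (a, b), weight in sorted(friend_links.items()):
        colour = "red" if weight >= 3 else "gray"
        lines.append(f'    "{a}" -- "{b}" [penwidth={weight}, color={colour}];')
    # membership links are dashed so they stand apart from friendships
    for community, members in sorted(memberships.items()):
        for member in sorted(members):
            lines.append(f'    "{member}" -- "{community}" [style=dashed];')
    lines.append("}")
    return "\n".join(lines)

# hypothetical data: weight could come from a ranking or a testimonial
links = {("ann", "me"): 3, ("bob", "me"): 1}
groups = {"gentoo": ["ann", "bob"]}
print(weighted_dot(links, groups))
```

with communities as nodes, ann and bob end up linked through "gentoo" even if they aren't friends with each other directly.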

i wish orkut would open up their api so that we can all play around with that data rather than each having to write our own bot to leech stuff.
