Graphing Your Taste in Music

I came up with this idea a few days ago while looking at the Amazon Web Service. If I exported my iTunes song list (File > Export Song List) and passed each artist/album combination into the Amazon Web Service “Similarity Lookup” function, I’d be able to build a graph of my musical taste. Such a graph could help you work out which albums to buy next: draw the albums you don’t yet own as differently coloured nodes, and the ones that form many ‘edges’ with albums you already own are the likely candidates. As a starting point, this post discusses a Perl script which takes an iTunes song dump, feeds it through the Amazon Web Service, and writes the result out in an XML format which can be read by Prefuse (an interactive visualization toolkit).
While parsing the iTunes dump line by line, we need to get the Amazon ASIN for each album. We query Amazon for the best match to our ID3 tags and save the result into a global hash. The following method does this.
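To make the recommendation idea concrete, here is a minimal sketch of how candidate albums could be ranked once the similarity data exists. It assumes a hash mapping each owned ASIN to the list of ASINs Amazon calls similar; that layout, the function name rank_candidates, and the ASINs shown are all hypothetical, not taken from the real script.

```perl
use strict;
use warnings;

# Hypothetical data shape: owned ASIN => list of ASINs Amazon calls similar.
# rank_candidates counts, for every ASIN we do not own, how many owned
# albums point at it. More incoming edges means a stronger recommendation.
sub rank_candidates {
    my ($similar) = @_;            # hash ref: owned ASIN => array ref
    my %votes;
    for my $owned (keys %$similar) {
        my %seen;                  # count each owned->candidate edge once
        for my $asin (@{ $similar->{$owned} }) {
            next if exists $similar->{$asin};   # already owned
            next if $seen{$asin}++;
            $votes{$asin}++;
        }
    }
    # Highest count first; ties broken by ASIN so the order is stable.
    return sort { $votes{$b} <=> $votes{$a} || $a cmp $b } keys %votes;
}

# Made-up ASINs, purely for illustration.
my %similar = (
    'OWNED0001' => [ 'OWNED0002', 'CAND00001', 'CAND00002' ],
    'OWNED0002' => [ 'CAND00001' ],
    'OWNED0003' => [ 'CAND00001', 'CAND00002' ],
);
print join(', ', rank_candidates(\%similar)), "\n";
```

The unowned albums at the front of the returned list are the ones that would create the most edges to owned nodes in the graph.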
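use LWP::Simple;    # provides get()
use XML::XPath;
use URI::Escape;    # provides uri_escape()

# Look up the best Amazon match for an artist/album pair and store the
# matched artist, title and ASIN in the hash referenced by $result.
sub get_album {
    my $artist = uri_escape(shift);   # escape spaces, '&', etc. for the URL
    my $album  = uri_escape(shift);
    my $result = shift;
    my $content = get("http://webservices.amazon.com/onca/xml?" .
        "Service=AWSECommerceService&SubscriptionId=" . KEY . "&" .
        "Operation=ItemSearch&SearchIndex=Music&Artist=$artist&" .
        "Title=$album&Version=2005-03-23");
    return unless defined $content;   # get() returns undef on failure
    my $xp = XML::XPath->new(xml => $content);
    if ( ! $xp->find("//Error") ) {
        my $item = "/ItemSearchResponse/Items/Item[1]";   # first (best) match
        $$result{ARTIST} = $xp->findvalue("$item/ItemAttributes/Artist")->string_value;
        $$result{ALBUM}  = $xp->findvalue("$item/ItemAttributes/Title")->string_value;
        $$result{ASIN}   = $xp->findvalue("$item/ASIN")->string_value;
    }
}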
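The line-by-line parsing that feeds get_album is not shown above, so here is a rough sketch of what it might look like. It assumes the export is tab-separated text whose first line names the columns, including "Artist" and "Album"; the function name album_pairs is made up for this example. Locating the columns from the header also means the header line never reaches Amazon as a query.

```perl
use strict;
use warnings;

# Assumed layout: tab-separated text whose first line names the columns,
# including "Artist" and "Album". Returns unique [artist, album] pairs.
sub album_pairs {
    my ($text) = @_;
    my @lines = grep { /\S/ } split /\r\n|\r|\n/, $text;
    return unless @lines;
    my @header = split /\t/, shift @lines;
    my %col;
    @col{@header} = (0 .. $#header);
    die "no Artist/Album column\n"
        unless defined $col{Artist} && defined $col{Album};

    my (%seen, @pairs);
    for my $line (@lines) {
        my @field = split /\t/, $line, -1;   # -1 keeps trailing empty fields
        my ($artist, $album) = @field[ $col{Artist}, $col{Album} ];
        next unless defined $artist && defined $album;
        next unless length $artist && length $album;
        next if $seen{"$artist\t$album"}++;  # one lookup per album, not per song
        push @pairs, [ $artist, $album ];
    }
    return @pairs;
}

# Tiny stand-in for an exported song list.
my $sample = join "\n",
    "Name\tArtist\tComposer\tAlbum",
    "Clocks\tColdplay\t\tA Rush of Blood to the Head",
    "The Scientist\tColdplay\t\tA Rush of Blood to the Head",
    "One\tU2\t\tAchtung Baby";
print "$_->[0] / $_->[1]\n" for album_pairs($sample);
```

De-duplicating here keeps the number of Amazon requests down to one per album rather than one per song.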
Once you’ve got the ASIN for an album, you can use Amazon to perform a similarity lookup, which returns a list of ASINs deemed similar to the one provided. This information also gets placed in our hash. The following method does the job…
# Append the ASINs (up to 20) that Amazon considers similar to $asin onto
# the array referenced by $related.
sub get_related {
    my $asin    = shift;
    my $related = shift;    # array reference supplied by the caller
    my $content = get("http://webservices.amazon.com/onca/xml?" .
        "Service=AWSECommerceService&SubscriptionId=" . KEY .
        "&Operation=SimilarityLookup&ItemId=$asin");
    return unless defined $content;   # get() returns undef on failure
    my $xp = XML::XPath->new(xml => $content);
    if ( ! $xp->find("//Error") ) {
        for (my $i = 1; $i <= 20; $i++) {
            my $item = "/SimilarityLookupResponse/Items/Item[$i]";
            last if ! $xp->find($item);   # ran out of similar items
            push @$related, $xp->findvalue("$item/ASIN")->string_value;
        }
    }
}
Having built a hash of all the album relations, I export the hash back out in the XML format required for Prefuse. I won’t inline that source in this post, but the full source for the application is available for download below. Whilst I’d really like to enable the Java applet on this blog some time soon, for now I’m showing some screenshots of the applet and looking at the results.
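For readers who want a feel for the export step without opening the download, here is a sketch under stated assumptions. It writes GraphML, which later Prefuse releases can read through their GraphML reader; the format the real script emits is not shown in this post and may differ. The hash layout (ASIN => { ARTIST, ALBUM, RELATED }), the function names, and the ASINs are all hypothetical.

```perl
use strict;
use warnings;

# Escape the five characters that are special in XML text and attributes.
sub xml_escape {
    my ($s) = @_;
    my %ent = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;',
               '"' => '&quot;', "'" => '&apos;');
    $s =~ s/([&<>"'])/$ent{$1}/g;
    return $s;
}

# $albums: hash ref, ASIN => { ARTIST, ALBUM, RELATED => [ASINs] }.
# This layout is an assumption; the post does not show the real hash.
sub to_graphml {
    my ($albums) = @_;
    my $out = qq{<?xml version="1.0" encoding="UTF-8"?>\n}
            . qq{<graphml xmlns="http://graphml.graphdrawing.org/xmlns">\n}
            . qq{  <key id="label" for="node" attr.name="label" attr.type="string"/>\n}
            . qq{  <graph edgedefault="undirected">\n};
    for my $asin (sort keys %$albums) {
        my $id    = xml_escape($asin);
        my $label = xml_escape("$albums->{$asin}{ARTIST} - $albums->{$asin}{ALBUM}");
        $out .= qq{    <node id="$id"><data key="label">$label</data></node>\n};
    }
    my %done;
    for my $from (sort keys %$albums) {
        for my $to (@{ $albums->{$from}{RELATED} || [] }) {
            next unless exists $albums->{$to};   # keep owned albums only
            next if $from eq $to;
            # Undirected graph: A-B and B-A are the same edge.
            my $pair = $from lt $to ? "$from|$to" : "$to|$from";
            next if $done{$pair}++;
            $out .= sprintf qq{    <edge source="%s" target="%s"/>\n},
                xml_escape($from), xml_escape($to);
        }
    }
    return $out . "  </graph>\n</graphml>\n";
}

# Made-up ASINs; 'UNOWNED01' is dropped because it is not a key of the hash.
print to_graphml({
    'ASIN00001' => { ARTIST => 'U2', ALBUM => 'Achtung Baby',
                     RELATED => [ 'ASIN00002', 'UNOWNED01' ] },
    'ASIN00002' => { ARTIST => 'Coldplay', ALBUM => 'Parachutes',
                     RELATED => [ 'ASIN00001' ] },
});
```

Dropping edges to albums outside the hash matches the screenshots, which only show albums from the library that relate to other albums in the library.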
The first picture below is the zoomed out version of all the albums in my iTunes DB that Amazon deems as being related to another item in my DB. Whilst there are a number of sparse entries, for the most part my music is related through a number of degrees of separation to the rest of the library.
The great thing about the Prefuse undirected graph visualization tool is that it lets you pass the mouse over a node, which it highlights in red, and then lights up all neighbouring nodes and edges in orange. This helps to visually discern some rather interesting information. If we zoom in a bit on the centre cluster and highlight what looks to be one of the more central nodes, we can see many neighbour nodes light up orange. It just so happens I’ve selected “Vertical Horizon”, which encapsulates one genre of music I appreciate, so this result is as we’d expect!
If we zoom in further again on that dense top right area of the grab we find it’s dominated by U2, REM, and Coldplay. This picture is zoomed in enough that you can visually study the relationships between the various albums as defined by Amazon.
I’m going to do a bit more work on this when I get the chance, adding specially labelled nodes to the graph which would act as recommendations, and I’ll place the updates on this blog. In the meantime, if you wish to play with the code yourself, there are a few things to note.
- The code above is pretty close to, but not exactly the same as, the actual code available for download below. I’ve got a few sloppy bits in the real code which are designed to ignore temporary connection errors and to repeat till the request gets through. I know there are a few rough edges; it’s a proof of concept and I haven’t spent time tidying it yet.
- You’ll need an Amazon Web Developer key (free from http://www.amazon.com/gp/aws/registration/registration-form.html).
- You might need to strip the very first header line of your iTunes dump file. Having gone File > Export Song List, open it in Notepad and remove the first line. I’ll fix this in the next release.
- Once the program finishes running you’ll be left with a results.xml file. Head over to Prefuse for some nice visualisation tools.
- The code is released under the Creative Commons Licence. Essentially I don’t mind what you do with it, but it’s not for commercial use — not that I can see someone doing that!
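The retry behaviour mentioned in the first note is not shown in this post. A bounded version could be sketched as follows: it wraps any fetch in a code ref and gives up after a fixed number of attempts instead of repeating forever. The name with_retries and its parameters are made up for this example, and the real script's approach may differ.

```perl
use strict;
use warnings;

# Call $fetch->() until it returns a defined value or attempts run out.
# $fetch is any code ref, e.g. sub { get($url) } with LWP::Simple, whose
# get() returns undef on failure.
sub with_retries {
    my ($fetch, $max_attempts, $delay) = @_;
    $max_attempts ||= 3;
    $delay = 1 unless defined $delay;
    for my $attempt (1 .. $max_attempts) {
        my $result = $fetch->();
        return $result if defined $result;
        # Pause between attempts, but not after the last one.
        sleep $delay if $delay && $attempt < $max_attempts;
    }
    return;   # give up: caller sees undef instead of looping forever
}

# Stand-in fetcher that fails twice and then succeeds.
my $calls = 0;
my $content = with_retries(sub { return ++$calls < 3 ? undef : '<ok/>' }, 5, 0);
print "got $content after $calls calls\n";
```

Because get_album and get_related already return early when get() yields undef, a wrapper like this could sit around the get() call without changing the rest of either sub.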
Download the Perl Source for the “Amazon Music Relationship Finder 1.1”
Note: There were a few Unicode issues in version 1.0 that I’ve now fixed.
Think this is a neat idea or have ideas for extensions? Any comments or questions? Please post a comment below!