Drycleaning your data

July 22nd, 2008 by Merrilee

This blog posting on BoingBoing caught my eye because it invoked both my beloved iTunes and my dreaded iTunes metadata. I am not a cataloger (not by any stretch!) but I am completely disturbed by the embarrassing disarray of metadata that my iTunes library represents. And it’s not just that I am (in my Virgo way) bothered by the lack of consistency and order. The data does not support some basic functions. Tracks that are labeled “track 1” prevent me from finding the song I want. Tags for compilation albums or classical music often lack data necessary for searching or sorting. (Is the artist the name of the album or the actual artist? The composer or the orchestra? Different people treat this differently.) If I wanted to pull together a funk compilation, I couldn’t do it based on the metadata I have, because the genres have been supplied by the wisdom of the crowds and have not been normalized. And the wisdom of Apple dictates that something can only have one genre. This is disappointing to me, because based on my collection, I could put together a funk mix that would knock your socks off.
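
To make the genre problem concrete, here is a minimal sketch in Python of what "normalizing" crowd-supplied genres could look like. Everything in it is an assumption for illustration: the alias table, the function names, and the placeholder track titles are hypothetical, and nothing here reflects how iTunes, FreeDB, or Gracenote actually store or clean their data.

```python
import re
from collections import defaultdict

# Hypothetical alias table: crowd-supplied spellings mapped to one canonical
# genre. These mappings are illustrative assumptions, not real service data.
GENRE_ALIASES = {
    "funk": "Funk",
    "p funk": "Funk",
    "funk soul": "Funk",
    "r b": "R&B",
    "rnb": "R&B",
    "rhythm and blues": "R&B",
}


def normalize_genre(raw):
    """Collapse case, punctuation, and spacing, then look up a canonical name.

    Genres that are not in the alias table are returned unchanged
    (apart from surrounding whitespace).
    """
    key = re.sub(r"[^a-z0-9]+", " ", raw.lower()).strip()
    return GENRE_ALIASES.get(key, raw.strip())


def group_by_genre(tracks):
    """Index (title, raw_genre) pairs by their normalized genre."""
    index = defaultdict(list)
    for title, raw_genre in tracks:
        index[normalize_genre(raw_genre)].append(title)
    return dict(index)


# Placeholder library: four tracks tagged by four different people.
tracks = [
    ("Track A", "funk"),
    ("Track B", "P-Funk"),
    ("Track C", "Funk/Soul"),
    ("Track D", "Jazz"),
]
funk_mix = group_by_genre(tracks)["Funk"]
```

With the tags collapsed this way, the three differently labeled funk tracks land in one bucket. The sketch still assigns each track a single genre, so it does nothing about the one-genre limit, and the hard part in practice is building and agreeing on the alias table itself.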

Enter TuneUp.

…a plug-in for iTunes that cleans up your library’s metadata and grabs the missing album cover art. It takes an “audio fingerprint” of each track and then gets the appropriate data from Gracenote’s Global Media Database. It’ll also let you know if you’re missing any tracks from a particular album…. The company claims they’re averaging a correct rate of 85 to 90 percent. A quick flip through my library makes me think it worked even better than that for the metadata and about that well for album art.

TuneUp Companion has several other features that I haven’t personally seen in action. It grabs contextual content from various places online. For example, if you’re listening to “Creep” by Radiohead, the “Now Playing” feature will check YouTube for live videos of the song and search for bio info and music news. The Concert feature looks for tour information and can be set to alert you if a band is coming to your town. Gabe told me they’re planning to open up the “Now Playing” API so anyone can create their own contextual content features.

Much of my iTunes metadata comes from FreeDB (an alternative to Gracenote). Although some of the other features are interesting, I’m most interested in cleaning up my existing data.

For me though, the Clean feature is the big selling point. TuneUp costs $12 a year or $20 for a lifetime of use.

Note that the service costs money. It is a small amount and, relative to the amount of time I could spend on this task, a bargain.

What would be interesting is a cleanup of the data sources themselves. If the data cleanup that I did were pushed back into FreeDB, then there would not be so much data in need of cleanup in the first place. Synchronizing the cleanup is the interesting and challenging part, and it is also where most of the rewards could be reaped.

I think about this in terms of other piles of individually created and pooled data, like the bibliographic records in WorldCat. We see a lot of inconsistencies (reflecting changes in practice, changes to the MARC standard, limitations of local systems where data was created, etc.). Could the data in WorldCat be strategically “dry cleaned” for maximum benefit and then returned to local systems?

One Response to “Drycleaning your data”

  1. Amelia Says:

    It’s funny, a friend of mine who’s an obsessive record nerd gives me a hard time about my messy iTunes metadata. Honestly, I blame iTunes and the iPod interface for their poor recall paths.