A couple of posts ago I showed a hand-made similarity diagram for The Magnetic Fields, where I started at the last.fm similarity page and followed the top two most similar artists until I'd made a diagram. At the time I wondered how hard it would be to generate these things automatically.
Not too hard, it turns out, but it took me several months to get it all working properly. I wrote a Python script that queries the last.fm database, loading and saving the similarity pages for the artists related to the original query. Because it saves the similarities locally, after a few runs there's not much traffic to the last.fm web site.
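For the curious, the caching idea can be sketched in a few lines of Python. This is only an illustration, not the actual script: the `fetch` callable, the cache directory and the file-naming scheme are all assumptions, and nothing here talks to last.fm.

```python
import tempfile
from pathlib import Path
from urllib.parse import quote


def cached_similarity_page(artist, fetch, cache_dir):
    """Return the similarity page for `artist`, fetching it only once.

    `fetch` is a hypothetical callable that takes an artist name and
    returns the page text; `cache_dir` is wherever the copies are kept.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Percent-encode the name so "Belle & Sebastian" is a safe filename.
    path = cache_dir / (quote(artist, safe="") + ".txt")
    if path.exists():
        return path.read_text(encoding="utf-8")
    text = fetch(artist)
    path.write_text(text, encoding="utf-8")
    return text


if __name__ == "__main__":
    calls = []

    def fake_fetch(artist):
        # Stand-in for a real network request.
        calls.append(artist)
        return "similar artists for " + artist

    with tempfile.TemporaryDirectory() as tmp:
        cached_similarity_page("The Magnetic Fields", fake_fetch, tmp)
        cached_similarity_page("The Magnetic Fields", fake_fetch, tmp)
    print(len(calls))  # the second lookup is served from disk
```

The second lookup never reaches `fetch`, which is the whole point: once an artist's page is on disk, repeat runs cost the web site nothing.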
Once all the data is collected, it generates a text file of links in the DOT language. These files are processed by the graphviz suite of programs (unflatten and dot, in this case) to produce similarity diagrams like the one below (click on the image to download a full-size PDF, 17 KB).
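To give a feel for what such a file of links looks like, here is a rough Python sketch that writes DOT text from a list of (artist, similar artist, similarity) tuples. The function names, the choice of a directed graph, and the particular pink-to-dark-red color ramp are my assumptions for illustration, not necessarily what the real script does, and the artist names and similarity values are placeholders.

```python
def similarity_to_color(similarity):
    """Map a 0-100 similarity to a hex color, pale pink up to dark red."""
    s = max(0, min(100, similarity)) / 100.0
    red = round(255 - 116 * s)    # 255 down to 139
    other = round(200 * (1 - s))  # 200 down to 0
    return "#{:02x}{:02x}{:02x}".format(red, other, other)


def quote_id(name):
    """Quote a name as a DOT string, escaping embedded double quotes."""
    return '"' + name.replace('"', '\\"') + '"'


def to_dot(links):
    """Build DOT text from (artist, similar_artist, similarity) tuples."""
    lines = ["digraph similarity {"]
    for source, target, similarity in links:
        lines.append('    {} -> {} [color="{}"];'.format(
            quote_id(source), quote_id(target),
            similarity_to_color(similarity)))
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder names and made-up similarity values.
    print(to_dot([("Artist A", "Artist B", 100),
                  ("Artist A", "Artist C", 60)]))
```

Text like this can be piped straight into unflatten and dot.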
The diagram was produced by:
./build_and_graph.py -a "The Olivia Tremor Control" -c 60 -r 1 \
    | unflatten | dot -Tps > /tmp/graph.ps
The darker and redder the lines, the more similar the two artists are. The options passed to my script set the initial similarity cutoff (-c) and the r-value (-r) of the logistic function that controls how the cutoff changes as you get farther from the initial artist. For these values, the cutoff starts at 60 for the artists directly connected to The Olivia Tremor Control, then rises to 80, 92, 97, 99 and finally 100. That's why the links all get darker as you move down the diagram, farther from the original artist at the top.
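The exact formula isn't given here, but a standard logistic curve with a ceiling of 100, starting at the initial cutoff, reproduces those numbers. The sketch below assumes that form; the function name and the `ceiling` parameter are mine.

```python
import math


def cutoff_at(depth, initial_cutoff, r, ceiling=100.0):
    """Logistic curve through `initial_cutoff` at depth 0, levelling off at `ceiling`.

    `depth` is the number of links away from the starting artist.
    """
    ratio = (ceiling - initial_cutoff) / initial_cutoff
    return ceiling / (1.0 + ratio * math.exp(-r * depth))


if __name__ == "__main__":
    print([round(cutoff_at(d, 60, 1)) for d in range(6)])
    # prints [60, 80, 92, 97, 99, 100]
```

A larger r pushes the cutoff toward 100 in fewer steps, which prunes the graph sooner; a smaller r lets weaker links survive farther from the starting artist.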
Today when I went out to take care of the neighbor's cats I noticed that I'd worn through the leather soles of my cowboy boots. I went upstairs and got my older pair and while I was slipping them on I remembered buying them in California as a pair of boots to go with my new motorcycle. That was in 1991, so those boots are more than 15 years old.
More and more I find myself coming across something that I've owned for more than a decade, or a memory occurs to me from longer ago than seems possible. I'm no longer in my 20s, and even though I know this, it still feels like I'm a young kid who has just graduated from college, just old enough to drink beer. Maybe not yet wise enough to know not to drink too many.
But I'm not that young kid anymore. I've been a lot of places and done a lot of things since then. A lot of it has been chronicled in my journals, or in the other pieces of paper and electronic records that are part of every person's life. I'm thinking it might be fun to make a journal book that would be a sort of lifetime accounting for the places I've been and the things I've done. My journal books are 192 pages, and if I put four months on a page, I can cram around 63 years into the book. With four months per page, that's around six lines per month, which sounds like just the right amount of space. I'll put the details of the book itself on my bookbinding pages.
June 1991: bought motorcycle and cowboy boots.
It's smart to be thoughtful when writing, carefully choosing your words, crafting perfect sentences, reading and re-reading to make sure you haven't written anything stupid. Did I use the same adjective twice? Am I too tentative with my words? Is it good, or worth anyone's time to read it? Especially when your words are out there on The Internets, carried by those imaginary trucks, err, tubes.
I've been thinking about this a lot, as I watch the date on my last post to this blog disappear into the distant past. May 21st? Why haven't I written anything since then? Is there really that little going on in my life?
I've been waiting for the perfect posting. I've been messing around with graphviz, which will draw directed and undirected graphs (like in my Magnetic Fields Similarity diagram post), and have even written a Python script to automatically generate them for a particular artist. It's awesome. But it's not perfect yet, so I haven't posted, waiting until I've got it down.
In the interim I've gone dip netting (9 red salmon, 1 king, plenty of food for me and heads for the dogs), broken one of my ribs, gotten snowed on in June during the Monahan Breeding Bird Survey, cut and chopped more than a cord of wood, and finished hanging drywall on the inside of the watershed (not in that order).
And I've downloaded and enjoyed a lot of great music. Right now I'm really enjoying Alligator by The National. I had downloaded Mr. November (track 13) back when I first subscribed to eMusic, but didn't get any other tracks. I finally got the rest today, and it's been on repeat (with the occasional Tokyo Police Club listen, which I also got today) since then. I've got it on the stereo now (AirTunes rocks...) and it sounds even better filling a room than bouncing around my head via headphones.
In my mind, it fits into a sonic class with Spoon, Interpol and Bloc Party, but I don't know how much of that would bear out in a similarity diagram or in a critical analysis by someone who actually knows something about music. Anyway, the band is tight, and I dig it.
Other things that I've really enjoyed since the last time I wrote are The Fiery Furnaces (can't wait for Matthew Friedberger's solo albums on Tuesday), Girl Talk, The Futureheads and Art Brut. Reading the blogs on the Pitchfork Festival really made me wish I'd taken a trip to Chicago to visit family and attend. Sounded like a blast.
So I started this post off talking about my reticence (I had to look that up to figure out the spelling) about just posting whatever, whenever. Well, here goes. Whatever.
But check out Alligator. It's good stuffs.
I've been listening to a lot of The Magnetic Fields recently and decided to see what sort of data I could collect from last.fm about similar artists. I started on the similarity page for The Magnetic Fields, and wrote down the first two artists. Then I went to each of those artists' similarity pages and did the same thing, producing a tree of artists that are similar to one another. The tree of highest similarity led from The Magnetic Fields to Belle & Sebastian to The Arcade Fire, and finally to Bloc Party, which led back to The Arcade Fire. Including the second most similar artists yielded a total of thirteen artists before all the links led to artists I already listed.
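The hand procedure amounts to a breadth-first walk that stops when every link leads somewhere already visited. Here is a small Python sketch of it, assuming the similarity lists are already available as a dictionary; the `similar` lookup and the letter names are placeholders, not real last.fm data.

```python
from collections import deque


def crawl_top_similar(start, similar, top_n=2):
    """Follow the top `top_n` similar artists from `start` until nothing new turns up.

    `similar` is a hypothetical lookup: artist name -> list of similar
    artists, most similar first. Returns (artists in discovery order,
    links as (artist, similar_artist, rank) tuples).
    """
    seen = [start]
    links = []
    queue = deque([start])
    while queue:
        artist = queue.popleft()
        for rank, other in enumerate(similar.get(artist, [])[:top_n], start=1):
            links.append((artist, other, rank))
            if other not in seen:
                seen.append(other)
                queue.append(other)
    return seen, links


if __name__ == "__main__":
    # Placeholder data; "X" is ranked third for "A" so it is never followed.
    similar = {
        "A": ["B", "C", "X"],
        "B": ["C", "D"],
        "C": ["A", "B"],
        "D": ["C", "B"],
    }
    artists, links = crawl_top_similar("A", similar)
    print(artists)     # ['A', 'B', 'C', 'D']
    print(len(links))  # 8
```

The rank stored with each link (1 or 2) is what separates the most-similar links from the number two links when drawing.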
Here's what the similarity diagram looks like (click on the diagram for a larger version).
Artists in black are those that were directly connected to The Magnetic Fields or related artists during my initial search. The dark blue arrows are links between the most similar artists and the lighter blue arrows are the number two links. The numbers near the artist boxes are the number of artists that point to them.
After I made the original diagram, I wanted to see where some other artists that I thought might fit would land in the diagram. I added Of Montreal, Architecture in Helsinki, The New Pornographers, and Spoon, again looking for the top two most similar artists for the new groups. I was certainly right that these artists are similar, although none of them were directly connected to The Magnetic Fields. Of Montreal is only two links away (Of Montreal and The Magnetic Fields both point to Neutral Milk Hotel), and the others I added are three links away.
I think it would be really interesting to figure out some way to generate these sorts of diagrams automatically (since the connections will change as the listening habits of last.fm submitters change), and to add more levels. I used MetaPost to build the diagram, so the diagrams could be built programmatically, but it would take some work to read and follow the appropriate links from the last.fm web site.
I had the pleasure of living in Portland for a year in the early 90s, when the alternative music scene was focused on the Northwest and there was good live music playing all the time in local bars and music halls. Most weekend nights were spent at Satyricon or La Luna listening to local and up-and-coming acts like Everclear, Heatmiser, Hazel, Pond and many others. Since moving to Alaska, it's been a lot harder to keep track of new music and I was mostly stuck with newer releases of the artists I was already familiar with.
iTunes has been a good way to discover new artists and download songs, but I think eMusic and last.fm are even better. eMusic focuses on independent labels and is a subscription service, which means individual tracks are inexpensive enough that I've been downloading entire albums rather than just buying the "best of" tracks from iTunes. Downloads are unprotected MP3 files, encoded at 192 kbps so they're of higher quality than iTunes downloads, and you don't have to burn and rip purchased files to do what you want with them (like convert them to a truly free audio codec like Ogg Vorbis, for example).
last.fm is a free service that keeps track of what music you're listening to (there are plug-ins for most music players, including iTunes), and recommends other artists that other people with similar tastes listen to. What's different from the "people who bought X also bought Y" sort of connections (although it does this too) is that it has a media player (open source!) that plays full-length tracks of the music it thinks you will like. This is much better than trying to decide if you like a new artist by listening to a 30-second sample on iTunes or eMusic. And the data is freely available through a well-specified web API. You can see what music I'm listening to right now by looking at the 'music' section on the right sidebar of this page.
So now I've discovered a whole range of new artists that I never would have heard without these services. No thanks to the Recording Industry Association of America (RIAA), their old-style business model, and strong-arm tactics, I'm finding and funding the artists I want to listen to.