What We Learned Using the Spotify Web API
10 minute read
Andy Shanks
There are 1,400 genres on Spotify!

And this is evolving all the time as new genres are regularly created to cover the huge mass of new music that is added. Spotify got creative naming these genres; some of our favourite finds were 'stomp and holler', 'shimmer pop' and 'vapour twitch'.

Find out what you listen to via the interactive we built for The Guardian and Spotify.

API Background

Spotify acquired The Echo Nest (a music intelligence platform) and, as of March 29th 2016, stopped issuing API keys for The Echo Nest. Many of its features were merged into the main Spotify API, which now offers three groups of features:

  • Audio Features – Access to audio information about tracks (energy, danceability, tempo, loudness, etc. as well as full detailed audio analysis)

  • Recommendations – A recommendation API (to generate a list of tracks given a starting point from a seed artist, track or genre), with filtering capabilities to restrict the set of tracks that are returned.

  • User's Top Artists, Tracks – Top-level user information such as top artists and tracks.

What we needed to do

Ultimately we needed to show a visual representation of a user's musical DNA, so we needed to:

  • Get an accurate picture of which genres of music people listen to most, so we could define their musical DNA

  • Identify all Spotify genres and sub-genres

  • Display a description of each of the 1,400 genres

  • Display a playlist featuring a good selection of tracks representing a specific genre

How we did it

While the original Echo Nest API had a lot of good genre information on artists, tracks and albums, we discovered early on that genre data in the new Spotify API was very sparsely populated. To extract enough genres to make our interactive work, we had to use some creative API processing.

We started by accessing the user's top tracks from the longest period of time that Spotify provides, giving us as many tracks as possible. We also accessed tracks from Discover Weekly but weighted them lower than top tracks, because Discover Weekly is only a suggestion (based on previous listening) while top tracks show what people have actually been listening to.
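As a rough illustration of that first step, the sketch below builds a request URL for a user's top tracks over the longest available window. It assumes the Web API's top-tracks endpoint with its `time_range` and `limit` parameters. The helper name `top_tracks_url` is hypothetical, and this is not the code used in the project.

```python
from urllib.parse import urlencode

# Assumed endpoint: the Web API route for the current user's top tracks.
TOP_TRACKS_ENDPOINT = "https://api.spotify.com/v1/me/top/tracks"


def top_tracks_url(time_range="long_term", limit=50):
    """Build the request URL (hypothetical helper, does not send anything)."""
    query = urlencode({"time_range": time_range, "limit": limit})
    return f"{TOP_TRACKS_ENDPOINT}?{query}"
```

Sending the request would also need a user access token, which is left out here.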

Very few of the tracks had genres, but we learned that artists are more likely to have genres attached to them (as do albums, sometimes). We gathered genres from all tracks, from all albums where these tracks appear, and from all artists that made those tracks. We combined all of these using a weighting scheme (top tracks take priority, and so on). Our results would have been better if Spotify had put more genres into their database. For example, even big bands like Muse and U2 are missing genres on their tracks.
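The merging step can be sketched roughly as follows. This is a minimal illustration, not our production weighting: the weight values, the `genre_profile` name and the shape of the track dictionaries are all assumptions made for the example. The only detail taken from the description above is that top tracks count for more than Discover Weekly tracks.

```python
from collections import Counter

# Hypothetical weights: the only known rule is that top tracks
# outrank Discover Weekly suggestions.
SOURCE_WEIGHTS = {"top_tracks": 1.0, "discover_weekly": 0.5}


def genre_profile(tracks):
    """Return each genre's share of the total weighted score.

    Each track is a dict (assumed shape) with a "source" key and three
    genre lists: "track_genres", "album_genres" and "artist_genres".
    """
    scores = Counter()
    for track in tracks:
        weight = SOURCE_WEIGHTS[track["source"]]
        # Union the three levels so a genre counts once per track.
        genres = (
            set(track["track_genres"])
            | set(track["album_genres"])
            | set(track["artist_genres"])
        )
        for genre in genres:
            scores[genre] += weight
    total = sum(scores.values())
    if total == 0:
        return {}  # no genres found on any track, album or artist
    return {genre: score / total for genre, score in scores.items()}
```

Normalising to shares keeps profiles comparable between users with different numbers of tracks.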

Another challenge was to get 1,400 genre descriptions. We were able to find about 900 genres from an old Echo Nest API endpoint that was still functioning as part of a demo app. We used Wikipedia and Urban Dictionary to find the rest. Some we simply had to write ourselves. We then categorised the genres into master genres and sub-genres so that we could allocate a colour to each master genre and visualise a user’s musical DNA.
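The master and sub-genre colouring can be pictured as two lookups, as in the sketch below. The genre assignments, colour values and the `colour_for` name are illustrative assumptions only and do not reproduce the mapping we actually built.

```python
# Illustrative data only: these assignments and colours are made up.
MASTER_COLOURS = {"pop": "#e91e63", "folk": "#8bc34a"}
SUB_TO_MASTER = {"shimmer pop": "pop", "stomp and holler": "folk"}
FALLBACK_COLOUR = "#9e9e9e"  # used when a genre has no known master genre


def colour_for(genre):
    """Return the colour of the master genre that a genre belongs to."""
    # A master genre maps to itself, a sub-genre maps to its parent.
    master = SUB_TO_MASTER.get(genre, genre)
    return MASTER_COLOURS.get(master, FALLBACK_COLOUR)
```

With this layout, adding a new sub-genre only means adding one entry to the sub-genre table.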

Once we had displayed the genres, we were able to find Spotify-created playlists (The Sounds of Spotify) for each genre, meaning a user could follow or discover new music within that genre.

Challenges

There's a lot of information available via the Spotify API, but for this project we came across some challenges:

  • There is rate limiting on the Spotify API but we were able to arrange an increase for our app with Spotify

  • We optimised all of our queries into grouped queries to lower the number of requests. As we had to get related artists and albums for each track, this was necessary to get good performance and to avoid triggering rate limiting

  • Often we would only get 10 genres back after requesting genres from 50 tracks (while also adding genres from all albums and artists linked to those tracks)

  • Often the API would return unexpected results, so we had to write tests for every possible scenario (requesting top tracks should return 50 tracks when specified, but often only 20-30 tracks were returned)

  • Many artists and tracks don’t have genres assigned

  • Many artists have multiple genres because they span music types, and the genres carry no weighting (e.g. if a classic hard rock band makes one experimental track in another genre, it appears that the two genres are interpreted as 50% each)

  • There are no genre descriptions available in the Spotify API

  • There is no data available on how the Spotify community as a whole consumes music
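The grouped queries mentioned in the list above come down to de-duplicating IDs and splitting them into batches, so that one request covers many artists or albums. The sketch below shows only that batching step. The `chunked` name is hypothetical, and the batch size is passed in as a parameter because the per-request limit differs between endpoints and should be checked against the API reference.

```python
def chunked(ids, size):
    """Split IDs into de-duplicated batches of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    # dict.fromkeys drops repeats while keeping first-seen order.
    unique = list(dict.fromkeys(ids))
    return [unique[i:i + size] for i in range(0, len(unique), size)]
```

Each batch can then be joined with commas and sent as a single "get several" style request, instead of one request per ID.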

A key thing to note here is that all these services require a starting point, such as a user, a track or a genre. The API uses this seed to return more information or recommendations.