The Echo Nest

The Echo Nest (or Echonest) is not a music site itself but a collector of data about music: artists, tracks (including the audio properties found by scanning the music with Echoprint), music blogs, Twitter, Facebook and the catalogues of streaming applications (project Rosetta). This data is then used to power music applications like Spotify or media players like Tomahawk.

It is comparable to Last.fm, but only in its data aspect: rather than serving listeners directly, it is set up to support other services through its API by creating (automated) playlists and providing information about the music. A key difference is that the data on Last.fm is generated by its users (play counts, tags, wikis) and the data on Pandora is analysed by people, whereas on Echonest the data is collected and analysed by machines. The advantage is that Echonest can scan according to standardised settings, without duplicate tags or mis-tagging (as on Last.fm), and in vast, detailed amounts (unlike Pandora, which has only analysed 9000 tracks so far). The downside is that its machines can't apply contextual logic to their scanning. For example, given a keyword like "Prince", the machine doesn't know whether it refers to the artist or to a member of royalty.

Another big difference from other music sites is that there is no complete web interface (aside from the demos), mostly because none of their users need one. Their data can only be extracted in XML or JSON format using the API. However, requests can be tuned to a large degree so that the returned data favours quality over quantity.


 * Known songs: 34 million
 * Known Artists: 2,374,232
 * Data points: over 1 trillion
 * Music applications using the platform: 423
 * Customers: MTV, BBC, MOG, Spotify

See also: Gracenote

Music Analysis (Echoprint)
Echonest uses Echoprint to analyse music for (among other things) tempo, key, loudness, how talkative a track is, danceability and genre. It can be used non-commercially to identify music in external libraries.

Echonest Demos
This is a short list of Echonest demos that are directly available. The source code of most of them can be adjusted to show data on other sites (overview of source code on Github). The complete list of demos can be found here.


 * Artist News This is a basic demonstration web app that shows how you can use the artist news API to get recent news articles for an artist. Here you can see the downside of automation with ambiguous names like Prince.
 * Artist Reviews This is a demonstration web app that shows how you can use the artist reviews API to get recent (external) reviews for an artist.
 * Artist Blogs This is a demonstration web app that shows how you can use the artist blogs API to get recent blog posts for an artist. Notice that these are not just review blogs but also music share blogs.

Project Rosetta
Echonest is working with other music-related companies and projects like Spotify, Deezer, Rdio and MusicBrainz to place data into project Rosetta. This makes it easy to connect a MusicBrainz ID to a Spotify ID, enabling external apps to get data from more places (mashups).
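
As a rough sketch of what such a lookup could look like, the snippet below builds a request that identifies an artist by a MusicBrainz ID and asks for the matching Spotify ID. It only constructs the URL and does not contact the service. The artist/profile method, the id parameter and the "catalogue:type:id" form of a foreign ID are assumptions not stated in this article, and the key and MusicBrainz ID are placeholders.

```python
from urllib.parse import urlencode

BASE_URL = "http://developer.echonest.com/api/v4/"


def rosetta_id(catalog, kind, native_id):
    """Join the parts of a foreign ID (assumed form: catalogue:type:id)."""
    return ":".join((catalog, kind, native_id))


def profile_url(api_key, foreign_id, target_catalog):
    """Build a URL asking for an artist's ID in another catalogue.

    The method name "artist/profile" and the "id" parameter are assumptions.
    """
    params = [
        ("api_key", api_key),
        ("id", foreign_id),
        ("format", "json"),
        ("bucket", "id:" + target_catalog),
    ]
    # safe=":" keeps the colons readable instead of escaping them as %3A
    return BASE_URL + "artist/profile?" + urlencode(params, safe=":")


# Placeholder values: not a real key and not a real artist.
mbid = "00000000-0000-0000-0000-000000000000"
url = profile_url(
    "YOUR_API_KEY",
    rosetta_id("musicbrainz", "artist", mbid),
    "spotify-WW",
)
print(url)
```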

API
Most websites with a large amount of data provide it to third parties through an API (Application Programming Interface). This way they don't expose their internal database to external use and can serve the data from a cache that is only periodically updated. Many programming languages that power websites or apps (like PHP, C++ and JavaScript) can retrieve this data dynamically, mostly in the form of XML or JSON.

Usually this data is only accessible by getting an API key from the site (in this case Echonest), so they know who is asking for the data. Some functionality (like taste profiles) is connected to a specific key and cannot be retrieved by others. Echonest has also provided a test key for trying out the API, as in the examples used in this article (however, these can be disabled by Echonest at any time).

Echonest has extensive documentation available on how to use their API and an option to get a free API-key for non-commercial use: http://developer.echonest.com/

See also: Last.fm API, Reading XML with PHP

Keywords
Using the Echonest API is great for retrieving information, but some keywords need explanation before you can get to the correct data.


 * Buckets (internal and external) - In order to get extra data about items (like artists) most subsections like biographies, blogs, images or external ids have their own "bucket". These are not required and you can use multiple ones most of the time, except when limiting the output to specific libraries (like only show artists available on Deezer or Spotify). Be careful with this since this can result in huge amounts of data.
 * Familiarity (vs Hotttnesss) - A numerical estimate of how familiar an artist currently is to the world. 'Familiarity' corresponds to how well known an artist is. You can look at familiarity as the likelihood that any person selected at random will have heard of the artist. The Beatles have a familiarity close to 1, while a band like 'Hot Rod Shopping Cart' has a familiarity close to zero. 'Hotttnesss' corresponds to how much buzz the artist is getting right now. This is derived from many sources, including mentions on the web, mentions in music blogs, music reviews, play counts, etc.
 * Format (XML vs JSON) - These are the two formats in which information can be retrieved. XML is the older format (fairly readable by humans); JSON is the newer and faster one (but it needs to be formatted first to be readable by humans).
 * (Taste) Profile - Much like a user profile on Last.fm, used for recommendations and keeping track of music tastes (including number of skips). These are measured in Diversity, Mainstreamness, Freshness and Adventurousness.
 * Results - The number of items to be included in the response.
 * Sort - When multiple items are requested these can be sorted in different orders like: familiarity, hotttnesss, similarity or top_terms.
 * Style, Genre and Mood - Style and Mood were first used by the Echonest; Genre has been introduced recently. Although Mood is pretty easy to understand (happy, sad, complex, simple), Style was a catch-all for anything else: not only genres but also attributes like 00s, 50s and "german music". Using Genre for genres and Style for attributes other than moods has greatly improved the results for artist queries. Note, however, that genre cannot be used for song queries; these still use style.
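
To illustrate the Format and Familiarity/Hotttnesss entries above, here is a minimal sketch that re-indents a compact JSON response so a person can read it and then pulls out the two scores. The payload is made up for illustration: its layout, the artist name and the numbers are assumptions, not real output from the service.

```python
import json

# Illustrative payload only: the structure and the values are assumptions,
# not a real response from the API.
raw = (
    '{"response":{"status":{"code":0,"message":"Success"},'
    '"artists":[{"name":"Example Artist",'
    '"familiarity":0.9,"hotttnesss":0.5}]}}'
)

data = json.loads(raw)

# Compact JSON is hard to read; indenting it makes the structure visible.
pretty = json.dumps(data, indent=2, sort_keys=True)
print(pretty)

# Both scores are plain numbers, so they can be compared directly.
for artist in data["response"]["artists"]:
    print(artist["name"], artist["familiarity"], artist["hotttnesss"])
```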

Example of how a query is built
This example will search for a list of artists using mood, genre and description: http://developer.echonest.com/api/v4/artist/search?api_key=FILDTEOIK2HBORODV&format=xml&description=heavy&mood=sad&genre=jazz&results=75

http://developer.echonest.com/api/v4/ The first section is always the same for all queries, as long as Echonest is using version 4 of their API.

artist/search? This indicates that the query is looking for artist(s). The question mark at the end shows that everything after it consists of variables used for the query. Their order isn't important, as long as the required ones are included.

api_key=FILDTEOIK2HBORODV The API-key for the request.

&format=xml The desired format the results should be returned as.

&description=heavy A description of the artist.

&mood=sad A mood like happy or sad.

&genre=jazz A musical genre such as rock, jazz, or dance pop.

&results=75 The number of results desired. If this isn't specified, it defaults to 15 items. The maximum number of results is 100.
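
The breakdown above can be reproduced with a few lines of Python. This is only a sketch using the standard library: it takes the example URL apart without contacting the service. The fallback of 15 for results follows the default described above.

```python
from urllib.parse import urlsplit, parse_qsl

URL = (
    "http://developer.echonest.com/api/v4/artist/search"
    "?api_key=FILDTEOIK2HBORODV&format=xml&description=heavy"
    "&mood=sad&genre=jazz&results=75"
)

parts = urlsplit(URL)

# Everything after the fixed "/api/v4/" prefix names the method.
method = parts.path.split("/api/v4/", 1)[1]

# Everything after the question mark is a list of name=value pairs.
params = dict(parse_qsl(parts.query))

# Fall back to the documented default of 15 when results is missing.
results = int(params.get("results", 15))

print(method)
for name, value in params.items():
    print(name, "=", value)
```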

Description/Genre/Style/Mood

 * A complete list of Styles that you can use in searches

Buckets

 * bucket=id:deezer&limit=true
 * bucket=id:fma&limit=true
 * bucket=id:musicbrainz&limit=true
 * bucket=id:rhapsody-US&limit=true
 * bucket=id:rdio-WW&limit=true
 * bucket=id:spotify-WW&limit=true
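
A small helper can assemble these fragments. The sketch below assumes that several buckets are sent by repeating the bucket parameter, and that limit=true is only added together with a catalogue bucket; bucket_query is a hypothetical name, not part of any Echonest library.

```python
from urllib.parse import urlencode


def bucket_query(buckets=(), catalog=None):
    """Return the bucket part of a query string.

    buckets: plain bucket names such as "biographies" or "images".
    catalog: optional catalogue (e.g. "deezer"); when given, results
             are limited to artists available in that catalogue.
    """
    pairs = [("bucket", name) for name in buckets]
    if catalog is not None:
        pairs.append(("bucket", "id:" + catalog))
        pairs.append(("limit", "true"))
    # safe=":" keeps "id:deezer" readable instead of "id%3Adeezer"
    return urlencode(pairs, safe=":")


print(bucket_query(["biographies", "images"]))
print(bucket_query(catalog="deezer"))
```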

Interesting examples

 * Bottom 50 known metal artists
 * Least known (sort=familiarity-asc) metal (style=metal) on Spotify (bucket=id:spotify-WW)
 * Metal bands formed after 2010 on Rdio
 * Get blogposts about artists similar to Radiohead
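
As a sketch of how the second example in this list could be put together, the function below combines the parameters named there (sort=familiarity-asc, style=metal, bucket=id:spotify-WW) into one artist/search URL. The function name, the placeholder key, the choice of JSON and the addition of limit=true are assumptions made for illustration; the URL is only built, not requested.

```python
from urllib.parse import urlencode

BASE_URL = "http://developer.echonest.com/api/v4/"


def least_known_url(api_key, style, catalog, results=50):
    """Build an artist/search URL sorted from least to most familiar."""
    # The article gives 100 as the maximum number of results.
    if not 1 <= results <= 100:
        raise ValueError("results must be between 1 and 100")
    params = [
        ("api_key", api_key),
        ("format", "json"),
        ("style", style),
        ("sort", "familiarity-asc"),
        ("bucket", "id:" + catalog),
        ("limit", "true"),  # only artists available in the catalogue
        ("results", results),
    ]
    return BASE_URL + "artist/search?" + urlencode(params, safe=":")


url = least_known_url("YOUR_API_KEY", "metal", "spotify-WW")
print(url)
```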