Researchers looking to build technologies that will find their way into the next generation of music apps have received a huge leg up today with the launch of a vast database of information about one million songs by 44,745 artists.
The Million Song Dataset is a collaboration between Columbia University in the USA and music data service The Echo Nest, which we previously covered here. Designed for non-commercial use only, the dataset is an enormous 300GB download containing detailed information about each of the one million songs.
What kind of information? Everything from basic artist and song data right down to time signatures, keys, pitches, tempos, year of release and a lot more. The idea is that researchers and non-commercial developers can use the information to build and test new services and apps based around music data without having to “reinvent the wheel” by creating a huge test database of their own each time.
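To give a rough feel for what working with this kind of per-song data might look like, here is a minimal Python sketch. The `SongRecord` class, its field names, the key encoding and the sample records are all assumptions made up for illustration; they are not the dataset's actual schema or contents.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class SongRecord:
    """Hypothetical per-song record; field names are illustrative only."""
    artist: str
    title: str
    year: int             # year of release
    tempo: float          # beats per minute
    key: int              # pitch class 0-11 (0 = C); an assumed encoding
    time_signature: int   # beats per bar


def mean_tempo_by_year(songs):
    """Group records by release year and average their tempos."""
    by_year = {}
    for song in songs:
        by_year.setdefault(song.year, []).append(song.tempo)
    return {year: mean(tempos) for year, tempos in by_year.items()}


# Made-up placeholder records, not real entries from the dataset.
songs = [
    SongRecord("Artist A", "Song 1", 1985, 120.0, 0, 4),
    SongRecord("Artist B", "Song 2", 1985, 100.0, 7, 4),
    SongRecord("Artist C", "Song 3", 2003, 90.0, 2, 3),
]

result = mean_tempo_by_year(songs)
print(result)
```

A researcher would swap the placeholder list for records loaded from the real download; the grouping logic would stay the same.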
What can be done with it?
The team behind the dataset imagines it being used for song recognition, analysing the ‘mood’ of music, working out what year a song came from based on the music alone, artist recognition and cover song recognition. In fact, there’s so much data here that developers are likely to come up with all sorts of other interesting uses. No actual audio is included in the download, although the dataset links to 30-second clips of songs for testing purposes.
While end users won’t see any immediate benefit from the launch of this database, it should speed up the development of new technologies that could very well find their way into the next generation of music apps.
The Million Song Dataset is available to download and dig into now, here.