Massive database launched to help build the next generation of music apps

Researchers looking to build technologies that will find their way into the next generation of music apps have received a huge leg up today with the launch of a vast database of information about one million songs by 44,745 artists.

The Million Song Dataset is a collaboration between Columbia University in the USA and The Echo Nest, a music data service that we have covered previously. Designed for non-commercial use only, the Dataset is an enormous 300GB download containing detailed information about each of the one million songs.

What kind of information? Everything from basic artist and song data right down to time signatures, keys, pitches, tempos, year of release and a lot more. The idea is that researchers and non-commercial developers can use the information to build and test new services and apps based around music data without having to “reinvent the wheel” by creating a huge database to test them on each time.

What can be done with it?

The team behind the dataset imagines it being used for song recognition, analysing the ‘mood’ of music, recognising what year a song came from based on the music alone, artist recognition and cover song recognition. In fact, there’s so much data in here that developers’ imaginations should spur all sorts of interesting uses. No actual audio is included in the download, although it links to 30-second clips of songs for testing purposes.
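To make the "what year is this song from?" idea concrete, here is a minimal Python sketch. It is an illustration only, not the dataset's actual file format or API: the `Track` record, its field names and the `guess_year` helper are all hypothetical, and the distance weighting is an arbitrary assumption. It guesses a release year by averaging the years of the few known tracks closest in tempo and key.

```python
from dataclasses import dataclass


@dataclass
class Track:
    """Hypothetical record holding a few of the fields the article mentions."""
    title: str
    artist: str
    tempo: float         # beats per minute
    key: int             # pitch class, 0-11 (assumed encoding)
    time_signature: int  # beats per bar
    year: int            # year of release


def guess_year(query_tempo, query_key, tracks, k=3):
    """Average the release years of the k tracks nearest in tempo and key."""
    def distance(track):
        # Keys wrap around the octave, so 11 and 0 are one step apart.
        raw_gap = abs(track.key - query_key)
        key_gap = min(raw_gap, 12 - raw_gap)
        # Arbitrary weighting (an assumption): 10 BPM counts as one key step.
        return abs(track.tempo - query_tempo) / 10.0 + key_gap

    nearest = sorted(tracks, key=distance)[:k]
    return round(sum(t.year for t in nearest) / len(nearest))
```

A real system would draw on far more than tempo and key, which is where the dataset's pitch and other detailed fields would come in, but the overall shape stays the same: compare a new song's features against many songs whose year is already known.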

While end users won’t see any direct benefit from the launch of this database immediately, it should speed up the development of new technologies that could very well find their way into the next generation of music apps.

Meanwhile, The Echo Nest’s data is making its way into real products you can use today. We previously covered the MTV Music Meter, while a BBC music recommendation engine recently went live too.

The Million Song Dataset is available to download and dig into now.

Story by Martin SFP Bryant

Martin SFP Bryant is the founder of UK startup newsletter PreSeed Now and technology and media consultancy Big Revolution. He was previously Editor-in-Chief at TNW.
