Scientists have developed an AI that can generate music from silent piano performances, just by watching the movements of the player’s hands.
The system, called Audeo, analyzes top-down videos of someone tickling the ivories to predict which keys are being pressed in each frame. It then produces a transcript of the music, which a synthesizer translates into sound.
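The step from per-frame key predictions to a transcript can be illustrated with a minimal sketch. This is not Audeo's actual code: the function name `frames_to_notes`, the use of MIDI pitch numbers, and the sample frames are all assumptions made for illustration. It merges consecutive frames in which a key is held down into a single note with a start and end time, which is the kind of transcript a synthesizer could play back.

```python
from typing import Dict, List, Set, Tuple

# (MIDI pitch, start in seconds, end in seconds); a hypothetical format.
Note = Tuple[int, float, float]


def frames_to_notes(frames: List[Set[int]], fps: float) -> List[Note]:
    """Merge consecutive frames in which a key is held into one note event."""
    active: Dict[int, int] = {}  # pitch -> frame index where the key went down
    notes: List[Note] = []
    # A trailing empty frame closes any keys still held at the end of the clip.
    for i, pressed in enumerate(frames + [set()]):
        for pitch in sorted(set(active) - pressed):  # keys released this frame
            notes.append((pitch, active.pop(pitch) / fps, i / fps))
        for pitch in pressed - set(active):  # keys newly pressed this frame
            active[pitch] = i
    return sorted(notes, key=lambda n: (n[1], n[0]))


# Made-up predictions for five video frames at 25 frames per second.
frames = [{60}, {60, 64}, {64}, set(), {67}]
notes = frames_to_notes(frames, fps=25.0)
for pitch, start, end in notes:
    print(f"pitch {pitch}: {start:.2f}s to {end:.2f}s")
```

In this toy input, the key with pitch 60 is held for two frames and becomes one note rather than two, which is why the per-frame predictions need to be merged before synthesis.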
The researchers trained and tested the AI on footage of pianist Paul Barton playing tunes by famous composers.
They then evaluated the accuracy of Audeo’s output by playing it to music-recognition apps such as Shazam and SoundHound.
The apps identified the tune 86% of the time, just 7% less often than they recognized the source videos.
Senior study author Eli Shlizerman, an assistant professor at the University of Washington, said he was surprised by the quality of the AI’s output:
“To create music that sounds like it could be played in a musical performance was previously believed to be impossible. An algorithm needs to figure out the cues, or ‘features’ in the video frames that are related to generating music, and it needs to ‘imagine’ the sound that’s happening in between the video frames. It requires a system that is both precise and imaginative.”
The researchers have also explored using Audeo to change the style of music. Shlizerman said the system could show how music produced by a piano sounds when played through a trumpet.
He hopes the research will enable new ways for people to interact with music:
“For example, one future application is that Audeo can be extended to a virtual piano with a camera recording just a person’s hands. Also, by placing a camera on top of a real piano, Audeo could potentially assist in new ways of teaching students how to play.”
You can read the full study paper here.
Published February 5, 2021 — 17:37 UTC