Inside Audioshake, using AI to recreate lost stems for hit songs

AI can help place classic songs into new works.

Artificial intelligence is about to change music production. Recently, I wrote about the history of AI for music generation. Not only can AI potentially create new songs from scratch—it can also help decompose existing sound recordings for new creative purposes.

One company working on this is Audioshake, which recently launched an AI-powered product that splits a single song recording into stems: separate tracks for vocals and different instruments. Jessica Powell, a former Google executive, cofounded Audioshake in early 2020 with her husband, data scientist Luke Miner. More recently, Fabian-Robert Stöter joined Audioshake as Head of Research.

(I am looking to write about more cool music-tech companies like Audioshake. If you work in music-tech and would like to talk, email me at j@jamesmishra.com.)

(Cover photo credit: AudioShake)


The technology

Audioshake’s technology breaks a recording into separate stems representing the vocals and instruments making up the recording. Below is a video demonstrating how their platform can break down a track into separate stems.

Signal processing researchers have been working on audio source separation for decades. In recent years, neural networks and faster hardware have driven tremendous progress on the problem, just as they have in fields like computer vision and natural language processing. WIRED recently wrote about how recent breakthroughs in audio source separation are reshaping music.
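To make the idea of source separation concrete, here is a toy sketch in Python. It is not Audioshake's method, and it is far simpler than any neural network approach: it mixes two synthetic tones and pulls them apart again with a hand-written frequency mask. The function names, the 64-sample signal, and the cutoff value are all assumptions chosen for illustration. Real recordings are much harder because instruments and vocals overlap in frequency, which is where learned models come in.

```python
import cmath
import math


def dft(signal):
    """Naive discrete Fourier transform (fine for a short toy signal)."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse DFT, returning only the real part of each sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def separate(mixture, cutoff_bin):
    """Split a mixture into a low-frequency stem and a high-frequency stem.

    A bin goes to the "low" stem when its frequency index (or its mirror
    image, since the input is real-valued) is at or below cutoff_bin.
    Everything else goes to the "high" stem.
    """
    spectrum = dft(mixture)
    n = len(spectrum)
    low_mask = [1.0 if min(k, n - k) <= cutoff_bin else 0.0 for k in range(n)]
    low = idft([s * m for s, m in zip(spectrum, low_mask)])
    high = idft([s * (1.0 - m) for s, m in zip(spectrum, low_mask)])
    return low, high


# Two hypothetical "instruments": a low tone and a quieter high tone.
N = 64
bass = [math.sin(2 * math.pi * 3 * t / N) for t in range(N)]
lead = [0.5 * math.sin(2 * math.pi * 12 * t / N) for t in range(N)]
mixture = [b + l for b, l in zip(bass, lead)]

# Assumed cutoff: any bin between the two tones (3 and 12) would work here.
bass_est, lead_est = separate(mixture, cutoff_bin=6)

bass_error = max(abs(a - b) for a, b in zip(bass, bass_est))
lead_error = max(abs(a - b) for a, b in zip(lead, lead_est))
print(f"largest bass error: {bass_error:.2e}")
print(f"largest lead error: {lead_error:.2e}")
```

The separation is nearly perfect here only because the two tones never share a frequency bin. The hard part of separating real music is deciding, moment by moment, how much of each frequency belongs to each source, and that decision is what modern systems learn from data rather than hard-code.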

There are a handful of other projects and companies in this space:

  • The French streaming site Deezer released Spleeter, an open-source package of software and pre-trained machine learning models for separating music into stems. Anybody can use Spleeter for free, and it has also been integrated into industry-standard audio restoration software like iZotope RX 8 and SpectraLayers.

  • Audionamix is a French software company well-known for their audio source separation product.

  • Fabian-Robert Stöter, Audioshake’s Head of Research, has previously coauthored numerous audio source separation papers and projects, including Open-Unmix—a set of open-source code and pre-trained machine learning models for source separation of pop music.

  • Facebook Research released Demucs, a package of source code and pre-trained models for source separation.

However, Audioshake’s product performs exceptionally well compared to most other products. It is also the first product able to separate out an individual guitar as its own stem, a difficult challenge given how similar guitars can sound to other instruments or even vocals.

The market opportunity

Major record labels and music publishers own rights to decades of songs recorded before the spread of digital multitrack recording in the 1990s. These songs are cash cows. Firms like Round Hill and Hipgnosis are spending billions of dollars to acquire catalogs of popular classic songs.

For many of these songs, the stems used to make the recording have been lost to the sands of time. If a song was recorded by a band playing into a single microphone, then the song never had separate stems to begin with.

Many of these songs could earn a small fortune in sync licensing deals if placed into, say, the next blockbuster film—but not if licensees need the stems and the rightsholders don’t have them. Powell estimates that around 30% of inbound sync licensing requests go unfulfilled because the rightsholder can’t deliver the stems for the song that the licensee requested.

Before its public launch, Audioshake had already been creating stems for at least fifteen music businesses, including Warner Music Group, Crush Music, Hipgnosis Songs Fund, Downtown Music Services, CD Baby, and peermusic. “In my mind, that's the real testament to [Audioshake]’s quality,” Powell explains. “These people are paying for it and landing movies and commercials with it.”

If you enjoy listening to podcasts about music, you may have already heard some of Audioshake’s work. Earlier this year, Audioshake produced the instrumental used in the Song Exploder episode about Yusuf / Cat Stevens. They also produced a vocal stem and instrumental for JP Saxe’s “If the World Is Ending” for the Switched On Pop podcast.

Jessica Powell’s backstory

Early in her career, Powell worked in a communications role at the International Confederation of Societies of Authors and Composers—better known by the French acronym CISAC.

After CISAC, Powell climbed the corporate ladder at Google, spent a year as Chief Marketing Officer at Badoo, and then returned to Google to eventually become Vice President of Communications—reporting to Google CEO Sundar Pichai.

After leaving Google in 2018, Powell published The Big Disruption, a Silicon Valley satire novel that she had written in 2012.

Focusing on artists and rightsholders

Powell’s experiences at CISAC and Google gave her a unique insight. Most consumer music tech businesses focus on putting music listeners first—and putting artists and rightsholders last. Social networks and streaming sites tend to focus on copyright enforcement and creator monetization only after reaching product-market fit—and only to comply with the law.

Audioshake is taking the opposite strategy—by building for rightsholders first. “We want to start in a place that is respectful of the creative act,” Powell explains. “That’s why we build for rightsholders.”

But record labels and music publishers aren’t the only ones that can use Audioshake’s technology. Any recording artist working with tricky samples can benefit from source separation. “We’re still thinking about the sixteen-year-old artist in their bedroom,” Powell adds. “Eventually we’d want to serve everybody.”

