Stem Separation From Video Is Here - We Try It

November 29, 2022 Luke Goddard

A number of audio separation tools have emerged recently, and one of the latest lays claim to the next hurdle to be cleared: audio stem separation from video. With plenty of potential uses, we try it. See and hear it in action for yourself.

Un-mixing - The Story So Far

Not so long ago, ‘un-mixing’ audio was the stuff of science fiction. Take a simple example of two channels (A + B) mixed down to a single result (C). As in any equation, any single element can be computed when the other two are known: C - B = A, or C - A = B. Everything is fine until someone asks for A and B to be separated when only C is available! In recent years, developers have managed to achieve the seemingly impossible: reverse-engineering summed audio into its constituent parts. Tools that emerged in the 2010s as rudimentary re-balancers (on the outside at least) have evolved into sophisticated, algorithm-based track and stem generating marvels.
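The arithmetic above can be sketched in a few lines of Python. This is a toy illustration only: the sample values and the helper names `mix` and `subtract` are made up for the example, and it says nothing about how any real separation tool works internally.

```python
# Toy "tracks": short lists of sample values (hypothetical numbers).
a = [0.5, -0.25, 0.125, 0.0]   # track A
b = [0.25, 0.5, -0.5, 0.75]    # track B

def mix(x, y):
    """Sum two tracks sample by sample, as a mixdown does."""
    return [p + q for p, q in zip(x, y)]

def subtract(x, y):
    """Remove track y from track x sample by sample."""
    return [p - q for p, q in zip(x, y)]

c = mix(a, b)                  # the mixdown C = A + B

# With C and one source known, the other falls out directly.
recovered_a = subtract(c, b)   # C - B = A
recovered_b = subtract(c, a)   # C - A = B

# With only C known, the problem is under-determined:
# a completely different pair of tracks sums to the same C.
alt_a = [0.75, 0.0, 0.0, 0.0]
alt_b = [0.0, 0.25, -0.375, 0.75]
same_mix = mix(alt_a, alt_b) == c
```

The last few lines show why separation from C alone cannot be plain subtraction: many different pairs of tracks produce an identical mix, so a tool has to bring in outside knowledge of what vocals, drums and so on tend to sound like.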

Applying Un-Mixing

Assuming copyright clearance, the uses for these tools are immediately apparent. Musicians and creators can finally get their hands on that guitar or drum part they’ve been trying to nail. Further into the professional realm, music editors will always be glad of having access to instant instrumental versions, whereas the mastering engineer will be afforded options previously unavailable.

LALAL.AI

One of the latest solutions offering deconstructive magic is LALAL.AI. It provides the kind of audio separation that engineers have been familiarising themselves with recently, with its own spin. Perhaps most noticeably, LALAL.AI has the user upload assets to the company’s servers for processing. The newly extracted assets are then downloaded by the user via links emailed to them. Not only that, but the service will also ingest video files for audio stem separation. Deliverables are returned in the same format as the upload, with the service supporting AVI, MP4, MOV, and MKV video formats.

The latest incarnation of LALAL.AI utilises Phoenix AI, a new neural network that isolates stems faster than before and is claimed to provide better vocal separation quality than all other AI-based stem splitters on the market. So confident are they in their tech that LALAL.AI have published data showing how well theirs fares against a different service. You can read more about Phoenix in great detail here, covering both its conception and some of its MO.

Watch in the video as we use LALAL.AI to extract audio stems and tracks from a video file. We upload into its simple web-based UI before following the generated links to download the extracted audio. We then bring these in alongside the original video to see how their quality compares with its original audio.

More from LALAL.AI

A next-generation vocal remover and music source separation service for fast, easy and precise stem extraction. Remove vocal, instrumental, drums, bass, piano, electric guitar, acoustic guitar, and synthesizer tracks without quality loss.
Standard Volume offers a free way to try the service. Upgrade to process more files and get results faster. Available only for individual use.
High Volume allows the processing of thousands of minutes’ worth of audio and video. Suitable both for individual and business use.

For a limited time, LALAL.AI is running a deal, available between November 21 and December 4 inclusive.

LALAL.AI is offering a 50% discount on the Plus pack and an extra 50 minutes free until December 4, 2022. This equates to $25 for 350 minutes, which you can spend splitting audio and video files into stems of vocals, instrumental, drums, bass, acoustic guitar, electric guitar, piano, and synth.

Photo by Adi Goldstein on Unsplash
