Production Expert


Here's Some Audio Software That Will F*ck With Your Head

In this article, we look at audio software which confronts our expectations about what audio software can do and how it does what it does. Here are six which we think illustrate what we mean.

We spend a lot of time looking for ‘better’ when it comes to audio software. Some people even look for ‘best’, though I’m sure there’s no such thing. Our recent article on plugins that have been around for a decade brought into sharp focus something I’d been aware of for years. The best of the new stuff is incredibly impressive, but there is a category of products that sits away from any squabbles over which 1176 plugin is most like a real 1176. That category consists of plugins that just do things that confront our expectations of what audio software can do. Here are our six.

Melodyne

We’re going to start with Melodyne but possibly not for the reasons you think. Melodyne has been the tuning plugin by which others are judged for at least a decade. It wasn’t the first tuning software available but it has been the go-to for tasteful tuning correction for many years.

OK, you can correct tuning with it. We got used to that years ago. Polyphonic pitch detection and correction was something rather more difficult to accept, as it crossed a line in most people’s understanding of what was and wasn’t possible. Once sounds were mixed together, they couldn’t be unmixed - could they?

However, it is Melodyne’s tempo mapping feature that we’re going to call out here. We’re so used to the idea that automated processes can handle the easy stuff, but throw tricky or ambiguous material at them and you’ll quickly end up doing it manually. We’ve had tools that can tempo map a dance tune with a relentless four-to-the-floor kick pattern for years, but Melodyne can tempo map a legato piano part. Don’t you have to be able to understand the music to do that? It seems that Melodyne understands the music.

Spleeter

On the subject of de-mixing music, something which caught our attention in 2019 was Spleeter.

You may not have heard of Spleeter but you have probably heard of some products which use it. Spleeter is the best way we currently have to un-mix stereo material. If you want to isolate vocals from a track, or just the bass, or just the drums, Spleeter is probably behind the tools you’re aware of to do that. In the ‘old days’ of traditional processing, the best we could hope to do was to create the equivalent of a karaoke button as found on some questionable stereo systems of a couple of decades ago. This used a technique called centre channel removal, which exploited the fact that vocals were usually in the centre of the stereo image: by inverting the polarity of one channel and summing both channels into mono, the information in the centre of the mix was removed.
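Centre channel removal is simple enough to sketch in a few lines. What follows is a minimal illustration, not how any particular karaoke button was implemented. The test signals (`vocal`, `guitar`) and the function name `remove_centre` are made up for the example, and it assumes the vocal is identical in both channels while the guitar is panned hard left.

```python
import math

def remove_centre(left, right):
    """Invert the polarity of the right channel and sum both channels to mono."""
    return [l + (-r) for l, r in zip(left, right)]

# Hypothetical test mix: a "vocal" panned dead centre, a "guitar" panned hard left.
n = 8
vocal = [math.sin(2 * math.pi * 0.10 * i) for i in range(n)]
guitar = [math.sin(2 * math.pi * 0.23 * i) for i in range(n)]

left = [v + g for v, g in zip(vocal, guitar)]   # vocal + guitar
right = list(vocal)                             # vocal only

# The centred vocal cancels; only the off-centre guitar is left, in mono.
residue = remove_centre(left, right)
```

The limits of the trick are visible here too: the result is mono, and anything else that sits in the centre of the mix (typically bass and kick) cancels along with the vocal.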

Spleeter is an open-source ‘Source Separation Engine’ released by streaming platform Deezer. It isn’t a plug-in, an application or even a product in the commercial sense. It is intended to help the research community in Music Information Retrieval access the power of state-of-the-art source separation algorithms.

It is available as part of some premium audio software. For example, iZotope’s Music Rebalance is based on Spleeter but presents it with an accessible UI (Spleeter is command line-driven). The results are extraordinary but the artefacts in what is a developing technology limit its usefulness in critical roles in pro audio. Maybe artefact-free de-mixing is going to be, like nuclear fusion, perpetually ‘X years’ away?

Sound Particles Space Controller

Sound Particles Space Controller is easy to understand but it’s so simple and so obvious that it makes me feel like it’s been quietly watching us struggle with surround panners, joysticks and the Dolby Atmos renderer, waiting for us to realise that, like astronomers before Galileo, we’ve been looking at it all wrong. Why would you look at a picture of the room you’re in when you’re already in that room? It sounds obvious but it took Sound Particles to connect the dots and realise that if we want to pan sounds in an immersive format then the quickest and most intuitive way to do it is to just point.

Using a smartphone as a pointing device and creating dynamic pan automation simply by pointing at where the sound should be is simple and clever but, possibly more significantly, it points the way towards new ways of interacting with audio during productions.

dearVR Spatial Connect is a great example of new ways in which audio can be interacted with. This software is designed to allow immersive audio as used in VR applications to be controlled without having to leave the VR environment. However, using an Oculus Rift style headset is probably a step too far for a typical film mix. This approach from Sound Particles, which brings the space closer to the operator rather than placing the operator in the space, is a brilliant take on how to engage with immersive audio in three dimensions.

Sound Radix Pi

Most of us appreciate the importance of time-aligning multiple microphones that pick up the same source at slightly different times. Two mics on a guitar cabinet sum both constructively and destructively, depending on the distance between them and the notes being played. However, constructive and destructive summing, which produces the comb filtering we associate with phase issues, can happen with any sounds that are combined. The issue is that the neat, near 1:1 relationship between the sounds being combined no longer exists if the sounds that are summing destructively are the decay of an 808 kick and a synth bass.
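For the simple two-mic case, the comb filter can be worked out directly. The sketch below assumes two equal-level copies of the same signal, a speed of sound of roughly 343 m/s, and a path difference chosen only for illustration. The function names are hypothetical.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # metres per second, approximate value at room temperature

def notch_frequencies(path_difference_m, count=3):
    """First few frequencies (Hz) that cancel when a signal is summed with a delayed copy."""
    delay = path_difference_m / SPEED_OF_SOUND
    # Cancellation occurs where the delay equals an odd number of half cycles.
    return [(2 * k + 1) / (2 * delay) for k in range(count)]

def summed_gain(freq_hz, path_difference_m):
    """Linear gain of direct + delayed copy at one frequency (0 = null, 2 = full sum)."""
    delay = path_difference_m / SPEED_OF_SOUND
    return abs(1 + cmath.exp(-2j * math.pi * freq_hz * delay))

# A path difference of 0.343 m is a delay of about 1 ms.
notches = notch_frequencies(0.343)
gain_at_first_notch = summed_gain(notches[0], 0.343)
gain_between_notches = summed_gain(2 * notches[0], 0.343)
```

With a 1 ms delay the nulls fall at roughly 500 Hz, 1.5 kHz and 2.5 kHz, with full reinforcement in between. This tidy pattern depends on both mics hearing the same source, which is exactly what is missing when the two sounds are an 808 and a synth bass.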

This is where Sound Radix Pi and its dynamic phase rotation can solve an issue that simply isn’t fixable in any other way. Different, unrelated elements of a mix can combine without issue some of the time but cause phase cancellations on certain notes, so the issue is a moving target. A static solution, such as inverting the polarity using a phase button or offsetting a sound using a delay, might work, but only some of the time.

Static phase rotators, which use all-pass filters to manipulate the phase of signals, have existed in hardware and software form for years. Pi from Sound Radix is a dynamic phase rotator: it can change the phase relationship between unrelated sounds to keep them in phase with each other, even as the notes being played, and therefore the phase relationships between the sounds, change. It’s really clever and I can’t imagine how it does what it does. Luckily I don’t have to.
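The static building block, at least, is easy to illustrate. The sketch below evaluates the frequency response of a textbook first-order all-pass filter to show that it leaves level untouched while shifting phase. It says nothing about how Pi works internally. The sample rate, the 100 Hz break frequency and the function names are assumptions made for the example.

```python
import cmath
import math

def allpass_coefficient(break_freq_hz, sample_rate_hz):
    """Coefficient of a first-order all-pass whose phase shift is -90 degrees at break_freq_hz."""
    t = math.tan(math.pi * break_freq_hz / sample_rate_hz)
    return (t - 1) / (t + 1)

def allpass_response(freq_hz, coeff, sample_rate_hz):
    """Complex response of H(z) = (a + z^-1) / (1 + a z^-1) at one frequency."""
    z_inv = cmath.exp(-2j * math.pi * freq_hz / sample_rate_hz)
    return (coeff + z_inv) / (1 + coeff * z_inv)

SAMPLE_RATE = 48000.0   # assumed sample rate
BREAK_FREQ = 100.0      # assumed break frequency, chosen for a bass-range example

a = allpass_coefficient(BREAK_FREQ, SAMPLE_RATE)
for f in (50.0, 100.0, 200.0, 1000.0):
    h = allpass_response(f, a, SAMPLE_RATE)
    print(f"{f:7.1f} Hz  gain {abs(h):.3f}  phase {math.degrees(cmath.phase(h)):8.2f} deg")
```

The gain stays at 1 at every frequency while the phase moves from 0 towards -180 degrees, passing -90 degrees at the break frequency. A fixed setting like this can only suit one particular combination of notes, which is the limitation a dynamic rotator sets out to address.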

Descript Studio Sound

Studio Sound is a feature of the remarkable Descript software, which first came to our attention as a way to transcribe audio to text. While transcription software has existed for some years, Descript performed impressively enough for us to pay attention. Descript is far more than just a transcription service though. It offers a refreshing take on video creation. It can do screen capture and live-action video and a few of the features that seek to use AI to simplify and streamline this process are redefining how this type of work is done.

A great example is the way a video can be edited by editing the text which Descript has transcribed. Remove a section of the video by deleting the relevant text. You can even delete ‘filler words’, meaning you can automatically remove “umms” and “errs”. There is far more to Descript, but the feature which earns it a mention in this list of audio software is Studio Sound. Still in beta at present, this feature uploads your VO to Descript’s servers and does a frighteningly good job of the kind of processing we might apply as a matter of course: EQ and compression, plus some more esoteric processing such as removing ambience. These are all tasks a skilled audio engineer can do using some basic, and some specialised, plugins, but the fact that Studio Sound does such a good job automatically is truly remarkable.

iZotope Neutron

AI and machine learning are going to transform audio production over the next decade, and this process is already well underway. We’ve seen hints of what this means for post-production in the example of Studio Sound. To see hints of what this means for music production we could look at iZotope and specifically at Neutron.

Neutron offers assistive audio technology that listens to your track and suggests settings for track balance, EQ and compression. It also presents a radically different approach to mixing: sounds are placed on a rectangle, with panning and level corresponding to the XY position and stereo width accessible from the ‘blobs’ you place in the stereo field. For people used to the 20th-century paradigm of music production based around a tape machine and a mixing console, and of course all the DAW software which is based on this way of working, some aspects of Neutron are a little shocking.

The idea that the initial track balance, EQ and compression can be created automatically through iZotope’s assistive technology makes many engineers uncomfortable. There is a frequent criticism that such approaches, where a large data set of recorded music is used as a basis for comparison and the software tries to make the unmixed track more like that typical fingerprint of recorded music, will squeeze out originality and cause a convergence to a point where all music sounds the same.

This is true but it is arguably exactly what the majority of people want from a mix. How many times have you heard the phrase “sounds like a record” being used as a measure of success? Most people want their music to sound like a ‘record’ and that has to mean, to some extent, sounding like everything else.

In Conclusion

Audio software can occasionally feel like lots of versions of the same thing being endlessly repackaged. Wry comments about yet another 1176 plugin abound but, on closer examination, there are a lot of contenders for inclusion on a list of truly innovative software that does things that confront our expectations and sometimes confound them!

What would you have included in this list?
