In Summary
AI can seem like a catch-all term for any Magic Plugin that does great things in a mysterious way. With an increasing number of products riding the same wave, we ask whether the AI label is always justified.
Going Deeper
The last few years have seen rapid growth both in the use of AI technology and in its place in public awareness. It has been gaining traction in learning assistance and speech recognition and synthesis; in medicine and healthcare, such as drug design and biotechnology; and in industry, such as the development of nanomaterials, nuclear fusion and quantum chemistry.
Its use in other sectors can raise genuine life-and-death ethical questions, but in our industry our questions are just as valid. Who ends up getting less work as a result of AI’s existence is a common concern, but on this side-note, the good news may well be that AI actually lets more people do more kinds of work.
Most audio pros will use the tool that does the job well with the least hassle. There are those who like to know what’s happening under the hood with any tool, while others are happy to press the button and move on. Both camps can benefit from AI tech, but where it runs the risk of morphing into a reductive buzzword, some will say that any artificial intelligence label needs to be justified or at least qualified in some way.
In an industry that sees its fair share of marketing hype, is the AI moniker being overused? While the actual workings of this tech are far beyond the scope of this article, we can ask more generally what AI currently can and can’t do, so that we can make more informed decisions when choosing products. If you’d like to know just enough to leave the rest to the Machines, or for those times when the marketing threatens to set off your studio’s BS alarm, read on…
Defining AI
Intelligence itself has been defined as a process that takes unstructured information to turn it into useful knowledge. Most know that AI stands for Artificial Intelligence, but some might be surprised to learn that even those who fully understand the tech concede that there is no exact definition of AI. That might come as music to the ears of anyone who wants to associate their product with its power!
In the video above, Demis Hassabis, the British neuroscientist, co-founder and CEO of DeepMind, and world authority on artificial intelligence, pins down a concise working explanation of AI:
AI is the science of making machines smart.
How Does It Do It?
Setting aside the mechanics of computer code, in highly simplified terms an AI system can consist of an Agent (the AI itself) and an Environment (the Problem). The Agent continually observes its Environment, and takes an Action towards solving that problem. The process repeats in a loop, and this can be thought of as Learning. This is analogous to the way humans think.
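For anyone who likes to peek under the hood, the observe-and-act loop can be sketched in a few lines of Python. This is only a toy under stated assumptions: the function name run_agent, the gain values and the step size are all made up for illustration, and nothing here learns in the ML sense. It simply shows an Agent repeatedly observing how far it is from a goal and taking a small Action towards it.

```python
def run_agent(target_db: float, start_db: float,
              step: float = 0.5, max_steps: int = 100) -> tuple[float, int]:
    """Nudge a gain value towards a target, one observation at a time.

    Hypothetical illustration only: the 'Environment' is just a target
    level, and the 'Agent' is this loop.
    """
    gain_db = start_db
    for steps_taken in range(max_steps):
        # Observe: how far is the current gain from the target?
        error = target_db - gain_db
        if abs(error) < 1e-9:
            return gain_db, steps_taken
        # Act: move towards the target, but never by more than `step` dB.
        gain_db += max(-step, min(step, error))
    return gain_db, max_steps


final_gain, steps = run_agent(target_db=-14.0, start_db=-20.0)
print(final_gain, steps)
```

A real ML system would also update its own behaviour based on what worked, which is the part this sketch leaves out.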
Machine Learning (ML) is one ingredient that sits under the umbrella term of Artificial Intelligence. It can sit alongside other elements of AI such as deep learning, robotics, or natural language processing. According to Google’s own learning pages:
While AI and ML are not quite the same thing, they are closely connected. AI is the broader concept of enabling a machine or system to sense, reason, act, or adapt like a human. ML is an application of AI that allows machines to extract knowledge from data and learn from it autonomously.
What AI Can Do
In the context of AI, a Problem is defined as a numerical task to solve (rather than just a PITA like going to the dentist or trashing your DAW’s preferences). According to Hassabis the three main types of problem well suited to AI are:
Problems with lots of data to crunch.
Those with data sets from different disciplines, such as Acoustics and Electronics.
Problems with a clear numerical objective to problem-solve ‘against’, such as “Find and match to the current loudness spec for Netflix”.
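To make the third point concrete, here is a minimal sketch of what a “clear numerical objective” might look like in code. The function names, the target value and the tolerance are placeholders chosen for illustration, not any platform’s actual delivery spec, and the measured loudness is assumed to come from a separate meter.

```python
def gain_to_target(measured_lufs: float, target_lufs: float) -> float:
    """Return the gain change in dB needed to land on the target loudness."""
    return target_lufs - measured_lufs


def within_tolerance(measured_lufs: float, target_lufs: float,
                     tolerance_lu: float = 2.0) -> bool:
    """True if the measurement already sits inside the allowed window."""
    return abs(measured_lufs - target_lufs) <= tolerance_lu


# Placeholder numbers: check the current delivery spec before relying on them.
measured, target = -23.0, -27.0
print(gain_to_target(measured, target))    # gain change in dB
print(within_tolerance(measured, target))  # is the mix already in spec?
```

The point is that success or failure is a number, which gives a machine something definite to aim at.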
How do all three points relate to audio? Well, with the vast majority of signals existing digitally there is plenty of lovely pure data for AI to bite down on. Most industries are multi-discipline to varying degrees, and audio also has a healthy complement of physics, electronics, and DSP-related tasks at which to point The Machine.
Many factors in audio engineering can be quantified, so it follows that any task needing a numerical result, such as “listen to the second chorus only and turn down any vocal peaks at 2kHz above -12dBFS by 3dB”, is achievable with AI. Once it has learned what is most likely to be the chorus (from data such as patterns in musical convention), the rest is ‘just’ numbers relating to level and frequency.
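The ‘just numbers’ half of that instruction is simple enough to sketch. In the hypothetical Python below, tame_peaks assumes someone (or something) has already measured the vocal level around 2kHz for each slice of the song and already worked out where the second chorus starts and ends; finding the chorus is the hard, learned part and is not shown.

```python
def tame_peaks(levels_dbfs: list[float], chorus: tuple[int, int],
               threshold_dbfs: float = -12.0, cut_db: float = 3.0) -> list[float]:
    """Cut any level above the threshold by `cut_db`, inside the chorus only.

    `levels_dbfs` holds one 2kHz-band level per time slice, and `chorus`
    is a (start, end) pair of slice indices, with `end` excluded.
    Both are assumed inputs for this illustration.
    """
    start, end = chorus
    return [
        level - cut_db if start <= i < end and level > threshold_dbfs else level
        for i, level in enumerate(levels_dbfs)
    ]


levels = [-10.0, -15.0, -9.0, -11.0, -20.0, -8.0]
print(tame_peaks(levels, chorus=(2, 5)))
# [-10.0, -15.0, -12.0, -14.0, -20.0, -8.0]
```

Peaks outside the chorus are left alone even when they are over the threshold, exactly as the instruction asks.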
What AI Can’t Do (Yet)
In any application (audio or otherwise), a lack of data, or too narrow a data set, will restrict the success of an ML or AI system. Moreover, some things are still almost impossible to quantify in the first place. One person’s Make It Good button will be entirely different to another’s, even within genres. Surveying engineers’ tastes with a score could be used to quantify it, but with opinions pulling in different directions the scores could end up evening out anyway. “Listen to the second chorus only and turn down any vocal peaks at 2kHz above -12dBFS by 3dB” might have an artistic result, but the action is purely technical and the command human-led.
In short, taste is very hard to put a number on.
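A quick, made-up example shows why scoring taste may not settle much. The scores below are invented for illustration, and the summarise helper is hypothetical: one mix splits the room, the other leaves everyone lukewarm, yet both come out with the same average.

```python
from statistics import mean, pstdev


def summarise(scores: list[int]) -> tuple[float, float]:
    """Return the average score and how widely opinions are spread."""
    return mean(scores), pstdev(scores)


# Invented scores out of 10 from four imaginary engineers.
divisive_mix = [9, 2, 8, 1]   # loved by some, hated by others
safe_mix = [5, 5, 5, 5]       # nobody minds it, nobody loves it

print(summarise(divisive_mix))
print(summarise(safe_mix))
```

Same average, very different rooms. A machine chasing the average alone would have no way of telling the two apart.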
So Is AI Just A Marketing Tool?
With interest around AI in many industries showing no signs of diminishing, audio products are also riding a wave of innovation. Sometimes this will be merely ‘intelligent’ and at other times AI-driven by definition.
It’s been said that audio engineering is the place where art and science go to socialise. The science part of things is easy enough to imagine rubbing shoulders with something as technical as AI. That said, the subject of Art has plenty of Quality, with the question of its Quantity being harder to pin down. When it comes to the crunch, AI may be able to get objective results from the numbers, but right now it doesn’t actually know what it’s doing. By extension, it cannot listen.
Any tool that uses a learning process that feeds the output back to the input to improve itself can rightly claim to be driven by ML. The AI label, however, suggests a much wider MO, taking in extra tech such as deep learning, neural networks, or natural language processing. Therefore any messaging that contains ML is likely to be more than just a marketing tool. Those brands flashing the AI badge in place of ML may be using a technicality to their advantage. No-one’s suggesting wilful misinformation by brands, but it’s safe to say that this area is where the enormous chasm between devs and marketing departments is most stark.
When intellectual property is so precious, we can’t reasonably expect developers to expose their secrets. All we can know is that some tools do indeed use elements of AI that can include its ML puzzle-piece. Whatever the means, artistic merit still rests well and truly on the shoulders of humans. Luckily, all we need to keep doing is listening and learning.
A Word About This Article
As the Experts team considered how we could better help the community, we realised that some of you are time-poor and don’t have the time to read a long article or watch a long video. In 2023 we are going to be trying out articles that have the fast takeaway right at the start and then an opportunity to go deeper if you wish. Let us know if you like this idea in the comments.
Studio photo by Techivation
Data photo by Markus Spiske on Unsplash
Console photo by Samuel Spagl on Unsplash
Chart Photo by Lukas