
How Artificial Intelligence Could Help With Accurate ADR Sync And More

Anyone who has worked on ADR, short for Automatic Dialogue Replacement, will know that there is nothing automatic about it. Most of the time ADR is about the skill of the actors, directors and audio team in getting a good take that will match the picture. There are tools that help, like VocALign and Revoice Pro from Synchro Arts, but what if you could change the video to get sync rather than the audio? Check this out…

Now this is a demo by the developers Synthesia, and so the left-hand example would never get past any ADR engineer, but the right-hand image looks remarkable. Normal ADR is not the only thing this AI-powered tech can do.

The CEO of a UK startup pioneering this technology believes that in three years we could have computer-generated versions of actors that are indistinguishable from real humans, with some branding this as ‘deepfake’ technology.

Victor Riparbelli, who co-founded Synthesia just two years ago, has said that the goal is to break into the world of TV and film special effects.

"As the company moves forward we are going to expand our platform and the plan is to start working with film and entertainment and make ideas come to life much [more easily] than they are today."

He went on to explain that the tech Synthesia is developing is the same basic process that is already used in Hollywood films…

"we're just doing it with neural networks which make the process completely automatic."

From this demo, it would seem there are already some great applications in the film and TV industries. Maybe it could signal the end of back-of-the-head shots when the script gets changed after shooting. With this tech, we could get the actor to voice up the new lines and then the AI could manipulate the video to make the actor's lips match the new script.
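
To make that workflow concrete, here is a minimal sketch of what such a pipeline could look like, assuming a forced aligner for the new dialogue and a neural lip-sync model for the picture. Every function and file name below is a hypothetical placeholder, not Synthesia's actual software.

```python
# Hypothetical sketch of "ADR in reverse": instead of warping the
# replacement audio to fit the picture, the picture is re-rendered to
# fit the replacement audio. All names here are illustrative
# placeholders, not Synthesia's actual software.

from dataclasses import dataclass


@dataclass
class Phoneme:
    symbol: str   # e.g. "HH", "AY"
    start: float  # seconds into the new dialogue recording
    end: float


def align_dialogue(audio_path: str, script: str) -> list[Phoneme]:
    """Forced alignment: map each phoneme of the new script to a time
    range in the actor's replacement recording (stubbed here; a real
    system would use a speech aligner)."""
    return [Phoneme("HH", 0.00, 0.08), Phoneme("AY", 0.08, 0.25)]  # "Hi"


def resync_video(video_path: str, phonemes: list[Phoneme]) -> str:
    """Drive a neural lip-sync model with the phoneme timings so the
    actor's mouth matches the new line (stubbed; a real model would
    regenerate the mouth region frame by frame)."""
    out_path = video_path.replace(".mov", "_resynced.mov")
    print(f"Re-rendering {len(phonemes)} phonemes of mouth movement into {out_path}")
    return out_path


# The ADR session itself is unchanged: record the actor's new line,
# then fix the picture instead of stretching the audio.
timings = align_dialogue("adr_take_03.wav", "Hi")
new_picture = resync_video("scene_12_take_04.mov", timings)
```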

Synthesia first generated interest in what they could do when they demonstrated their technology on BBC news anchor Matthew Amroliwala: using their artificial intelligence software, he suddenly appears to be speaking Spanish, Mandarin and Hindi. The software first mapped and then manipulated Matthew's lips to mouth the different languages. BBC Click's Lara Lewington finds out more…


More recently the company applied its tech to soccer legend David Beckham. In collaboration with the campaign Malaria Must Die, Synthesia manipulated Beckham's facial features so that nine malaria survivors were able to speak through him — in nine different languages. Check out this video with David Beckham…

Foreign-language dubbing becomes so much easier. Apparently the actual filming on the day was almost identical to a normal shoot. The only difference is that, ahead of the shoot, Synthesia had to train their algorithm to learn Beckham's face.

To do this, Beckham just had to talk to camera. There was no need for a script, although the team sometimes has to suggest topics for people to talk about, such as what they had for breakfast that morning, because people can dry up in front of the camera.

This footage is then fed to an algorithm that learns how Beckham's face moves and creates a digital model of him. But unlike some systems, we understand this process only takes about three to four minutes. CEO Victor Riparbelli explains…

“Once you've done that, it doesn't require any special hardware or cameras or anything like that."
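
Putting his description together, the workflow splits into a one-off training stage and a reusable rendering stage. Here is a minimal sketch of that two-stage idea, again with hypothetical names; it shows the shape of the workflow rather than any real implementation.

```python
# Hypothetical sketch of the two-stage workflow described above: a
# one-off training stage that learns a face model from a few minutes
# of talking-head footage, and a rendering stage that drives the model
# with any new audio. Names are illustrative placeholders.


class FaceModel:
    """A digital model of one person's face, learned from footage."""

    def __init__(self, person: str):
        self.person = person

    @classmethod
    def train(cls, footage_path: str, person: str) -> "FaceModel":
        # A real system would learn how the face moves while speaking;
        # per the article, a few minutes of footage is enough.
        print(f"Learning {person}'s face from {footage_path}")
        return cls(person)

    def render(self, audio_path: str, out_path: str) -> str:
        # Once trained, no special hardware or cameras are needed:
        # any audio track can drive the model.
        print(f"Rendering {self.person} speaking {audio_path} -> {out_path}")
        return out_path


# Stage one: train once on unscripted talk-to-camera footage.
beckham = FaceModel.train("beckham_talking_head.mp4", "David Beckham")

# Stage two: reuse the same model for each dubbed language.
for lang in ("es", "zh", "hi"):
    beckham.render(f"survivor_message_{lang}.wav", f"beckham_{lang}.mp4")
```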

Where Else Could This Technology Be Used?

In recent years there have already been CGI versions of actors appearing in films like "Blade Runner 2049" or "Rogue One: A Star Wars Story." However, these 'digital actors' can fall into what's described as the "uncanny valley": the digital actor is too realistic to be cute (like a cartoon), but not realistic enough to be totally convincing. It's not quite convincing yet. But 'yet' would appear to be the operative word here. Victor Riparbelli thinks that with their technology they are close to getting rid of the uncanny valley.

"I think in the next three years we will see a significant improvement in how we can create digital humans," he said.”

He added that Synthesia can already make photorealistic humans,

"we just can't do it with films yet."

What Do You Think?

This technology is also being linked with deepfake applications. Does that worry you? Or does the possibility of much better ADR fits and more realistic foreign-language dubbing win the day? Let us know what you think in the comments below…
