Production Expert


Ambisonic Formats Explained - What Is The Difference Between A Format And B Format


In this article, we are going to look at some terminology associated with Ambisonics without going as far as workflow or specific tools and plug-ins. Ambisonics can be confusing for those new to it and rather than diving straight into how to work with Ambisonics content, we are going to look at what we get when presented with Ambisonics assets. It’s important to understand what you have before you start considering how you might use it.

Ambisonics has been around a long time but has gained relevance it didn’t previously enjoy, courtesy of the interest in immersive VR, gaming, and 360-degree video content, particularly since gaining support from Facebook and YouTube. A new generation of tools that can decode Ambisonics content to binaural for monitoring via headphones, together with the advent of head tracking technology such as Waves NX, has removed the earlier barriers to properly monitoring and consuming Ambisonics content.

There are higher-order Ambisonics formats, which use more channels to give better localisation, but the most basic form, first-order Ambisonics, gives full spherical capture with 4 channels and can be captured using a tetrahedral mic with 4 closely spaced capsules. Second-order Ambisonics uses 9 channels and third-order Ambisonics uses 16 channels. An example of a third-order Ambisonics mic would be the Zylia microphone system. Pro Tools has had support for Ambisonics formats up to 3rd order since Pro Tools 12.8.
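The channel counts quoted above (4, 9 and 16) follow the usual full-sphere rule of (order + 1) squared. As a minimal sketch, with a function name that is purely illustrative:

```python
def ambisonic_channel_count(order: int) -> int:
    """Channels needed for full-sphere Ambisonics of the given order."""
    if order < 0:
        raise ValueError("order must be zero or greater")
    # Full-sphere Ambisonics uses (order + 1) squared channels.
    return (order + 1) ** 2


for order in (1, 2, 3):
    print(f"order {order}: {ambisonic_channel_count(order)} channels")
```

This is handy as a quick sanity check when you are handed a multichannel file and need to work out which order it is likely to be.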

If you want to know how to use Ambisonics in Pro Tools then these tutorials from Avid’s Simon Sherbourne are an excellent place to start. If what you need is some orientation in terms of what Ambisonics is, how it is captured and what kind of assets you might have to deal with, then read on.

Ambisonics Mics - A Format

Tetrahedral Mic Capsule Array

A tetrahedral mic is a first-order Ambisonics mic. It captures four channels of audio, one for each capsule. The output from such a mic is referred to as A format. Because the spacing between capsules isn’t standardised, this raw A format signal is specific to each model of microphone and so isn’t suitable as an interchange format. Instead, A format is converted to B format, which is used for interchange, though unfortunately even this isn’t totally standardised!

B Format

First-order B format, somewhat confusingly, also uses 4 channels, but while it is derived from A format it isn’t the same thing. The conversion renders the raw output of the capsules to a perfectly aligned set of signals: the spacing between the capsules is compensated for, so the sound is recorded as if from a single virtual point in space, and the four channels represent the outputs of four virtual microphones. The clever bit is that by manipulating the relative levels of these four channels, the output of any first-order microphone pointing in any direction can be created. You can choose the orientation and polar pattern of your microphone “after the fact”. In fact, several virtual microphones can be derived at once to form virtual stereo arrays or more.
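To make the "after the fact" idea concrete, here is a rough sketch of how a single virtual microphone could be derived from one sample of each B format channel. It assumes W is carried at the same gain as X, Y and Z, that azimuth is measured anticlockwise from the front (so positive angles point left) and that elevation is positive upwards. The function name and the pattern parameter are illustrative, not taken from any particular tool.

```python
import math


def virtual_mic(w, x, y, z, azimuth_deg, elevation_deg=0.0, pattern=0.5):
    """Return one output sample of a first-order virtual microphone.

    pattern blends omni and figure-8:
    1.0 = omni, 0.5 = cardioid, 0.0 = figure-8.
    Assumes W has the same gain as X, Y and Z (an assumption of this sketch).
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    # Figure-8 component aimed in the requested direction.
    directional = (
        x * math.cos(az) * math.cos(el)
        + y * math.sin(az) * math.cos(el)
        + z * math.sin(el)
    )
    return pattern * w + (1.0 - pattern) * directional


# A unit-level source directly in front: W = 1, X = 1, Y = 0, Z = 0.
front_source = (1.0, 1.0, 0.0, 0.0)
print(virtual_mic(*front_source, azimuth_deg=0))    # cardioid facing the source
print(virtual_mic(*front_source, azimuth_deg=180))  # cardioid facing away
```

Calling the function twice with different azimuths (for example +45 and -45 degrees) would give a simple virtual stereo pair from the same recording.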

Ambisonics As A Format Agnostic Interchange Format

Ambisonics is the standard for 360 video content. It captures a full spherical sound field to as few as four channels and, while it can be used for surround content, it is not the same as any of the surround formats in that it is not specific to any particular speaker format (5.1, 7.1 etc).

Ambisonics has a few distinct advantages over traditional surround formats and as such is a popular “format agnostic” interchange format for audio. It is for this reason that Ambisonics sound effects libraries are popular. They can be decoded to 5.1, 7.1, stereo, Dolby Atmos or anything else because the full spherical sound field is available and to create any of these formats is just a matter of excluding some of that information.

B format works a bit like MS. You can do MS using a figure-8 and an omni, rather than the more usual cardioid, and this would be just like the W and Y channels of B format. MS using an omni as the mid mic decodes to an array similar to two back-to-back cardioids facing left and right, whereas using a cardioid as the mid mic gives a response closer to an XY pair of cardioids.
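The MS comparison can be checked numerically. The sketch below treats W as the mid signal and Y as the side signal for a source moving around the horizontal plane, and applies the familiar sum-and-difference MS decode. It assumes W at unity gain and a left-facing figure-8 for Y; the function names are illustrative.

```python
import math


def w_y_gains(azimuth_deg):
    """Gains picked up by W (omni) and Y (left-facing figure-8).

    Azimuth is measured from the front, positive to the left.
    W at unity gain is an assumption of this sketch.
    """
    return 1.0, math.sin(math.radians(azimuth_deg))


def ms_decode(mid, side):
    """Classic MS sum and difference: left = M + S, right = M - S."""
    return mid + side, mid - side


for azimuth in (90, 0, -90):  # hard left, centre, hard right
    w, y = w_y_gains(azimuth)
    left, right = ms_decode(w, y)
    print(f"azimuth {azimuth:>4}: left {left:.2f}, right {right:.2f}")
```

A source hard left ends up only in the left output and a source hard right only in the right output, with the centre equal in both, which is the behaviour of two back-to-back cardioids facing left and right.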

B format’s four channels are analogous to four virtual microphones. The technical explanation of how these channels are derived isn’t necessary here, because describing them in terms of their virtual microphone equivalents is so familiar to audio people. For example, the W channel is based purely on amplitude: it carries level without reference to direction, which is another way of describing the output of a virtual omni mic.

The four channels of B format can be thought of as representing the output of:

  • W is an omnidirectional polar pattern, containing all sounds in the sphere, coming from all directions at equal gain and phase.

  • X is a figure-8 bi-directional polar pattern pointing forward.

  • Y is a figure-8 bi-directional polar pattern pointing to the left.

  • Z is a figure-8 bi-directional polar pattern pointing up.
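Read the other way round, the four polar patterns listed above tell you how a mono source at a known direction would appear in each channel. The following is a minimal sketch of that idea, assuming W at unity gain, azimuth measured from the front with positive angles to the left, and elevation positive upwards. The function name is hypothetical.

```python
import math


def encode_first_order(sample, azimuth_deg, elevation_deg=0.0):
    """Place one mono sample into W, X, Y, Z at the given direction."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample                                # omni: no directional weighting
    x = sample * math.cos(az) * math.cos(el)  # figure-8 pointing forward
    y = sample * math.sin(az) * math.cos(el)  # figure-8 pointing left
    z = sample * math.sin(el)                 # figure-8 pointing up
    return w, x, y, z


# A source hard left lands in W and Y only (X and Z are effectively zero).
print(encode_first_order(1.0, azimuth_deg=90))
```

A source behind, to the right of, or below the listener simply produces a negative value in the relevant figure-8 channel, just as it would in the rear lobe of a real figure-8 mic.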

How Does the Sennheiser Ambeo Mic Create B Format?

As an example, the Sennheiser Ambeo mic has 4 matched KE14 cardioid capsules. These are the same capsules as are found in the premium e914 pencil condenser.

These capsules are named:

  • FLU - front-left-up

  • FRD - front-right-down

  • BLD - back-left-down

  • BRU - back-right-up

These A format channels are combined in the following ways to create the B format signal:

  • W = FLU + FRD + BLD + BRU

  • X = FLU + FRD – BLD – BRU

  • Y = FLU – FRD + BLD – BRU

  • Z = FLU – FRD – BLD + BRU
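The four sums and differences above translate directly into code. This sketch works on one sample per capsule and deliberately leaves out the gain scaling and the filtering that a real converter would use to compensate for capsule spacing, so treat it as an illustration of the matrix only. The function name is illustrative.

```python
def a_to_b(flu, frd, bld, bru):
    """Combine one sample from each A format capsule into W, X, Y, Z.

    Only the sum/difference matrix is shown; capsule-spacing
    compensation and level scaling are omitted in this sketch.
    """
    w = flu + frd + bld + bru  # everything summed: omni
    x = flu + frd - bld - bru  # front minus back
    y = flu - frd + bld - bru  # left minus right
    z = flu - frd - bld + bru  # up minus down
    return w, x, y, z


# Signal present only in the two front capsules:
print(a_to_b(1.0, 1.0, 0.0, 0.0))
```

With signal only in the two front capsules, W and X carry it while Y and Z cancel, because one front capsule faces left-up and the other right-down.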

It looks unfriendly but if you understand how Mid/Side works it becomes pretty straightforward to see how the different polar responses can be derived.

A criticism that has been made of first-order Ambisonics mics is that the tetrahedral array of capsules shares a potential weakness with the XY array of coincident cardioids: sources directly in front of an Ambisonics mic are off-axis to all the capsules. By comparison, an advantage of a cardioid/figure-8 MS array over an XY pair of coincident cardioids is that the M mic is on-axis to the centre of the stereo image, whereas the centre is off-axis to both the left and right mics in an XY array.

Whether or not you see this as a potential weakness very much depends on how important you see the tetrahedral array’s lack of front bias to be. It doesn’t favour sounds arriving from any direction over any other. It should also be remembered that tetrahedral arrays use small-diaphragm capsules because these can be positioned as close as possible to each other so as to act as a virtual capture point, and because small diaphragm mics tend to have a better off-axis response.

A reason why Ambisonics mics have tended to be relatively costly is that small-diaphragm condenser or electret mics have higher self-noise than their large-diaphragm equivalents. Using four capsules means four times the capsule cost and four sources of self-noise feeding every B format channel, so higher quality components are needed to keep the noise down to acceptable levels, further adding to cost.

B Format Channel Order

As has already been referred to, B format is a standard interchange format but unfortunately, it isn’t quite as standard as it might be. This is because there are two alternative B formats and they aren’t compatible. They are AmbiX and FuMa. While they contain the same channels, the order in which they are presented is different, as is the relative level of the W channel compared to the X, Y and Z channels. This gain difference is referred to as a “Normalisation Standard”.

  • Furse-Malham standard (FuMa) is an older standard that is still supported by plug-ins and processing tools. It has a channel order of W, X, Y, Z and the W channel is attenuated by 3 dB relative to the other channels.

  • AmbiX is more modern and has been widely adopted by platforms such as YouTube. It uses ACN ordering (W, Y, Z, X) and SN3D normalisation (which, at first order, means identical gain across all four channels).
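For first-order material, the two differences described above (channel order and the 3 dB offset on W) are all that separates the two formats, so converting between them can be sketched in a few lines. This is a first-order-only illustration with hypothetical function names; higher orders need additional per-channel gains that are not covered here.

```python
import math

SQRT2 = math.sqrt(2.0)  # a factor of sqrt(2) is roughly 3 dB


def fuma_to_ambix(w, x, y, z):
    """First-order FuMa (W, X, Y, Z) to AmbiX (W, Y, Z, X).

    W is boosted by about 3 dB to undo the FuMa attenuation,
    then the channels are reordered to ACN.
    """
    return w * SQRT2, y, z, x


def ambix_to_fuma(w, y, z, x):
    """First-order AmbiX (W, Y, Z, X) back to FuMa (W, X, Y, Z)."""
    return w / SQRT2, x, y, z


fuma_sample = (0.5, 0.1, 0.2, 0.3)  # W, X, Y, Z
print(fuma_to_ambix(*fuma_sample))   # comes out as W, Y, Z, X
```

If a mix sounds rotated or the ambience level seems wrong after import, a mismatch between these two conventions is a likely cause.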

Whichever format you deliver your Ambisonics file in, it is vital to keep track of the standards you are using. Otherwise information will end up in the wrong position.

We hope this primer on Ambisonics, A format and B format has cleared up any questions you might have. Ambisonics isn’t difficult but it is different and to many, unfamiliar. As with so many things, if you understand the building blocks then the complex workflows into which Ambisonics can be incorporated become easier to understand.
