Keep Your Friends Close and Your Microphones Closer

Obviously, microphones need to be close to the person speaking for optimal sound quality. Or do they?

Great audio quality for today’s conference rooms has taken a back seat to room aesthetics. It is common knowledge, or at least it should be, that placing a microphone as close as possible to the person speaking will provide the best sound quality. A few other important factors that come into play are room acoustics, microphone type, and digital signal processing. Mic placement is certainly top priority and always has been. Yet, there is a growing trend for placing microphones on the ceiling…far, far away.

 

There is another audio component to consider when designing a conferencing space, and it is the deal breaker for ceiling mics. How is the audio being encoded? Audio codecs play an extremely important role in the overall sound quality of a conferencing room, and the type of audio codec used will be the difference between subpar sound and a great experience.

 

What the heck is an audio codec? Basically, without getting into the nuts and bolts, it takes the sound waves that are picked up by the microphones, encodes them, sends them along to the far end, and decodes them for playback. Of course, it’s a little more complicated than that, but for the sake of simplicity we will stop there. There are several different kinds of audio codecs, used for several different applications. We will focus on a few common codecs that apply to 99% of all conference rooms that involve audio conferencing.

 

Generally, humans with good ears can hear frequencies between 20 Hz and 20 kHz. 20 Hz is the deep bass that makes the hair on your arms vibrate, whereas 20 kHz is a high-pitched, piercing sound. The part of that range that matters most for speech perception and intelligibility falls between roughly 300 Hz and 3 kHz. This is the range to which human hearing is most sensitive, and it is approximately the range used by the standard telephony narrowband codec, G.711.

 

The G.711 audio codec has been the standard for digitizing telephone audio since 1972. The G.711 codec passes frequencies from 300 Hz up to 3.4 kHz and uses a sample rate of 8 kHz. Sampling refers to measuring the original signal at a constant time interval. An 8 kHz sample rate takes 8,000 slices of the waveform every second. These samples are transmitted to the far end, reconstructed, and the listener hears a representation of the original audio wave. 8,000 samples per second might sound like a lot, but it’s not. For comparison, CD quality audio is sampled at 44.1 kHz, and DVD-Audio can be sampled at 96 kHz. With its narrow bandwidth and low sample rate, the G.711 codec provides “toll quality” audio. Okay for a phone handset, not okay for a ceiling microphone.
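For anyone who wants to check the arithmetic, here is a minimal Python sketch. It relies on two facts not spelled out above: a sampled signal can only represent frequencies up to half its sample rate (the Nyquist limit), and G.711 stores each sample in 8 bits. The function names are made up for illustration.

```python
def nyquist_hz(sample_rate_hz: float) -> float:
    """Highest frequency a given sample rate can represent (half the rate)."""
    return sample_rate_hz / 2


def pcm_bit_rate(sample_rate_hz: int, bits_per_sample: int, channels: int = 1) -> int:
    """Raw bit rate in bits per second for uncompressed sampled audio."""
    return sample_rate_hz * bits_per_sample * channels


# (sample rate in Hz, bits per sample, channels)
formats = {
    "G.711 phone call": (8_000, 8, 1),
    "CD audio": (44_100, 16, 2),
}

for name, (rate, bits, channels) in formats.items():
    print(
        f"{name}: tops out at {nyquist_hz(rate):,.0f} Hz, "
        f"{pcm_bit_rate(rate, bits, channels):,} bits per second"
    )
```

An 8 kHz sample rate cannot carry anything above 4 kHz, which is why the G.711 passband stops at 3.4 kHz. The phone call works out to 64,000 bits per second, a small fraction of what a CD uses.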

 

Wideband audio codecs provide much improved audio quality. G.722 is the standard wideband audio codec found in the majority of VoIP systems, and AMR-WB has become the standard for cell phone carriers. Wideband audio provides both a larger frequency range and a better sample rate, 50 Hz through 7 kHz and 16 kHz respectively. The wider frequency response provides a more natural sound, and the 16 kHz sample rate delivers a truer representation of the original sound wave.

 

Wideband audio codecs are not a new technology. They have been used in video conferencing forever, going back to ISDN lines. Wideband audio is used for Skype, Zoom, BlueJeans, Apple FaceTime, and most other soft conferencing applications. VoIP phone systems offer wideband audio. Over the past few years, wideband audio has even been implemented by cell phone carriers. All the big players now offer wideband audio. The difference in sound quality is noticeable right away, from the first syllable.

Multi-purpose rooms with movable tables are all the rage. Too often, integrators sell the idea to the customer that they can make this multipurpose room a usable audio conferencing space. The integrator will hang some ceiling mic arrays, throw in a nice expensive DSP, and promise the customer the room will be awesome. The room is never awesome. Complaints start to roll in. The far-end users can’t hear. It sounds echoey. It sounds tinny. It sounds hollow. It sounds like the call is going through a soup can with a string attached to it. The audio programmer is called back to the site to make some tweaks. There are no tweaks. It can’t be EQ’d. It’s not the gain structure, and it’s not a level adjustment. It’s the codec, and it can’t be fixed. The programmer sits in the room, waiting for divine intervention. In the end, no one is happy.

 

The not-so-funny thing is the same room probably sounds pretty darn good on a video call. Wideband audio is much more forgiving than narrowband audio. Even though the meat of the human voice falls into the narrowband frequency range, there are still frequencies and harmonics that are outside the 300 Hz – 3.4 kHz range. These frequencies are really important for overall intelligibility. Frequencies below 300 Hz provide the fullness and depth, and the frequencies above 3.4 kHz are responsible for clarity and brightness. Ever notice how difficult it is to spell words or names over the phone? “F” like Frank, “S” like Sam. The letter “s” and the letter “f” sound nearly identical on a phone call. The hissing, or sibilance, used to make the “ess” sound falls above the 3.4 kHz upper limit of narrowband audio. Those frequencies are cut off and are not heard by the far end.
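To make the cutoff concrete, here is a small Python sketch that checks which parts of a voice survive each codec. The two passbands are the ones quoted in this article. The three voice components and their frequencies are rough, assumed values chosen only for illustration, and a real codec filter rolls off gradually rather than cutting at a hard edge.

```python
# Passbands as quoted in this article, in Hz.
NARROWBAND_HZ = (300, 3_400)  # G.711
WIDEBAND_HZ = (50, 7_000)     # G.722


def in_band(freq_hz: float, band: tuple) -> bool:
    """True if the frequency falls inside the codec's passband."""
    low, high = band
    return low <= freq_hz <= high


# Assumed, illustrative frequencies; real speech spreads across many.
voice_components = {
    "fullness and depth": 150,
    "core vowel energy": 1_000,
    "sibilance of 's'": 5_000,
}

for label, freq in voice_components.items():
    narrow = "kept" if in_band(freq, NARROWBAND_HZ) else "lost"
    wide = "kept" if in_band(freq, WIDEBAND_HZ) else "lost"
    print(f"{label} ({freq} Hz): narrowband {narrow}, wideband {wide}")
```

With these assumed values, narrowband keeps only the core vowel energy, while wideband keeps all three.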

 

Narrowband audio was designed to provide adequate sound using a phone handset. The handset microphone is inches away from the person’s mouth. Narrowband audio is not adequate for ceiling mics that are 10 feet from the person speaking. Ceiling mics are just too far away, plain and simple. By the time the sound wave reaches the mic, the signal is already degraded due to distance, thanks to the inverse square law. The mic also begins to pick up more reflections as opposed to direct sound. More reflections, degraded signal, low sample rate, narrow bandwidth: a recipe for disaster. Beamforming mic arrays, voice tracking mics, and steerable lobes may help, marginally. The sound still won’t be great. Good luck explaining to the customer why the $3,000 ceiling mic sounds like a bad speakerphone.
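The inverse square law is easy to put numbers on. The Python sketch below assumes an idealized case: a point source in a free field with direct sound only, where the level falls by 20 × log10 of the distance ratio. The 2-inch handset distance is an assumed figure for illustration; the 10-foot ceiling distance comes from the paragraph above.

```python
import math


def level_drop_db(near_distance: float, far_distance: float) -> float:
    """Drop in direct sound level, in dB, when moving from near to far.

    Both distances must use the same unit. Free-field point source assumed.
    """
    return 20 * math.log10(far_distance / near_distance)


handset_in = 2        # assumed: handset mic about 2 inches from the mouth
ceiling_in = 10 * 12  # ceiling mic 10 feet away, converted to inches

print(f"Each doubling of distance: {level_drop_db(1, 2):.1f} dB quieter")
print(f"Handset to ceiling mic: {level_drop_db(handset_in, ceiling_in):.1f} dB quieter")
```

Under these assumptions the talker's direct sound arrives at the ceiling mic roughly 35 dB quieter than at a handset, while the room reflections do not fall off nearly as much.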

 

Even though wideband audio is far superior, there are qualifications that need to be met in order to reap the benefits. All devices in the chain must be compatible. If there is a VoIP system involved, it will need to be configured to use the G.722 audio codec. The far end user must also have G.722 enabled on their VoIP system. If a conferencing bridge is involved, it too must be wideband audio compliant. If there is an analog phone line anywhere in the system, forget about wideband audio. POTS lines do not support it. If a call hits the public switched telephone network (PSTN), sorry, no luck here either. Experiencing HD audio on a cell phone requires a compatible device on both ends as well, and at least 3G service or Wi-Fi. If any devices involved in the call do not meet the parameters, the devices will negotiate down to the lowest common codec, narrowband G.711.
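The "lowest common codec" behavior can be sketched in a few lines of Python. This is a deliberately simplified model, not how any real phone system or signaling protocol implements negotiation: each device is just a set of codec names, and the `negotiate` function and `PREFERENCE` list are hypothetical names made up for this example.

```python
# Best codec first. Hypothetical preference order for this sketch.
PREFERENCE = ["G.722", "G.711"]


def negotiate(*devices):
    """Return the best codec that every device in the call path supports.

    Returns None if the devices share no codec at all.
    """
    common = set.intersection(*(set(d) for d in devices))
    for codec in PREFERENCE:
        if codec in common:
            return codec
    return None


conference_room = {"G.722", "G.711"}
voip_bridge = {"G.722", "G.711"}
pstn_caller = {"G.711"}  # analog or PSTN leg: narrowband only

print(negotiate(conference_room, voip_bridge))               # G.722
print(negotiate(conference_room, voip_bridge, pstn_caller))  # G.711
```

One narrowband-only participant is enough to pull the result down to G.711, no matter how capable the rest of the chain is.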

 

It is best practice to stay away from ceiling microphones for audio conferencing, regardless of room size or type. The results will more than likely disappoint. It’s impossible to guarantee 100% wideband audio compatibility throughout the call path. It only takes one incompatible device on the call to “dumb” everything down to narrowband audio. Follow these guidelines for a better conferencing experience, a satisfied customer, and most importantly, a much happier audio programmer.


 

 

This article was contributed by our Project Manager, Anthony Ferraro. Anthony has worked with Synergy since 2012, and his calm and “get-it-done” personality makes him a great liaison between all trades on a job site. Anthony is also a Biamp-certified audio programmer whose communication skills expedite and enhance projects large and small. Outside of work, Anthony enjoys spending as much time as he can with his wife and three daughters.