Ray Adensamer argues that Voice Quality Enhancement can help deliver the standard required of VoIP conferencing systems
Audio conferencing services based on circuit-switched networks and audio bridging equipment have provided hosted conferencing users with a benchmark for pricing and quality in voice communications. While next-generation networks based on VoIP technology introduce economic benefits and new feature capabilities for conferencing service providers (CSPs), they also present new technical challenges in maintaining acceptable voice quality. Delivering good voice quality is an important requirement in any VoIP conferencing system, as poor voice quality increases customer churn and its associated costs while reducing prospects for revenue growth.
Voice Quality Enhancement (VQE) encompasses an integrated set of features designed to overcome common audio quality problems in VoIP conferencing services, including noise, packet loss and echo. A comprehensive VQE solution also measures VoIP quality metrics, which are used in ongoing voice quality measurement associated with service level agreements.
Many features inherent in a VQE solution require sophisticated digital signal processing algorithms. The rapid, scalable execution of these algorithms dictates a product specifically designed for real-time IP packet processing. Fortunately, in a next-generation VoIP audio conferencing architecture, a network element already exists with carrier-class real-time IP packet processing power: the IP media server.
The three most common sources of VoIP audio quality problems in a VoIP or IMS network are noise, packet loss and echo. This section discusses each of these VoIP audio quality challenges and describes the conceptual solutions to overcome quality problems.
Gone are the days when people were confined to quiet office and residential environments. Today, with mobile phones and the Internet, people are calling from their cars, from airports and from just about anywhere, and these environments flood the mouthpiece with all kinds of unwanted sounds that ultimately get onto the call. Making matters worse, callers using laptops and mobile phones are typically saddled with marginal equipment such as low-cost earphones and microphones.
This section describes a combination of mechanisms that reduce and help manage the disturbing effect of audio noise: noise gating, noisy line detection and noise reduction.
Noise gating is a simple yet effective mechanism to reduce background noise.
When no speech is detected on a line, its signal is attenuated (ie its amplification is decreased), which prevents unnecessary noise from being inserted into a VoIP recording or conference mix. Noise attenuation is configurable, so the conferencing application can avoid making the signal unnaturally quiet when the noise gate is applied to an audio signal.
Key benefits of noise gating:
- Reduces background noise using a simple yet effective mechanism
- Supports configurable attenuation
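As an illustration only, the noise gate described above might be sketched as follows. The frame-based processing, the RMS speech test, the function names and the default threshold and attenuation values are all assumptions made for this example; a production media server would use a more robust voice activity detector.

```python
import math

def frame_rms(frame):
    """Root-mean-square level of one frame of samples (0.0 for an empty frame)."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))

def noise_gate(frames, threshold=0.02, attenuation_db=-20.0):
    """Attenuate frames whose level falls below a (hypothetical) speech threshold.

    attenuation_db is configurable so the gated signal need not become
    unnaturally quiet; -20 dB is an arbitrary illustrative default.
    """
    gain = 10 ** (attenuation_db / 20.0)  # convert dB to a linear gain
    gated = []
    for frame in frames:
        if frame_rms(frame) < threshold:
            # No speech detected: reduce the level rather than mute completely.
            gated.append([s * gain for s in frame])
        else:
            # Speech detected: pass the frame through unchanged.
            gated.append(list(frame))
    return gated
```

Choosing a milder attenuation (for example -10 dB) keeps some background ambience in the mix, which is the trade-off the configurable setting is meant to expose.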
Noisy line detection
There are times on a conference call when some lines are very noisy and disrupt the productivity of the entire call. Noisy line detection measures the noise on audio ports and sends a noisy line notification message to the VoIP application server if a predefined threshold is exceeded, as shown in Figure 2. A second message is sent if the noise subsequently falls below the threshold.
Key benefits of noisy line detection:
- Notifies the application server of noisy line conditions, initiating possible corrective action
- Enables quick remedial action by the application server or the operator (eg mute line)
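The threshold behaviour described above can be sketched roughly as below. The event names, the per-interval noise measurements in dB and the default threshold are hypothetical and chosen for illustration; they do not represent any particular media server's notification interface.

```python
def noisy_line_events(noise_levels_db, threshold_db=-40.0):
    """Return (index, event) pairs each time the noise level crosses the threshold.

    noise_levels_db: successive noise measurements for one audio port.
    A "noisy_line" event is raised when the threshold is exceeded, and a
    second "noise_cleared" event when the noise falls back below it.
    """
    events = []
    noisy = False
    for i, level in enumerate(noise_levels_db):
        if not noisy and level > threshold_db:
            noisy = True
            events.append((i, "noisy_line"))
        elif noisy and level < threshold_db:
            noisy = False
            events.append((i, "noise_cleared"))
    return events
```

An application server receiving the first event could, for instance, mute the line or alert the operator, and undo that action when the second event arrives.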
While the noise gating function described earlier provides a relatively simple way to reduce noise when no speech is detected, noise reduction goes a step further by using digital signal processing techniques to remove the noise while leaving the important speech signal intact. This provides benefits in many VoIP applications, such as removing noise from VoIP audio recordings or from noisy caller lines in a conference mix.
Key benefits of noise reduction:
- Filters out noise without impacting the speaker's signal
- Reduces noise continuously, whether speech is detected or not
The Internet is an amazing network of interconnected computers, but it's not perfect. The network employs the Internet Protocol (IP), which does not guarantee packet delivery. Hence, when IP networks get busy or congested, packets can get lost or delayed. While lost packets are not critical for many data applications, packet loss in real-time VoIP services can cause significant audio quality problems. Without special technology to compensate for dropped packets, the result is an abnormal audio signal that might sound ‘choppy.’
Packet loss concealment
Packet loss concealment is a technique for replacing audio from lost or unacceptably delayed packets with a prediction based on previously received audio.
Whereas any voice repair technology would have difficulty recovering from extreme packet loss in abnormal conditions, packet loss concealment is designed to perform intelligent restoration of lost or delayed packets for a large majority of congested network scenarios.
Key benefits of packet loss concealment:
- Softens any breaks in the voice signal
- Reduces the occurrences of choppy audio
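One of the simplest forms of concealment is to repeat the last good frame at a progressively lower level. The sketch below shows that idea only; it is not the prediction method of any specific product, and the function name, the use of None to mark a lost frame and the fade factor are assumptions for this example.

```python
def conceal_losses(frames, frame_len, fade=0.5):
    """Fill lost frames (marked None) with a faded copy of the last good frame.

    frames: list of sample lists, with None where a packet was lost or
            arrived too late to be played.
    frame_len: number of samples per frame, used when there is no
               earlier audio to predict from.
    fade: gain applied per consecutive lost frame so that long gaps
          decay towards silence instead of buzzing.
    """
    output = []
    last_good = None
    gain = 1.0
    for frame in frames:
        if frame is not None:
            last_good = list(frame)
            gain = 1.0
            output.append(list(frame))
        elif last_good is None:
            # Nothing received yet, so the only safe prediction is silence.
            output.append([0.0] * frame_len)
        else:
            gain *= fade
            output.append([s * gain for s in last_good])
    return output
```

More capable concealment schemes extrapolate the waveform using pitch information, but the fade-out above already softens the break that a dropped frame would otherwise cause.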
An acoustic echo is created when sound emanating from the receiver's speaker (eg handset or speakerphone) is transmitted back by the receiver's microphone. This is depicted in Figure 3, where the Sender (on the left) transmits a signal to the Receiver, and an acoustic echo is created when some speech energy ‘bounces back.’ In a VoIP conferencing application, all participants will hear an echo except for the guilty party with the device causing the echo. Since nobody can quickly answer the basic question, "Who's causing the echo?" troubleshooting echo issues in a VoIP conference call can be difficult and frustrating.
Acoustic echo cancellation
Acoustic echo cancellation (AEC) technology is designed to detect and remove the sender's transmit (Tx) audio signal that bounces back through the receive (Rx) path. By removing the echo from the signal, overall speech intelligibility and voice quality are improved.
AEC in a VoIP network is particularly challenging. In a traditional voice network, once a voice circuit is established through the PSTN, the round-trip echo delay is constant. In a VoIP network, however, packet delay varies, so the echo delay also varies over the duration of the call. This makes the echo cancellation algorithms in any VoIP quality improvement product more complex and processor-intensive than an equivalent echo cancellation solution in a circuit-switched network.
Key benefits of acoustic echo cancellation:
- Removes a sender's audio echo from the receive path
- Addresses variable packet delay inherent in IP networks
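To make the principle concrete, the sketch below uses a normalised least-mean-squares (NLMS) adaptive filter, a common textbook approach to echo cancellation. It is offered as an assumption-laden illustration rather than the algorithm any product uses: the echo path here is a fixed, short delay, whereas a real VoIP canceller must also track a delay that changes during the call. All function names and parameter values are hypothetical.

```python
def simulate_echo(far_end, delay=3, gain=0.6):
    """Hypothetical echo path: a delayed, attenuated copy of the sender's signal."""
    return [0.0] * delay + [gain * s for s in far_end[:len(far_end) - delay]]

def nlms_echo_cancel(far_end, mic, taps=8, mu=0.5, eps=1e-6):
    """Subtract an adaptive estimate of the echo from the returning signal.

    far_end: samples the sender transmitted (the reference signal).
    mic:     samples coming back, containing the echo of far_end.
    Returns the residual signal after the estimated echo is removed.
    """
    weights = [0.0] * taps   # current estimate of the echo path
    history = [0.0] * taps   # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        error = d - estimate  # what remains once the estimated echo is removed
        norm = sum(h * h for h in history) + eps
        # NLMS update: step towards weights that would have cancelled this sample.
        weights = [w + mu * error * h / norm for w, h in zip(weights, history)]
        residual.append(error)
    return residual
```

The filter length (taps) must cover the longest echo delay expected, which is one reason variable delay in IP networks makes the problem more processor-intensive.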
Voice quality metrics
Technology to remove audio quality impairments in a VoIP network is an important part of any solution. But along with the functions to improve VoIP quality, service providers also need a standard, objective way to measure voice quality in order to accurately monitor performance levels and uphold service level agreements (SLAs) with customers.
Voice quality metrics can be divided into three groups: packet, audio and acoustic echo cancellation (AEC). All statistics are captured for each call leg of a conference call to help with granular troubleshooting of audio quality problems and performance measurement. Packet statistics measure performance with respect to packet throughput, loss and delay, while audio statistics measure speech and noise power levels. AEC statistics measure echo delay and echo cancellation performance.
Key benefits of voice quality metrics:
- Provides objective measurements for administering service level agreements (SLAs)
- Facilitates the troubleshooting of audio quality issues in the network
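As a rough sketch of the packet group of statistics, the example below derives a loss fraction and a delay-variation figure for one call leg. The input format and field names are assumptions for illustration, timestamps are taken to be in seconds, and sequence-number wraparound is ignored. The jitter calculation follows the smoothed interarrival estimator defined for RTP in RFC 3550.

```python
def packet_stats(packets):
    """Summarise loss and jitter for one call leg.

    packets: list of (sequence_number, send_time_s, arrival_time_s) tuples
             for the packets actually received, in arrival order.
    """
    if not packets:
        return {"expected": 0, "received": 0, "loss_fraction": 0.0, "jitter_s": 0.0}
    seqs = [p[0] for p in packets]
    expected = max(seqs) - min(seqs) + 1   # assumes no sequence wraparound
    received = len(set(seqs))              # duplicates are counted once
    loss_fraction = (expected - received) / expected

    jitter = 0.0
    prev_transit = None
    for _, sent, arrived in packets:
        transit = arrived - sent
        if prev_transit is not None:
            # RFC 3550 estimator: J = J + (|D| - J) / 16
            jitter += (abs(transit - prev_transit) - jitter) / 16.0
        prev_transit = transit
    return {
        "expected": expected,
        "received": received,
        "loss_fraction": loss_fraction,
        "jitter_s": jitter,
    }
```

Collected per call leg, figures like these give an objective basis for comparing measured performance against the thresholds written into an SLA.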
Voice quality enhancement
Voice quality enhancement (VQE) encompasses an integrated set of features designed to improve VoIP quality and generate statistics needed for ongoing performance monitoring. This requires sophisticated digital signal processing algorithms that perform rapid real-time IP packet processing, a key component in next-generation VoIP audio conferencing architecture. As such, VQE can be deployed in an existing IP media server, which provides the requisite carrier-class real-time IP packet processing power.
IP media servers, also known as the Multimedia Resource Function (MRF) in an IMS architecture, are specifically designed to deliver real-time IP media processing as a common, shared resource for a broad range of VoIP and IMS applications in a next-generation network.
They also deliver real-time processing of codec algorithms, transcoding of codecs and sophisticated audio mixing for conferencing applications. Since media server and VQE tasks are interrelated and require the rapid execution of IP packet processing algorithms, it makes sense to integrate the functions of both into a single network element.
Ray Adensamer is Senior Product Marketing Manager, RadiSys