Wednesday, 7 September 2011

Enhanced Voice Service (EVS) Codec for LTE Rel-10

It's been a while since we talked about codecs.

The traditional narrowband AMR (Adaptive Multi-Rate) codec operates on 200–3400 Hz signals at variable bit rates from 4.75 to 12.2 kbps. It provides toll-quality speech starting at 7.4 kbps, with near-toll quality and better robustness at lower rates and better reproduction of non-speech sounds at higher rates. The AMR-WB (Wideband) codec provides improved speech quality thanks to a wider speech bandwidth of 50–7000 Hz, compared with narrowband speech coders, which in general are optimized for POTS wireline quality of 300–3400 Hz. A couple of years back, Orange was in the news as the first to launch phones that support HD Voice (AMR-WB).
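To put the bit rates above in perspective, here is a minimal Python sketch that converts each narrowband AMR mode into the number of speech bits carried per frame. The full list of eight modes and the 20 ms frame length are not stated in this post; they are taken from the AMR specification as I understand it, so treat them as assumptions. The function name is just for illustration.

```python
# AMR narrowband modes in kbps (assumed mode set; the post only gives the range).
AMR_NB_MODES_KBPS = [4.75, 5.15, 5.90, 6.70, 7.40, 7.95, 10.2, 12.2]

# AMR codes speech in 20 ms frames (assumption, not stated in the post).
FRAME_MS = 20


def bits_per_frame(rate_kbps, frame_ms=FRAME_MS):
    """Return the number of speech bits in one frame at the given bit rate.

    kbps * ms = bits, because (1000 bits/s) * (0.001 s) = 1 bit.
    round() absorbs floating-point noise such as 5.15 * 20.
    """
    return round(rate_kbps * frame_ms)


if __name__ == "__main__":
    for rate in AMR_NB_MODES_KBPS:
        print(f"{rate:5.2f} kbps -> {bits_per_frame(rate)} bits per {FRAME_MS} ms frame")
```

At the lowest mode the codec has fewer than 100 bits to describe each 20 ms of speech, which gives a feel for why quality drops below the 7.4 kbps mode.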

Extended Adaptive Multi-Rate – Wideband (AMR-WB+) is an audio codec that extends AMR-WB. It adds support for stereo signals and higher sampling rates. Another major improvement is the use of transform coding (transform coded excitation, TCX) in addition to ACELP, which greatly improves generic audio coding. Automatic switching between transform coding and ACELP provides both good speech and good audio quality at moderate bit rates.

Whereas AMR-WB operates at an internal sampling rate of 12.8 kHz, AMR-WB+ supports internal sampling frequencies ranging from 12.8 kHz to 38.4 kHz. AMR-WB uses a 16 kHz sampling frequency with a resolution of 14 bits, left justified in a 16-bit word. AMR-WB+ uses 16/24/32/48 kHz sampling frequencies with a resolution of 16 bits in a 16-bit word.
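"14 bits left justified in a 16-bit word" means the sample occupies the upper 14 bits of the word and the two least significant bits are zero. A small Python sketch of that packing follows; the function names are hypothetical and this is an illustration of the bit layout, not code from any codec implementation.

```python
def left_justify_14bit(sample):
    """Place a signed 14-bit sample in the upper 14 bits of a 16-bit word.

    The result is returned as an unsigned 16-bit value (0..0xFFFF) whose
    two lowest bits are always zero.
    """
    if not -8192 <= sample <= 8191:
        raise ValueError("sample does not fit in 14 signed bits")
    return (sample << 2) & 0xFFFF


def right_justify_14bit(word):
    """Recover the signed 14-bit sample from a left-justified 16-bit word."""
    if word & 0x8000:          # sign bit set: reinterpret as negative 16-bit value
        word -= 0x10000
    return word >> 2           # arithmetic shift keeps the sign


if __name__ == "__main__":
    for s in (1, -1, 8191, -8192):
        w = left_justify_14bit(s)
        print(f"{s:6d} -> 0x{w:04X} -> {right_justify_14bit(w)}")
```

A practical consequence is that a 14-bit left-justified sample can be fed to a 16-bit processing chain at the correct amplitude without rescaling.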

The introduction of LTE (Long Term Evolution) brings enhanced quality for 3GPP multimedia services. The high throughput and low latency of LTE enable higher-quality media coding than is possible in UMTS. LTE-specific codecs have not yet been defined, but work on them is ongoing in 3GPP. The LTE codecs are expected to improve the basic signal quality and also to offer new capabilities such as extended audio bandwidth, stereo and multi-channel audio for voice, and higher temporal and spatial resolutions for video. Given the wide range of functionality in media coding, LTE gives more flexibility in service provision to cope with heterogeneous terminal capabilities and transmission over heterogeneous network conditions. By adjusting the bit rate, the computational complexity, and the spatial and temporal resolution of audio and video, transport and rendering can be optimised throughout the media path, giving the best possible quality of service.

A feasibility study on Enhanced Voice Service (EVS) for LTE has recently been finalised in 3GPP, with the results given in Technical Report 22.813, “Study of Use Cases and Requirements for Enhanced Voice Codecs in the Evolved Packet System (EPS)”. EVS is intended to provide substantially enhanced voice quality for conversational use, i.e. telephony. Improved transmission efficiency and optimised behaviour in IP environments are further targets. EVS also has potential for quality enhancement for non-voice signals such as music. The EVS study, conducted jointly by the 3GPP SA4 (Codec) and SA1 (Services) working groups, identifies recommendations for key characteristics of EVS (system and service requirements, and high-level technical requirements on codecs).

The study further proposes that development and standardization of a new EVS codec for LTE be started. The codec is targeted for completion by March 2011, in time for 3GPP Release 10.

The figure above illustrates the concept of EVS. The EVS codec will not replace the existing 3GPP narrowband and wideband codecs, AMR and AMR-WB, but will provide a complementary high-quality codec by introducing higher audio bandwidths, in particular super wideband (SWB: 50–14,000 Hz). It will also support narrowband (NB: 200–3400 Hz) and wideband (WB: 50–7000 Hz) and may support fullband audio (FB: 20–20,000 Hz).
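Each of these audio bandwidths implies a minimum sampling rate: by the Nyquist criterion, the sampling rate must be at least twice the upper band edge. The sketch below works that out for the four bands listed above and picks the smallest rate from a list of conventional rates (8, 16, 32 and 48 kHz). That list is my assumption about commonly used speech and audio rates, not something stated in this post, and the function names are hypothetical.

```python
# Audio bands from the text above: name -> (lower edge, upper edge) in Hz.
BANDS_HZ = {
    "NB": (200, 3400),
    "WB": (50, 7000),
    "SWB": (50, 14000),
    "FB": (20, 20000),
}

# Conventional sampling rates in Hz (assumed for illustration).
COMMON_RATES_HZ = [8000, 16000, 32000, 48000]


def nyquist_rate(upper_hz):
    """Minimum sampling rate needed to represent content up to upper_hz."""
    return 2 * upper_hz


def smallest_common_rate(upper_hz, rates=COMMON_RATES_HZ):
    """Return the lowest conventional rate that satisfies the Nyquist criterion."""
    for rate in sorted(rates):
        if rate >= nyquist_rate(upper_hz):
            return rate
    raise ValueError("no listed rate is high enough")


if __name__ == "__main__":
    for name, (low, high) in BANDS_HZ.items():
        print(f"{name:>3}: {low}-{high} Hz, Nyquist {nyquist_rate(high)} Hz, "
              f"use {smallest_common_rate(high)} Hz")
```

Under these assumptions each step up in bandwidth (NB, WB, SWB, FB) lands on the next conventional sampling rate.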

More details are available in the following whitepapers by Nokia [PDF]:

1 comment:

Anonymous said...

I am having a hard time understanding the bandwidth switching algorithms in the EVS codec (described in TS 26.445) for the different modes: TBE, FD BWE and MDCT. Can anyone explain these to me? Thanks in advance!