
Thursday, 30 October 2014

Codecs and Quality across VoLTE and OTT Networks

Codecs play an important role in our smartphones. Not only are they essential for encoding and decoding voice packets, they also add to the price of our smartphones.

A $400 smartphone can carry as much as $120 in IPR fees. If you look at the picture above, $10.60 of that is for the H.264 codec alone. So it's important that the new codecs arriving with the next generation of mobile technology are free, open source or cost very little.


The new standards require a lot of codecs, some for backward compatibility, but this can significantly increase costs. It's important to make sure the newly selected codecs are royalty-free or freely licensed.

The focus of this post is a presentation by Amir Zmora of AudioCodes at the LTE Voice Summit. The presentation below may not be self-explanatory, but I have added a couple of links at the bottom of the post where he has shared his thoughts. It's worth a read.



A good explanation of the voice enhancement tools follows (slide 15):

Adaptive Jitter Buffer (AJB) – Almost all devices today (smartphones, IP phones, gateways, etc.) have built-in jitter buffers. Legacy networks (which were LAN-focused when designed) usually have older devices with less sophisticated jitter buffers. When designed, they didn't take into account traffic coming in from networks such as Wi-Fi, with its frequent retransmissions, and 3G, with its limited bandwidth, in which jitter levels are higher than those in wireline networks. Jitter buffers that may have been planned for, say, dozens of msec may now have to deal with peaks of hundreds of msec. Generally, if the SBC has nothing to mediate (assume the codecs are the same and the Ptime is the same on both ends) it just forwards the packets. But the unexpected jitter coming from the wireless network, as described above, requires the AJB to take action. And even if the network is well designed to handle jitter, today's OTT applications on smartphones add yet another variable to the equation. There are hundreds of such devices out there, and the audio interfaces of these devices (especially those of the Android phones) create jitter that is passed into the network. For these situations, too, the AJB is necessary.

To overcome this issue, there is a need for a highly advanced Adaptive Jitter Buffer (AJB) built into the SBC that neutralizes the incoming jitter so that it is handled without problem on the other side. The AJB can handle high and variable jitter rates.

Additionally, the AJB needs to work in what are called tandem scenarios, where the incoming and outgoing codecs are the same. This scenario requires an efficient solution that minimizes the added delay. AudioCodes has built and patented solutions supporting this scenario.
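
The core idea behind an adaptive jitter buffer is simple: measure how erratically packets arrive and size the playout delay accordingly. The toy sketch below illustrates only that feedback loop; it is not AudioCodes' patented implementation, and real AJBs also do time-scale modification and packet-loss concealment. The window size and safety factor are illustrative assumptions.

```python
import statistics

class AdaptiveJitterBuffer:
    """Toy adaptive jitter buffer: playout delay tracks observed jitter."""

    def __init__(self, base_delay_ms=20, safety_factor=2.0):
        self.base_delay_ms = base_delay_ms
        self.safety_factor = safety_factor
        self.transit_deltas = []  # recent |actual - expected| arrival offsets (ms)

    def observe(self, expected_arrival_ms, actual_arrival_ms):
        """Record how far off schedule each packet arrives."""
        self.transit_deltas.append(abs(actual_arrival_ms - expected_arrival_ms))
        if len(self.transit_deltas) > 100:  # keep a sliding window
            self.transit_deltas.pop(0)

    def playout_delay_ms(self):
        """Grow the buffer under high jitter, shrink it when the network calms."""
        if not self.transit_deltas:
            return self.base_delay_ms
        jitter = statistics.mean(self.transit_deltas)
        return self.base_delay_ms + self.safety_factor * jitter
```

Feeding this with Wi-Fi-style bursty arrivals makes the delay grow from the 20 msec baseline toward the hundreds of msec peaks described above, then fall back once arrivals stabilise.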

Transcoding – While the description above discussed the ability to bypass the need to perform transcoding in the Adaptive Jitter Buffer context, there may very well be a need for transcoding between the incoming and outgoing packet streams. Beyond being able to mediate between different codecs on the different networks on either end of the SBC, the SBC can transcode an incoming codec that is less resilient to packet loss (such as narrowband G.729 or wideband G.722) to a more resilient codec (such as Opus). By transcoding to a more resilient codec, the SBC can lower the effects of packet loss. Transcoding can also lower the bandwidth on the network. Additionally, the SBC can transcode from narrowband (8Khz) to wideband (16Khz) (and vice versa) as well as wideband transcoding, where both endpoints support wideband codecs but are not using the same ones. For example, a wireless network may be using the AMR wideband codec while the wireline network on the other side may be using Opus. Had it not been for the SBC, these two networks would have negotiated a common narrowband codec.
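
The mediation logic described above can be sketched as a small decision function: forward when both legs share a codec, and transcode between wideband codecs rather than let two wideband-capable endpoints fall back to narrowband. The codec table, resilience scores and function name are all my own illustrative assumptions, not an AudioCodes API.

```python
# Hypothetical codec table: name -> (audio bandwidth, loss-resilience score).
CODECS = {
    "G.729":  ("narrowband", 1),
    "G.722":  ("wideband",   1),
    "AMR-WB": ("wideband",   2),
    "Opus":   ("wideband",   3),  # built-in FEC makes it more loss-resilient
}

def plan_media_path(offer_a, offer_b):
    """Decide whether the SBC can forward RTP as-is or must transcode."""
    common = [c for c in offer_a if c in offer_b]
    wb_common = [c for c in common if CODECS[c][0] == "wideband"]
    if wb_common:
        return ("forward", wb_common[0])
    wb_a = [c for c in offer_a if CODECS[c][0] == "wideband"]
    wb_b = [c for c in offer_b if CODECS[c][0] == "wideband"]
    if wb_a and wb_b:
        # Both legs support wideband but not the same codec: transcode
        # (e.g. AMR-WB <-> Opus) instead of negotiating a narrowband fallback.
        return ("transcode", (wb_a[0], wb_b[0]))
    if common:
        return ("forward", common[0])
    return ("transcode", (offer_a[0], offer_b[0]))
```

With `["AMR-WB", "G.729"]` on one side and `["Opus", "G.729"]` on the other, this chooses AMR-WB-to-Opus transcoding, exactly the wireless/wireline example in the text; a real SBC would also weigh Ptime, DTMF handling and DSP load.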

Flexible RTP Redundancy – The SBC can also use RTP redundancy, in which voice packets are sent several times to ensure they are received. Redundancy is used to balance networks characterized by high packet-loss bursts. While reducing the effect of packet loss, redundancy increases the bandwidth (and delay). There are ways to get around this bandwidth issue that are supported by the SBC. One way is by sending only partial packet information (not fully redundant packets). The decoder on the receiving side will know how to handle the partial information. This process is called Forward Error Correction (FEC).
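
The full-redundancy case works roughly like the RFC 2198 "RED" payload format: each packet carries the current frame plus copies of earlier ones, so a lost packet's frame can be recovered from a neighbour. This is a simplified sketch under the assumption of fixed-size frames; real RED payloads carry per-block headers with timestamp offsets and lengths, and a FEC variant would send parity data instead of verbatim copies.

```python
def build_redundant_payload(history, current):
    """Bundle the current frame with copies of the previous frames."""
    return b"".join(list(history) + [current])

def recover(received_payloads, frame_size):
    """Rebuild the frame sequence even when some packets are lost.

    received_payloads: list of (sequence_number, payload) that arrived.
    The last frame-sized block in a payload is that packet's own frame;
    earlier blocks are redundant copies of the preceding frames.
    """
    frames = {}
    for seq, payload in received_payloads:
        blocks = [payload[i:i + frame_size]
                  for i in range(0, len(payload), frame_size)]
        for offset, block in enumerate(reversed(blocks)):
            frames.setdefault(seq - offset, block)
    return frames
```

If packet 1 is lost but packet 2 arrives carrying frame 1 as its redundant block, frame 1 is still recovered, at the cost of roughly doubling the payload bandwidth, which is exactly the trade-off FEC's partial information is designed to soften.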

Transrating – Transrating is the process of having more voice payload 'packed' into a single RTP packet by increasing the packet intervals, thus changing the Packetization Time, or Ptime. Ptime is the time represented by the compression of the voice signals into packets, generally at 20 msec intervals. By combining the payloads of two or more packets into one, the transrating process reduces the overhead of the IP headers, lowering the bandwidth and reducing the stress on CPU resources; however, it increases delay. It can thus be used not only to mediate between two end devices using different Ptimes, but also as a means of balancing the network by reducing bandwidth and CPU pressure during traffic peaks.
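
The bandwidth saving from transrating is easy to quantify: the codec payload rate stays fixed while the per-packet IP/UDP/RTP header cost is paid fewer times per second. A quick calculation, assuming IPv4 (20 bytes) + UDP (8) + RTP (12) headers and no link-layer overhead:

```python
def voip_bandwidth_kbps(codec_kbps, ptime_ms, header_bytes=40):
    """Total IP bandwidth for one stream: payload plus IP/UDP/RTP overhead.

    header_bytes=40 assumes IPv4 (20) + UDP (8) + RTP (12), no extensions.
    """
    packets_per_sec = 1000 / ptime_ms
    overhead_kbps = packets_per_sec * header_bytes * 8 / 1000
    return codec_kbps + overhead_kbps

# G.729 at 8 kbps: doubling Ptime from 20 ms to 40 ms halves the header cost.
print(voip_bandwidth_kbps(8, 20))  # 24.0
print(voip_bandwidth_kbps(8, 40))  # 16.0
```

For a low-rate codec like G.729, headers dominate the stream, so going from 20 msec to 40 msec Ptime cuts total bandwidth by a third, which is why transrating is attractive during traffic peaks despite the extra 20 msec of delay.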

Quality-based Routing – Another tool used by the SBC is Quality-based routing. The SBC, which is monitoring all the calls on the network all the time, can decide (based on pre-defined thresholds and parameters) to reroute calls over different links that have better quality.
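
The rerouting decision can be sketched as a threshold check over the SBC's per-link quality estimates. MOS (mean opinion score) as the metric, the 3.5 threshold and the tie-breaking rule are all illustrative assumptions, not values from the presentation.

```python
def pick_route(routes, mos_threshold=3.5):
    """Choose an outbound link, avoiding links whose quality has degraded.

    routes: dict mapping link name -> current mean MOS estimate, derived
    from the SBC's ongoing monitoring of calls on each link.
    """
    healthy = {name: mos for name, mos in routes.items()
               if mos >= mos_threshold}
    if not healthy:
        return max(routes, key=routes.get)  # least-bad fallback
    return max(healthy, key=healthy.get)
```

A call that would have gone over a link whose MOS has slipped below the threshold is simply placed on the best-scoring healthy link instead.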

Further reading:


Monday, 22 April 2013

eMBMS rollouts gathering steam in 2013

It's been a while since I last posted about eMBMS, and even longer since we saw anything official from 3GPP on it. Recently I have seen some operators again starting to wonder whether eMBMS makes business sense, while the vendors and standards bodies are still working hard on the technology.

Not so long ago, the HEVC/H.265 codec was standardised. This codec allows video to be transmitted using half the bandwidth, which would make it far more economical for broadcast technologies. No wonder Nokia, Thomson and NTT Docomo are excited.

An interesting picture from a Qualcomm presentation (embedded at the end) shows how the different protocols fit into the eMBMS architecture. My guess would be that HEVC may be part of the Codecs layer.



On the operator front, Korea Telecom (KT) intends a countrywide rollout. Korea is one of the very few countries where end users have embraced watching video on small form factors. Verizon Wireless has already signalled its intention to roll out eMBMS in 2014 and is working out a business case. Telenor Sweden is another player joining the band, with the intention of adopting Ericsson's multi-screen technology.

One of the main reasons for the lack of support for the 3G MBMS technology was the absence of a compelling business case. Qualcomm has a whitepaper that outlines some of the potential of LTE Broadcast technology here. A picture from this whitepaper on the business case is below:

Finally, a presentation from Qualcomm research on eMBMS embedded below:



Tuesday, 18 October 2011

HD Voice - Next step in the evolution of voice communication

Nearly two years back I blogged about Orange launching HD Voice via the AMR-WB (wideband) codec. HD voice is now a fully developed and standardized technology and has so far been deployed on 32 networks in almost as many countries.

People who have experienced HD voice say it feels like they are talking to a person in the same room. Operators derive 70 percent of their revenue from voice and voice-related services, and studies show that subscribers appreciate the personal nature of voice communication, saying it offers a familiar and emotional connection to another person.

HD voice is also a reaction to the competition faced by the operators from OTT players like Skype.

Below is an embed from the recent whitepaper by Ericsson:



For more information also see:



Wednesday, 7 September 2011

Enhanced Voice Service (EVS) Codec for LTE Rel-10

It's been a while since we talked about codecs.

The traditional (narrowband) AMR (Adaptive Multi-Rate) codec operates on narrowband 200-3400 Hz signals at variable bit rates in the range of 4.75 to 12.2 kbps. It provides toll quality speech starting at 7.4 kbps, with near-toll quality and better robustness at lower rates and better reproduction of non-speech sounds at higher rates. The AMR-WB (Wideband) codec provides improved speech quality due to a wider speech bandwidth of 50–7000 Hz, compared with narrowband speech coders, which in general are optimized for POTS wireline quality of 300–3400 Hz. A couple of years ago Orange was in the news as the first operator to launch phones supporting HD Voice (AMR-WB).
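
The "adaptive" in Adaptive Multi-Rate refers to switching among a fixed set of modes as channel conditions change. The mode lists below are the standard AMR-NB and AMR-WB sets (per 3GPP TS 26.071 and TS 26.171 respectively); the selection helper is my own illustrative sketch of the adaptation idea, not 3GPP's link adaptation algorithm.

```python
# Standard codec mode sets, in kbps.
AMR_NB_MODES = [4.75, 5.15, 5.90, 6.70, 7.40, 7.95, 10.2, 12.2]
AMR_WB_MODES = [6.60, 8.85, 12.65, 14.25, 15.85, 18.25, 19.85, 23.05, 23.85]

def best_mode(modes, available_kbps):
    """Pick the highest codec mode that fits the available bit rate,
    falling back to the lowest mode when even that doesn't fit."""
    usable = [m for m in modes if m <= available_kbps]
    return max(usable) if usable else min(modes)
```

With 8 kbps available, AMR-NB would run at 7.95 kbps, just above the 7.4 kbps toll-quality point mentioned above; as the channel degrades, the mode steps down and robustness improves.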

Extended Adaptive Multi-Rate – Wideband (AMR-WB+) is an audio codec that extends AMR-WB. It adds support for stereo signals and higher sampling rates. Another main improvement is the use of transform coding (transform coded excitation - TCX) additionally to ACELP. This greatly improves the generic audio coding. Automatic switching between transform coding and ACELP provides both good speech and audio quality with moderate bit rates.

As AMR-WB operates at an internal sampling rate of 12.8 kHz, AMR-WB+ also supports various internal sampling frequencies, ranging from 12.8 kHz to 38.4 kHz. AMR-WB uses a 16 kHz sampling frequency with a resolution of 14 bits, left-justified in a 16-bit word. AMR-WB+ uses 16/24/32/48 kHz sampling frequencies with a resolution of 16 bits in a 16-bit word.


Introduction of LTE (Long Term Evolution) brings enhanced quality for 3GPP multimedia services. The high throughput and low latency of LTE enable higher quality media coding than what is possible in UMTS. LTE-specific codecs have not yet been defined but work on them is ongoing in 3GPP. The LTE codecs are expected to improve the basic signal quality, but also to offer new capabilities such as extended audio bandwidth, stereo and multi-channels for voice and higher temporal and spatial resolutions for video. Due to the wide range of functionalities in media coding, LTE gives more flexibility for service provision to cope with heterogeneous terminal capabilities and transmission over heterogeneous network conditions. By adjusting the bit-rate, the computational complexity, and the spatial and temporal resolution of audio and video, transport and rendering can be optimised throughout the media path hence guaranteeing the best possible quality of service.

A feasibility study on Enhanced Voice Service (EVS) for LTE has recently been finalised in 3GPP with the results given in Technical Report 22.813 ‘‘Study of Use Cases and Requirements for Enhanced Voice Codecs in the Evolved Packet System (EPS)”. EVS is intended to provide substantially enhanced voice quality for conversational use, i.e. telephony. Improved transmission efficiency and optimised behaviour in IP environments are further targets. EVS also has potential for quality enhancement for non-voice signals such as music. The EVS study, conducted jointly by 3GPP SA4 (Codec) and SA1 (Services) working groups, identifies recommendations for key characteristics of EVS (system and service requirements, and high level technical requirements on codecs).

The study further proposes that the development and standardization of a new EVS codec for LTE be started. The codec is targeted for completion by March 2011, in time for 3GPP Release 10.
The figure above illustrates the concept of EVS. The EVS codec will not replace the existing 3GPP narrowband and wideband codecs AMR and AMR-WB but will provide a complementary high quality codec via the introduction of higher audio bandwidths, in particular super wideband (SWB: 50–14,000 Hz). It will also support narrowband (NB: 200–3400 Hz) and wideband (WB: 50–7000 Hz) and may support fullband audio (FB: 20–20,000 Hz).
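
The bandwidth tiers quoted above can be captured in a small lookup, useful for reasoning about which band a given signal needs. The Hz ranges come straight from the text; the selection helper is an illustrative sketch.

```python
# Audio bandwidth tiers from the EVS study (Hz ranges as quoted above).
AUDIO_BANDS = {
    "NB":  (200, 3400),    # narrowband, supported
    "WB":  (50, 7000),     # wideband, supported
    "SWB": (50, 14000),    # super wideband, the main EVS addition
    "FB":  (20, 20000),    # fullband, optional ("may support")
}

def band_for_content(highest_hz):
    """Smallest band whose upper edge covers the content's top frequency.

    Relies on dict insertion order (narrowest tier first, Python 3.7+).
    """
    for name, (_, hi) in AUDIO_BANDS.items():
        if highest_hz <= hi:
            return name
    return "FB"
```

Speech energy up to 7 kHz fits in WB (what AMR-WB already delivers), while music content reaching 12–14 kHz is exactly where the new SWB tier earns its place.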

Complete detail available in the Nokia whitepaper embedded below:

Tuesday, 23 February 2010

Codecs for LTE

Some time back I mentioned Orange launching the AMR-WB codec, which would result in 'hi-fi quality' voice (even though it's being referred to as HD voice by some). Since then, there has not been much progress on this HD voice front.

CODEC stands for "COder-DECoder," but is also known as an enCOder-DECoder and a COmpression-DECompression system when used in video systems. Codecs are important because they compress voice/video packets so that less bandwidth is required for transmission. At the same time, it has to be borne in mind that the capacity to withstand errors decreases with a higher compression ratio, so it may be necessary to change codecs during a voice/video call. This calls for flexibility, as in the case of the AMR (Adaptive Multi-Rate) codecs.

The following is from Martin Sauter's book "Beyond 3G – Bringing Networks, Terminals and the Web Together":

Voice codecs on higher layers have been designed to cope with packet loss to a certain extent since there is not usually time to wait for a repetition of the data. This is why data from circuit-switched connections is not repeated when it is not received correctly but simply ignored. For IP sessions, doing the same is difficult, since a single session usually carries both real-time services such as voice calls and best-effort services such as Web browsing simultaneously. In UMTS evolution networks, mechanisms such as ‘Secondary PDP contexts’ can be used to separate the real-time data traffic from background or signaling traffic into different streams on the air interface while keeping a single IP address on the mobile device.

UMTS uses the same codecs as GSM. On the air interface users are separated by spreading codes and the resulting data rate is 30–60 kbit/s depending on the spreading factor. Unlike GSM, where timeslots are used for voice calls, voice capacity in UMTS depends less on the raw data rate but more on the amount of transmit power required for each voice call. Users close to the base station require less transmission power in downlink compared with more distant users. To calculate the number of voice calls per UMTS base station, an assumption has to be made about the distribution of users in the area covered by a cell and their reception conditions. In practice, a UMTS base station can carry 60–80 voice calls per sector. A typical three-sector UMTS base station can thus carry around 240 voice calls. As in the GSM example, a UMTS cell also carries data traffic, which reduces the number of simultaneous voice calls.
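
The per-site figure in the extract is simple arithmetic: the quoted 60–80 calls per sector times three sectors. The sketch below just makes that back-of-envelope explicit; as the text notes, real capacity is power-limited and shrinks as data traffic shares the cell, so treat the result as an upper bound.

```python
def umts_site_voice_capacity(calls_per_sector=(60, 80), sectors=3):
    """Back-of-envelope voice capacity of a multi-sector UMTS base station.

    Uses the 60-80 calls/sector range quoted in the extract; ignores the
    power budget, user distribution and data traffic that really set the limit.
    """
    lo, hi = calls_per_sector
    return lo * sectors, hi * sectors

print(umts_site_voice_capacity())  # (180, 240)
```

The upper end of the range reproduces the "around 240 voice calls" figure for a typical three-sector site.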

The following is an extract from 3G Americas white paper, "3GPP Mobile Broadband Innovation Path to 4G: Release 9, Release 10 and Beyond: HSPA+, SAE/LTE and LTE-Advanced,":

Real-time flows (voice/video) based on rate adaptive codecs can dynamically switch between different codec rates. Codec rate adaptation allows an operator to trade off voice/video quality on one side and network capacity (e.g. in terms of the number of accepted VoIP calls), and/or radio coverage on the other side. Operators have requested a standardized solution to control the codec rate adaptation for VoIP over LTE, and a solution has been agreed upon and specified in the 3GPP Rel-9 specifications, which is provided in this paper.

CODEC RATE ADAPTATION BASED ON ECN

Given previous discussion in 3GPP (3GPP S4-070314) it was clear that dropping IP packets was not an acceptable means for the network to trigger a codec rate reduction. Instead an explicit feedback mechanism had to be agreed on by which the network (e.g. the eNodeB) could trigger a codec rate reduction. The mechanism agreed on for 3GPP Rel-9 is the IP-based Explicit Congestion Notification (ECN) specified in an IETF RFC. ECN is a 2-bit field in the end-to-end IP header. It is used as a "congestion pre-warning scheme" by which the network can warn the end points of incipient congestion so that the sending endpoint can decrease its sending rate before the network is forced to drop packets or excessive delay of media occurs. Any ECN-based scheme requires two parts: network behavior and endpoint behavior. The first part had already been fully specified in an IETF RFC and merely had to be adopted into the corresponding specifications (3GPP TS 23.401 and 3GPP TS 36.300). The network behavior is completely service and codec agnostic. That is, it works for both IMS and non-IMS based services and for any voice/video codec with rate-adaptation capabilities. The main work in 3GPP focused on the second part: the endpoint behavior. For 3GPP Rel-9, the endpoint behavior has been specified for the Multimedia Telephony Service for IMS (MTSI - 3GPP TS 26.114). It is based on a generic (i.e. non-service specific) behavior for RTP/UDP based endpoints, which is being standardized in the IETF.

Furthermore, it was agreed that no explicit feedback was needed from the network to trigger a codec rate increase. Instead, the Rel-9 solution is based on probing from the endpoints – more precisely the Initial Codec Mode (ICM) scheme that had already been specified in 3GPP Rel-7 (3GPP S4-070314). After the SIP session has been established, the sending side always starts out with a low codec rate. After an initial measurement period and RTCP receiver reports indicating a “good channel,” the sending side will attempt to increase the codec rate. The same procedure is executed after a codec rate reduction.


Figure 6.8 depicts how codec rate reduction works in Rel-9:
  • Step 0. The SIP session is negotiated with the full set of codec rates and independent of network level congestion. The use of ECN has to be negotiated separately for each media stream (e.g. VoIP).
  • Steps 1 and 2. After ECN has been successfully negotiated for a media stream the sender must mark each IP packet as ECN-Capable Transport (ECT). Two different values, 10 and 01, have been defined in an IETF RFC to indicate ECT. However, for MTSI only 10 shall be used.
  • Step 3. To free up capacity and allow more VoIP calls and/or to improve VoIP coverage, the eNodeB sets the ECN field to Congestion Experienced (CE) in an IP packet that belongs to an IP flow marked as ECT. Note that the ECN-CE codepoint in an IP packet indicates congestion in the direction in which the IP packets are being sent.
  • Steps 4 and 5. In response to an ECN-CE the receiving MTSI client issues an RTCP message to trigger a codec rate reduction.
Note that ECN operates in both directions (uplink and downlink) entirely independently and without any interaction. It is quite possible to trigger codec rate adaptation in one direction without triggering it in the other.
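
Putting the reduction steps together with the ICM-style probing described earlier gives a small rate controller: start at a low mode, step up after a run of good RTCP reports, and step down when the receiver relays an ECN-CE-triggered reduction request. This is a behavioural sketch only; the specific rates (borrowed from AMR-WB modes) and the "three good reports" threshold are my assumptions, not values from TS 26.114.

```python
class IcmRateController:
    """Sketch of Rel-9 endpoint behaviour: start low, probe upward on good
    RTCP reports, step down on an ECN-CE-triggered reduction request."""

    def __init__(self, rates_kbps=(6.60, 12.65, 23.85), good_needed=3):
        self.rates = sorted(rates_kbps)
        self.index = 0              # ICM: always start at the lowest rate
        self.good_reports = 0
        self.good_needed = good_needed

    def on_rtcp_good(self):
        """An RTCP receiver report indicating a 'good channel'."""
        self.good_reports += 1
        if self.good_reports >= self.good_needed and self.index < len(self.rates) - 1:
            self.index += 1         # probe one step up
            self.good_reports = 0

    def on_rate_reduction_request(self):
        """Receiver saw ECN-CE and asked for a lower codec rate via RTCP."""
        self.index = max(0, self.index - 1)
        self.good_reports = 0       # re-probe upward later, as in Rel-7 ICM

    @property
    def rate(self):
        return self.rates[self.index]
```

After a reduction the controller repeats the same probing procedure, matching the text's note that the Rel-9 solution needs no explicit network signal for rate increases.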

ONGOING WORK IN 3GPP

A new work item, called "Enabling Encoder Selection and Rate Adaptation for UTRAN and E-UTRAN," has been created for 3GPP Rel-10. Part of this work item is to extend the scope of the codec rate adaptation solution agreed in Rel-9 to also apply to HSPA and to non-voice RTP-based media streams.

Further Reading:

Thursday, 3 December 2009

MBMS and AMR-WB


Nokia publicly underlined its commitment to broadcast-mobile-TV standard DVB-H with the recent unveiling of the mobile TV edition of the Nokia 5330 and its pretax, presubsidy price tag of €155 (US$230), after some in the industry had questioned its enthusiasm for launching new DVB-H devices. Nokia also quelled any suggestions that it might start supporting the MBMS standard with its future device launches.

The price is a massive drop from the €550 price tag carried by Nokia’s last fully DVB-H-compatible handset, the N96, which launched in 3Q08. So the official line from Nokia is this: “All is well on the good ship DVB-H.”

Read more here.

Meanwhile, in China, China Unicom has launched 3G telecom services in 268 cities across the country, said Li Gang, a deputy general manager of Unicom Group, noting that the WCDMA network supports a 14Mbps download data transmission speed and a 7.2Mbps upload data transmission speed.

Notably, the carrier has adopted the most advanced R6 technology in its core WCDMA network to smooth a WCDMA-to-EPS migration in the future, according to Mr. Zhang.

The China Unicom network is expected to support MBMS and HSPA+ 64QAM technology in the first phase of further evolution, add HSPA+ MIMO technology in the Phase II evolution, and move to LTE technology in the Phase III evolution, said Mr. Zhang, adding that the network will offer a 100Mbps download speed and a 50Mbps upload speed after the Phase III evolution.

Read more here.
Back in September, Orange Moldova announced the launch of the world's first mobile telephone service offering high-definition (HD) sound. The service will provide customers with a significantly improved quality of service when making calls. Unlike other mobile capabilities such as multimedia, this is the first time since the 1990s that mobile voice technology has undergone a significant evolution.

This is the second step in Orange’s HD voice strategy, following on from the launch of a high-definition voice service for VoIP calls in 2006. Over 500,000 Livephone devices have already been sold in France and the range will be extended to other Orange countries over the coming months.

The first mobile handset integrating high-definition voice capability that will be launched by Orange Moldova is the Nokia 6720c. This innovative handset integrates the new WB-AMR technology, which is widely expected within the industry to become a new standard for mobile voice communications.

Thanks to the Adaptive Multi-Rate WideBand (AMR-WB) codec, double the frequency spectrum is given over to voice telephony compared with traditional voice calling. Orange boasts that the result is "near hi-fi quality" and "FM-radio quality", which seems an odd comparison.

Monday, 22 June 2009

Vodafone's Lecture series on Mobile Telecom and Networks


This compilation brings together the transcripts of the third of a series of lectures on the subject of Mobile Telecommunications and Networks. The lecture series was established by The Royal Academy of Engineering and Vodafone to celebrate the enormous social and economic benefits that mobile communications have given us – a success story brought about and maintained by excellence in communications engineering. The three lectures in this series were given over the period between March 2008 and February 2009. All three were exceptionally well attended and generated enthusiastic and lively debate and discussions.

The series was opened by Professor P R Kumar, Franklin W Woeltge Professor of Electrical and Computer Engineering at the University of Illinois. His lecture addressed the converging worlds of communications, computation and control – described through infrastructure-free wireless networks, and illustrated with fascinating films of model systems created at the University of Illinois.

The bedrock of all wireless communications systems is radio frequency spectrum, and this was the subject of the second lecture, given by Professor Linda Doyle of the Department of Electronic and Electrical Engineering at the University of Dublin. She entered into the debate of how one should allocate and manage spectrum, covering approaches from classical ‘command and control’ to dynamic ‘grab what you need when you need it’. Her lecture gave rise to the most lively debate we have had during the Questions and Answers sessions.

Most lectures on mobile communications nowadays are concerned with subjects like broadband access, convergence, mobile widgets or the phone as a computing device, so it was refreshing that in the final lecture of the series Professor Peter Vary of the University of Aachen returned to basics – voice communications. In a fascinating lecture he covered the history of and contemporary research on speech coding for mobile communications, providing great insight into what is the most important function of the phone.

The transcript is available here.