Chapter 1: Invited Addresses and Tutorials on Signals, Coding,
  Systems and Intelligent Techniques

Title: Recent developments in audio coding

Author(s): Maciej Bartkowiak

Address: Poznań University of Technology, Div. Multimedia Telecommunications and Radioelectronics, ul. Piotrowo 3a, 60-965 Poznań

Reference: 12th International Workshop on Systems, Signals and Image Processing, p. 45

Abstract/Summary: Audio coding technologies have made tremendous progress in recent years. The perceptual coding paradigm has become ubiquitous among standard and proprietary techniques for compressed audio storage and delivery, in applications ranging from narrowband internet streaming of music and speech to high-definition multichannel home theatre. These techniques make heavy use of psychoacoustics, especially masking phenomena that render some components of the sound inaudible in the presence of other strong components with similar spectra. Departing from the basic 1st generation subband coding scheme with perceptually controlled uniform scalar quantization, modern 2nd and 3rd generation algorithms have evolved into sophisticated forms of transform coding with switched time and frequency resolution and temporal shaping of quantization noise. Several additional tools have been introduced to model and remove the redundant and perceptually irrelevant components of the audio signal, thus increasing coding efficiency. These tools include variants of prediction in the time and frequency domains, perceptual substitution of noise components, diverse quantization schemes (including vector quantization) and refined entropy coding.
Although the original 1st and 2nd generation techniques were aimed at near-CD or broadcast quality, there has been strong demand for codecs offering decent quality at reduced bit rates. At very low bit rates, however, traditional waveform coding methods can no longer keep the quantization noise below the threshold of audibility, and coding artefacts become apparent because the masking conditions of the perceptual models are heavily violated. The 4th generation audio codec recently standardized by ISO uses two new model-based tools to describe the high-frequency content and spatial information parametrically rather than coding them directly. Spectral band replication and parametric stereo encoding reduce the bit rate required for good quality audio to a range previously associated with speech coding. Several sound examples will be played to demonstrate the progress in compression efficiency.
Alongside transform coding, which attempts to approximate the waveform with a precisely controlled error, parametric coding techniques have been studied; these describe the signal in terms of the parameters of a model that is used to resynthesize a similar signal. Parametric coding offers the prospect of data reduction down to a very low number of bits representing only the semantic content; however, currently available techniques are only slightly more efficient than state-of-the-art waveform coding. A new standard technique for high-quality parametric coding has recently been recommended by the MPEG committee, and a brief audio demonstration of this approach will also be given. Due to popular demand, many new flavours of coding techniques are being developed or have recently been developed. These include low-delay, error-resilient and scalable variants of previous techniques, as well as new lossless and spatial audio coding for multichannel programmes. These will be briefly covered in the tutorial.
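
Illustration (not part of the original abstract): the perceptually controlled uniform scalar quantization mentioned for 1st generation subband coders can be sketched as follows. This is a minimal sketch under stated assumptions, not the method of any particular standard. It assumes the usual high-resolution model in which a uniform quantizer with step size D produces noise of power D^2/12, and it picks D so that this noise power equals a given masking threshold. The function names, the sample values and the threshold value are all hypothetical.

```python
import math


def step_for_masking_threshold(mask_power):
    """Return the step size whose modelled quantization noise power
    (step**2 / 12, high-resolution assumption) equals mask_power."""
    return math.sqrt(12.0 * mask_power)


def quantize(samples, step):
    """Uniform scalar quantization: map each sample to an integer index."""
    return [round(x / step) for x in samples]


def dequantize(indices, step):
    """Reconstruct sample values from the integer indices."""
    return [q * step for q in indices]


# Hypothetical subband samples and masking threshold (illustrative values only).
subband = [0.82, -0.40, 0.13, -0.05, 0.61]
mask_power = 1e-3

step = step_for_masking_threshold(mask_power)
indices = quantize(subband, step)
decoded = dequantize(indices, step)

# The reconstruction error of a uniform quantizer never exceeds half a step.
max_error = max(abs(a - b) for a, b in zip(subband, decoded))
print(step, indices, max_error)
```

A higher masking threshold in a subband gives a larger step and therefore fewer distinct indices to encode, which is where the bit-rate saving comes from in this simplified picture.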

