2002 IntroductiontoDigitalAudioCodin

From GM-RKB

Subject Headings: Digital Audio, Audio Coding.

Notes

Cited By

Quotes

Foreword

Leonardo Chiariglione - Telecom Italia Lab, Italy

Analogue speech in electrical form has a history going back more than a century and a quarter to the early days of the telephone. However, interest in digital speech only gathered momentum some 40 years ago when the telecommunications industry started a global project to digitize the telephone network. The technology trade-off of the time in this infrastructure-driven project led to a preference for adding transmission capacity over finding methods to reduce the bitrate of the speech signal, so the use of compression technology for speech remained largely dormant. When in the late 1980s the ITU-T standard for visual telephony became available enabling compression of video by a factor of 3,000, the only audio format in use to accompany this highly compressed video was standard telephone quality 64 kb/s PCM. It was only where transmission capacity was a scarce asset, like in the access portion of radiotelephony, that speech compression became a useful tool.

Analogue sound in electrical form has a history going back only slightly more than a century ago when a recording industry began to spring up around the gramophone and other early phonographs. The older among us fondly remember collections of long playing records (LPs) which later gave way to cassette tapes as the primary media for analogue consumer audio. Interest in digital audio received a boost some 20 years ago when the Consumer Electronics (CE) industry developed a new digital audio recording medium: a 12 cm platter - the compact disc (CD) - carrying the equivalent of 70 minutes of uncompressed stereo digital audio. This equivalent of one long playing (LP) record was all that the CE industry needed at the time and compression was disregarded as the audio industry digitized.

Setting aside some company and consortium initiatives, it was only with the MPEG-1 project in the late 1980s that compressed digital audio came to the stage. MPEG-1 had the ambitious target of developing a single standard addressing multiple application domains: the digital version of the old compact cassette, digital audio broadcasting, audio accompanying digital video in interactive applications, the audio component of digital television and professional applications were listed as the most important.

The complexity of the task was augmented by the fact that each of these applications was targeted to specific industries and sectors of those industries, each with their own concerns when it comes to converting a technology into a product. The digital version of the old compact cassette was the most demanding: quality of compressed audio had to be good, but the device had to be cheap; in digital audio broadcasting quality was at premium, but the device had to have an affordable price; audio in interactive audio-visual applications could rely on an anticipated mass market where a high level of silicon integration of all decompression functionalities could be achieved; a similar target existed for audio in digital television; lastly, many professional applications required the best quality possible at the lowest possible bitrates.

It could be anticipated that these conflicting requirements would make the task arduous, and indeed the task turned out to be so. But the Audio group of MPEG, in addition to being highly competitive, was also inventive. Without calling them so, the Audio group was the first to define what are now known as "profiles" under the name of "layers". And quite good profiles they turned out to be because a Layer I bitstream could be decoded by a Layer II and a Layer III decoder in addition to its own decoder, and a Layer II bitstream could be decoded by a Layer III decoder in addition to its own decoder.

The MPEG-2 Audio project later targeted multichannel audio, but the story was a complicated one. With MPEG-1 Audio providing transparent quality at 256 kb/s for a stereo signal with Layer II coding and the same quality at 192 kb/s with Layer III coding, it looked like a natural choice that MPEG-2 Audio should be backwards compatible, in the sense that an MPEG-1 Audio decoder of a given layer should be able to decode the stereo component of an MPEG-2 Audio bitstream. But it is a well-known fact that backwards compatible coding provides substantially lower quality compared to unconstrained coding. This was the origin of the bifurcation of the multichannel audio coding work: Part 3 of MPEG-2 specifies a backward compatible multichannel audio coding and Part 7 of MPEG-2 (called Advanced Audio Coding - AAC) a non backward compatible or unconstrained multichannel audio coding standard. AAC has been a major achievement. In less than 5 years after approving MPEG-1 Audio Layer III, the MPEG Audio group produced an audio compression standard that offered transparency of stereo audio down to 128 kb/s.

This book has been written by the very person who led the MPEG-2 AAC development. It covers a gap that existed so far by offering both precious information on digital audio in general and in-depth information on the principles and practice of the 3 audio coding standards MPEG-1, MPEG-2 and MPEG-4. Its reading is a must for all those who want to know more, for curiosity or professional needs, about audio compression, a technology that has led mankind to a new relationship with the media.

Table of Contents

       All chapters: Marina Bosi, Richard E. Goldberg

       Introduction (pp. 3-12)
       Quantization (pp. 13-46)
       Representation of Audio Signals (pp. 47-73)
       Time to Frequency Mapping Part I: The PQMF (pp. 75-102)
       Time to Frequency Mapping Part II: The MDCT (pp. 103-147)
       Introduction to Psychoacoustics (pp. 149-177)
       Psychoacoustic Models for Audio Coding (pp. 179-200)
       Bit Allocation Strategies (pp. 201-220)
       Building a Perceptual Audio Coder (pp. 221-235)
       Quality Measurement of Perceptual Audio Codecs (pp. 237-261)
       MPEG-1 Audio (pp. 265-313)
       MPEG-2 Audio (pp. 315-332)
       MPEG-2 AAC (pp. 333-369)
       Dolby AC-3 (pp. 371-400)
       MPEG-4 Audio (pp. 401-430)

References

Marina Bosi, Richard E. Goldberg (2002). Introduction to Digital Audio Coding and Standards. doi:10.1007/978-1-4615-0327-9