Speech and Audio Signal Processing - Ben Gold, Nelson Morgan, Dan Ellis

When Speech and Audio Signal Processing was published in 1999,
it stood out from its competition in its breadth of coverage and
its accessible, intuition-based style. This book was aimed at
individual students and engineers excited about the broad span of
audio processing and curious to understand the available
techniques. Since then, with the advent of the iPod in 2001,
the field of digital audio and music has exploded, leading to a
much greater interest in the technical aspects of audio
processing.

This Second Edition updates and revises the original
book, augmenting it with new material describing both the enabling
technologies of digital music distribution (most significantly the
MP3) and a range of exciting new research areas in automatic music
content processing (such as automatic transcription and music
similarity) that have emerged in the past five years, driven
by the digital music revolution.

New chapter topics include:

* Psychoacoustic Audio Coding, describing MP3 and related
audio coding schemes based on psychoacoustic masking of
quantization noise.

* Music Transcription, including automatically deriving
notes, beats, and chords from music signals.

* Music Information Retrieval, primarily focusing on
audio-based genre classification, artist/style identification, and
similarity estimation.

* Audio Source Separation, including multi-microphone
beamforming, blind source separation, and the perception-inspired
techniques usually referred to as Computational Auditory Scene
Analysis (CASA).

About the authors
The late Ben Gold consulted at Massachusetts Institute of Technology and Lincoln Laboratory and taught at the University of California at Berkeley. He was the author of Digital Processing of Signals and the coauthor of Theory and Applications of Digital Signal Processing. Dr. Gold was an IEEE Fellow, member of the National Academy of Engineering, and recipient of several IEEE awards.

Nelson Morgan is the Director of the International Computer Science Institute, an independent, not-for-profit research laboratory affiliated with the University of California at Berkeley. Dr. Morgan is also Professor-in-Residence in the Electrical Engineering and Computer Sciences Department at UC Berkeley. Dr. Morgan is an IEEE Fellow.

Dan Ellis is Associate Professor in the Electrical Engineering Department of Columbia University. Dr. Ellis's Laboratory for Recognition and Organization of Speech and Audio (LabROSA) investigates how to extract high-level information from audio, including speech recognition, music description, and environmental sound processing.

Jacket text
Helps readers develop an intuitive understanding of audio signal processing

Acclaimed for its breadth of coverage as well as its clear, accessible presentation, Speech and Audio Signal Processing examines how machines and humans process audio signals, with an emphasis on speech and music. It begins with basic principles and then explains how these principles set the foundation for a wide range of applications. Moreover, the book is organized into a series of short chapters, offering readers a succinct overview of the range of topics that together represent the current state of knowledge in the field.

This Second Edition brings the book fully up to date with the explosive growth in audio processing technology, including the latest advances in digital music processing and distribution. New topics include:

* Psychoacoustic audio coding, examining MP3 and related audio coding schemes that are based on the psychoacoustic masking of quantization noise
* Music transcription, explaining how notes, beats, and chords can be automatically derived from music signals
* Music information retrieval, exploring audio-based genre classification, artist and style identification, and similarity estimation
* Audio source separation, describing multi-microphone beamforming, blind source separation, and perception-inspired techniques

Throughout the book, the authors present both human and machine strategies for accomplishing audio processing tasks. Readers will discover that, in many cases, human strategies can provide the inspiration for the development of machine strategies.

Speech and Audio Signal Processing is recommended for anyone who needs to understand the technologies underlying some of today's most cutting-edge applications, including speech recognition, audio compression, music synthesis, and diarization.

Contents
PREFACE TO THE 2011 EDITION

CHAPTER 1 INTRODUCTION

PART I HISTORICAL BACKGROUND

CHAPTER 2 SYNTHETIC AUDIO: A BRIEF HISTORY

CHAPTER 3 SPEECH ANALYSIS AND SYNTHESIS OVERVIEW

CHAPTER 4 BRIEF HISTORY OF AUTOMATIC SPEECH RECOGNITION

CHAPTER 5 SPEECH-RECOGNITION OVERVIEW

PART II MATHEMATICAL BACKGROUND

CHAPTER 6 DIGITAL SIGNAL PROCESSING

CHAPTER 7 DIGITAL FILTERS AND DISCRETE FOURIER TRANSFORM

CHAPTER 8 PATTERN CLASSIFICATION

CHAPTER 9 STATISTICAL PATTERN CLASSIFICATION

PART III ACOUSTICS

CHAPTER 10 WAVE BASICS

CHAPTER 11 ACOUSTIC TUBE MODELING OF SPEECH PRODUCTION

CHAPTER 12 MUSICAL INSTRUMENT ACOUSTICS

CHAPTER 13 ROOM ACOUSTICS

PART IV AUDITORY PERCEPTION

CHAPTER 14 EAR PHYSIOLOGY

CHAPTER 15 PSYCHOACOUSTICS

CHAPTER 16 MODELS OF PITCH PERCEPTION

CHAPTER 17 SPEECH PERCEPTION

CHAPTER 18 HUMAN SPEECH RECOGNITION

PART V SPEECH FEATURES

CHAPTER 19 THE AUDITORY SYSTEM AS A FILTER BANK

CHAPTER 20 THE CEPSTRUM AS A SPECTRAL ANALYZER

CHAPTER 21 LINEAR PREDICTION

PART VI AUTOMATIC SPEECH RECOGNITION

CHAPTER 22 FEATURE EXTRACTION FOR ASR

CHAPTER 23 LINGUISTIC CATEGORIES FOR SPEECH RECOGNITION

CHAPTER 24 DETERMINISTIC SEQUENCE RECOGNITION FOR ASR

CHAPTER 25 STATISTICAL SEQUENCE RECOGNITION
CHAPT…

Title: Speech and Audio Signal Processing

Subtitle: Processing and Perception of Speech and Music

Authors: Ben Gold, Nelson Morgan, Dan Ellis

EAN: 9781118142912

Format: E-Book (pdf)

Publisher: Wiley-Interscience

Publication date: 30.09.2011

Digital copy protection: Adobe DRM

File size: 39.56 MB

Number of pages: 688