
Speech and audio signal processing.pdf

1.pdf
SPEECH AND AUDIO
BEN GOLD
NELSON MORGAN
Acquisitions Editor Bill Zobrist
Marketing Manager Katherine Hepburn
Senior Production Editor Robin Factor
Senior Designer Laura Boucher
Illustration Editor Gene Aiello
Electronic Illustrations Radiant
This book was set in 10/12 Times Roman by TechBooks, and printed and bound by Quebecor/Fairfield.
The book is printed on acid-free paper.
Library of Congress Cataloging in Publication Data:
CONTENTS xi
13.2 Sound Waves in Rooms 179
13.2.1 Acoustic Reverberation 180
13.2.2 Early Reflections 183
13.3 Room Acoustics as a Component in Speech Systems 184
13.4 Exercises 185
BIBLIOGRAPHY 5
2.pdf
3.1.2 Acoustical Telegraphy before Morse Code
3.1.3 The Telephone
3.1.4 The Channel Vocoder and Bandwidth Compression
FIGURE 3.4 Wideband spectrogram.
4.7.2 Front Ends
4.7.3 Hidden Markov Models
4.7.4 The Second (D)ARPA
4.7.5 The Return of Neural Nets
BIBLIOGRAPHY 53
BIBLIOGRAPHY 55
3.pdf
FIGURE 6.15 Digital filter structure.
EXERCISES 101
8.2.1 Some Opinions
8.3 PATTERN-CLASSIFICATION METHODS
8.3.1 Minimum Distance Classifiers
8.3.2 Discriminant Functions
8.3.3 Generalized Discriminators
4.pdf
9.8.1 Discussion
BIBLIOGRAPHY 133
5.pdf
10.4 BOUNDARY CONDITIONS AND DISCRETE TRAVELING WAVES
10.5 STANDING WAVES
6.pdf
FIGURE 12.21 The evolution of a trumpet: effects of the mouthpiece and bell.
13.1.1 One-Dimensional Wave Equation
13.1.5 Typical Power Sources
13.2.2 Early Reflections
7.pdf
15.4 MASKING
8.pdf
BIBLIOGRAPHY 227
FIGURE 17.15 Interval histograms for two periods of a steady vowel. Plot B shows
BIBLIOGRAPHY 245
18.2.2 The Experiments
9.pdf
22.4.2 Robustness to Additive Noise
22.4.3 Caveats
23.2.2 What Makes a Phone?
23.2.3 What Makes a Phoneme?
10.pdf
23.4.3 Vowels
23.4.4 Why Use Features?
BIBLIOGRAPHY 323
24.2.3 Distances
24.2.4 End-Point Detection
24.4 SEGMENTAL APPROACHES
25.3.1 Markov Models
25.3.2 Hidden Markov Model
25.3.3 HMMs for Speech Recognition
25.3.4 Estimation of P(X|M)
11.pdf
26.6.3 Tied Mixtures of Gaussians
27.2.2 Corrective Training
27.2.4 Direct Estimation of Posteriors
27.3.3 Embedded Training
BIBLIOGRAPHY 379
28.3.2 Smoothing
29.2.2 Other Source-Filter Synthesizer Structures
FIGURE 29.6 All-zero synthesizer based on cepstral analysis.
29.2.3 Talking Chips
29.6.2 Development of Speech Synthesizers
29.6.3 Segmental Synthesis by Rule
29.8.1 The van Santen Recordings
BIBLIOGRAPHY 413
12.pdf
FIGURE 30.11 Estimation of periods by elementary pitch detectors.
30.9 EXERCISES
13.pdf
FIGURE 33.8 Adaptive differential pulse code modulation (ADPCM).
33.9.1 Modifications to CELP
33.9.2 Non-Gaussian Codebook Sequences
33.9.3 Low-Delay CELP
33.10.3 Multiresolution Codebook Search
33.10.4 Partial Sequence Elimination
33.10.5 Tree-Structured Delta Codebooks
33.10.6 Adaptive Codebooks
33.10.7 Linear Combination Codebooks
34.7.1 Frequency Compression
14.pdf
35.7 SEVERAL EXAMPLES OF SYNTHESIS
INDEX 533
SPEECH AND AUDIO SIGNAL PROCESSING
Processing and Perception of Speech and Music
BEN GOLD
Massachusetts Institute of Technology Lincoln Laboratory
NELSON MORGAN
University of California at Berkeley
International Computer Science Institute
with contributions from Herve Bourlard, Eric Fosler-Lussier, Jeff Gilbert
Acquisitions Editor Bill Zobrist
Marketing Manager Katherine Hepburn
Senior Production Editor Robin Factor
Senior Designer Laura Boucher
Illustration Editor Gene Aiello
Electronic Illustrations Radiant
This book was set in 10/12 Times Roman by TechBooks, and printed and bound by Quebecor/Fairfield. The cover was printed by Lehigh Press.
The book is printed on acid-free paper.
Copyright © 2000 by John Wiley & Sons, Inc. All rights reserved.
No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 605 Third Avenue, New York, NY 10158-0012, (212) 850-6011, fax (212) 850-6008, E-Mail: PERMREQ@WILEY.COM.
To order books please call 1(800)-225-5945.
Library of Congress Cataloging in Publication Data:
Gold, Ben, 1923-
Speech and audio signal processing : processing and perception of speech and music / Ben Gold, Nelson Morgan ; with contributions from Herve Bourlard, Eric Fosler-Lussier, and Jeff Gilbert.
p. cm.
Includes index.
ISBN 0-471-35154-7 (alk. paper)
1. Speech processing systems. 2. Signal processing-Digital techniques. 3. Electronic music. I. Morgan, Nelson.
TK7882.S65G65 1999
621.382'2-dc21 99-16025 CIP
ISBN 0-471-35154-7
Printed in the United States of America
10 9 8 7 6 5 4 3 2
This book is dedicated to our families and our students
CONTENTS ix
6.9 Concluding Comments 79
6.10 Exercises 79
CHAPTER 7 DIGITAL FILTERS AND DISCRETE FOURIER TRANSFORM 83
7.1 Introduction 83
7.2 Filtering Concepts 84
7.3 Useful Filter Functions 88
7.4 Transformations for Digital Filter Design 90
7.5 Digital Filter Design with Bilinear Transformation 91
7.6 The Discrete Fourier Transform 92
7.7 Fast Fourier Transform Methods 95
7.8 Relation Between the DFT and Digital Filters 98
7.9 Exercises 100
CHAPTER 8 PATTERN CLASSIFICATION 103
8.1 Introduction 103
8.2 Feature Extraction 105
8.2.1 Some Opinions 106
8.3 Pattern-Classification Methods 107
8.3.1 Minimum Distance Classifiers 107
8.3.2 Discriminant Functions 109
8.3.3 Generalized Discriminators 110
8.4 Exercises 113
8.5 Appendix: Multilayer Perceptron Training 114
8.5.1 Definitions 114
8.5.2 Derivation 115
CHAPTER 9 STATISTICAL PATTERN CLASSIFICATION 119
9.1 Introduction 119
9.2 A Few Definitions 119
9.3 Class-Related Probability Functions 120
9.4 Minimum Error Classification 121
9.5 Likelihood-Based MAP Classification 122
9.6 Approximating a Bayes Classifier 123
9.7 Statistically Based Linear Discriminants 125
9.7.1 Discussion 126
9.8 Iterative Training: The EM Algorithm 126
9.8.1 Discussion 131
9.9 Exercises 132
CHAPTER 10 WAVE BASICS 137
10.1 Introduction 137
10.2 The Wave Equation for the Vibrating String 137
10.3 Discrete-Time Traveling Waves 139
10.4 Boundary Conditions and Discrete Traveling Waves 140
10.5 Standing Waves 140
10.6 Discrete-Time Models of Acoustic Tubes 141
10.7 Acoustic Tube Resonances 143
10.8 Relation of Acoustic Tube Resonances to Observed Formant Frequencies 144
10.9 Exercises 146
CHAPTER 11 ACOUSTIC TUBE MODELING OF SPEECH PRODUCTION 148
11.1 Introduction 148
11.2 Acoustic Tube Models of English Phonemes 148
11.3 Excitation Mechanisms in Speech Production 152
11.4 Exercises 153
CHAPTER 12 MUSIC PRODUCTION 154
12.1 Introduction 154
12.2 Sequence of Steps in a Plucked or Bowed String Instrument 155
12.3 Vibrations of the Bowed String 155
12.4 Frequency-Response Measurements of the Bridge of a Violin 156
12.5 Vibrations of the Body of String Instruments: Measurement Methods 159
12.6 Radiation Pattern of Bowed String Instruments 163
12.7 Some Considerations in Piano Design 165
12.8 Brief Discussion of the Trumpet, Trombone, French Horn, and Tuba 171
12.9 Exercises 173
CHAPTER 13 ROOM ACOUSTICS 175
13.1 Sound Waves 175
13.1.1 One-Dimensional Wave Equation 176
13.1.2 Spherical Wave Equation 177
13.1.3 Intensity 177
13.1.4 Decibel Sound Levels 178
13.1.5 Typical Power Sources 178