SPECTRAL AUDIO
SIGNAL PROCESSING
JULIUS O. SMITH III
Center for Computer Research in Music and Acoustics (CCRMA)
Department of Music, Stanford University, Stanford, California 94305 USA
October 2008 DRAFT
Preface ..... 17
Acknowledgments ..... 17
Book Series Overview ..... 18

Chapter 1 Introduction and Overview ..... 20
1.1 Organization ..... 20
1.2 Overview ..... 21
1.2.1 Elementary Spectrum Analysis ..... 21
1.2.2 The Short-Time Fourier Transform (STFT) and Time-Frequency Displays ..... 22
1.2.3 Short-Time Analysis, Modification, and Resynthesis ..... 22
1.2.4 Applications ..... 22
1.2.5 Multirate Polyphase Filter and Wavelet Banks ..... 23
1.2.6 Appendices ..... 23

Chapter 2 Fourier Transforms for Continuous/Discrete Time/Frequency ..... 24
2.1 Discrete Time Fourier Transform (DTFT) ..... 25
2.2 Fourier Transform (FT) and Inverse ..... 26
2.2.1 Existence of the Fourier Transform ..... 26
2.3 Fourier Theorems for the DTFT ..... 27
2.3.1 Linearity of the DTFT ..... 27
2.3.2 Time Reversal ..... 28
2.3.3 Symmetry of the DTFT for Real Signals ..... 28
2.3.3.1 Real Even (or Odd) Signals ..... 29
2.3.4 Shift Theorem for the DTFT ..... 30
2.3.5 Convolution Theorem for the DTFT ..... 31
2.3.6 Correlation Theorem for the DTFT ..... 32
2.3.7 Autocorrelation ..... 33
2.3.8 Power Theorem for the DTFT ..... 33
2.3.9 Stretch Operator ..... 34
2.3.10 Repeat (Scaling) Operator ..... 35
2.3.11 Stretch/Repeat (Scaling) Theorem ..... 36
2.3.12 Downsampling and Aliasing ..... 37
2.3.12.1 Proof of Aliasing Theorem ..... 38
2.3.13 Differentiation Theorem Dual ..... 39
2.4 Continuous-Time Fourier Theorems ..... 40
2.4.1 Scaling Theorem ..... 40
2.4.2 Spectral Roll-Off ..... 41
2.5 Spectral Interpolation ..... 42
2.5.1 Ideal Spectral Interpolation ..... 42
2.5.2 Interpolating a DFT ..... 43
2.5.3 Zero Padding in the Time Domain ..... 43
2.5.3.1 Practical Zero Padding ..... 44
2.5.4 Zero-Phase Zero Padding ..... 47
2.5.4.1 Matlab/Octave fftshift utility ..... 49

Chapter 3 Spectrum Analysis Windows ..... 52
3.1 Rectangular Window ..... 52
3.1.1 Definition (M odd): ..... 53
3.1.2 Transform: ..... 53
3.1.3 Properties: ..... 53
3.2 Generalized Hamming Window Family ..... 54
3.2.1 Hann or Hanning or Raised Cosine ..... 55
3.2.2 Matlab for the Hann Window ..... 56
3.2.3 Summary of Hann window properties: ..... 58
3.2.4 Hamming Window ..... 58
3.2.5 Matlab for the Hamming Window ..... 61
3.2.6 Summary of Generalized Hamming Windows ..... 62
3.2.7 Definition: ..... 62
3.2.8 Transform: ..... 62
3.2.9 Common Properties ..... 62
3.2.10 Rectangular window properties: ..... 63
3.2.11 Hann window properties: ..... 63
3.2.12 Hamming window properties: ..... 64
3.2.13 The MLT Sine Window ..... 64
3.2.13.1 Properties: ..... 64
3.3 Blackman-Harris Window Family ..... 65
3.3.1 Window Definition: ..... 65
3.3.2 Window Transform: ..... 65
3.3.3 Blackman Window Family ..... 65
3.3.4 Classic Blackman ..... 66
3.3.5 Matlab for the Classic Blackman Window ..... 66
3.3.6 Three-Term Blackman-Harris Window ..... 67
3.3.7 Frequency-Domain Implementation of the Blackman-Harris Family ..... 69
3.3.8 Power-of-Cosine Window Family ..... 69
3.3.8.1 Definition: ..... 69
3.3.8.2 Properties: ..... 69
3.3.8.3 Special Cases: ..... 69
3.4 Example: Spectrum Analysis of an Oboe Tone ..... 70
3.4.1 Rectangular-Windowed Oboe Recording ..... 70
3.4.2 Hamming-Windowed Oboe Recording ..... 71
3.4.3 Blackman-Windowed Oboe Recording ..... 71
3.4.4 Conclusions ..... 72
3.5 Bartlett ("Triangular") Window ..... 72
3.5.1 Definition: ..... 72
3.5.2 Transform: ..... 73
3.5.3 Properties: ..... 73
3.5.4 Matlab for the Bartlett Window: ..... 73
3.6 Poisson Window ..... 74
3.6.1 Definition: ..... 74
3.7 Hann-Poisson Window ..... 76
3.7.1 Definition: ..... 76
3.7.2 Matlab for the Hann-Poisson Window ..... 78
3.8 Slepian or DPSS Window ..... 78
3.8.1 Matlab for the DPSS Window ..... 80
3.9 Kaiser Window ..... 81
3.9.1 Definition: ..... 81
3.9.2 Window transform: ..... 81
3.9.3 Kaiser Window Beta Parameter ..... 82
3.9.4 Kaiser Windows and Transforms ..... 82
3.9.5 Minimum Frequency Separation vs. Window Length ..... 87
3.9.6 Kaiser and DPSS Windows Compared ..... 88
3.10 Dolph-Chebyshev Window ..... 90
3.10.1 Matlab for the Dolph-Chebyshev Window ..... 91
3.10.2 Example Chebyshev Windows and Transforms ..... 91
3.10.3 Dolph-Chebyshev and Hamming Windows Compared ..... 94
3.10.4 Dolph-Chebyshev Window Theory ..... 94
3.10.4.1 Chebyshev Polynomials ..... 94
3.10.4.2 Dolph-Chebyshev Window Definition ..... 95
3.10.4.3 Dolph-Chebyshev Window Main-Lobe Width ..... 96
3.10.4.4 Dolph-Chebyshev Window Length Computation ..... 96
3.11 Gaussian Window and Transform ..... 97
3.11.1 Matlab for the Gaussian Window ..... 97
3.11.2 Gaussian Window and Transform ..... 98
3.11.3 Exact Discrete Gaussian Window ..... 98
3.12 Optimized Windows ..... 99
3.12.1 Optimal Windows for Audio Coding ..... 99
3.12.2 General Rule ..... 100
3.13 Optimal Window Design by Linear Programming ..... 100
3.13.1 Linear Programming (LP) ..... 100
3.13.2 Matlab's LINPROG ..... 100
3.13.3 LP Formulation of Chebyshev Window Design ..... 102
3.13.4 Symmetric Window Constraint ..... 103
3.13.5 Positive Window Sample Constraint ..... 103
3.13.6 DC Constraint ..... 103
3.13.7 Sidelobe Specification ..... 104
3.13.8 LP Standard Form ..... 105
3.13.8.1 Normal Chebyshev Window ..... 106
3.13.9 Remez Exchange Algorithm ..... 107
3.13.9.1 Convergence of Remez Exchange ..... 107
3.13.10 Monotonicity Constraint ..... 108
3.13.10.1 Monotonic Chebyshev Window ..... 109
3.13.11 L-Infinity Norm of Derivative Objective ..... 110
3.13.12 L-One Norm of Derivative Objective ..... 112

Chapter 4 Spectrum Analysis of Sinusoids ..... 116
4.1 Spectrum of a Sinusoid ..... 117
4.2 Spectrum of Sampled Complex Sinusoid ..... 119
4.3 Spectrum of a Windowed Sinusoid ..... 120
4.4 Effect of Windowing ..... 122
4.5 The Rectangular Window ..... 125
4.5.1 Rectangular Window Side-Lobes ..... 129
4.5.2 Frequency Resolution ..... 131
4.5.2.1 Two Cosines ("In-Phase" Case) ..... 132
4.5.2.2 One Sine and One Cosine "Phase Quadrature" Case ..... 133
4.6 Main-Lobe Bandwidth ..... 135
4.6.1 Other Definitions of Main Lobe Width ..... 138
4.7 Choosing Window Length to Resolve Sinusoids ..... 139
4.7.1 Periodic Signals ..... 141
4.7.2 Tighter Bounds for Minimum Window Length ..... 143
4.8 Sinusoidal Peak Interpolation ..... 145
4.8.1 Quadratic Interpolation of Spectral Peaks ..... 146
4.8.1.1 Phase Interpolation at a Peak ..... 148
4.8.1.2 Matlab for Parabolic Peak Interpolation ..... 148
4.8.2 Bias of Parabolic Peak Interpolation ..... 149
4.9 Optimal Peak-Finding in the Spectrum ..... 149
4.9.1 Minimum Zero-Padding for High-Frequency Peaks ..... 150
4.9.2 Minimum Zero-Padding for Low-Frequency Peaks ..... 151
4.9.3 Matlab for Computing Minimum Zero-Padding Factors ..... 153
4.9.4 Least Squares Sinusoidal Parameter Estimation ..... 153
4.9.4.1 Sinusoidal Amplitude Estimation ..... 155
4.9.4.2 Sinusoidal Amplitude and Phase Estimation ..... 156
4.9.4.3 Sinusoidal Frequency Estimation ..... 158
4.9.5 Maximum Likelihood Sinusoid Estimation ..... 158
4.9.6 Likelihood Function ..... 160
4.9.6.1 Multiple Sinusoids in Additive Gaussian White Noise ..... 161
4.9.6.2 Non-White Noise ..... 161
4.9.7 Generality of Maximum Likelihood Least Squares ..... 161

Chapter 5 Spectrum Analysis of Noise ..... 163
5.1 Introduction to Noise ..... 165
5.1.1 Why Analyze Noise? ..... 165
5.1.2 What is Noise? ..... 165
5.2 Spectral Characteristics of Noise ..... 166
5.3 White Noise ..... 166
5.3.1 Testing for White Noise ..... 167
5.4 Sample Autocorrelation ..... 167
5.5 Sample Power Spectral Density ..... 170
5.6 Biased Sample Autocorrelation ..... 171
5.7 Smoothed Power Spectral Density ..... 172
5.8 Cyclic Autocorrelation ..... 172
5.9 Practical Bottom Line ..... 173
5.10 Why an Impulse is Not White Noise ..... 173
5.11 The Periodogram ..... 174
5.11.1 Matlab for the Periodogram ..... 175
5.12 Welch's Method ..... 176
5.12.1 Welch Autocorrelation Estimate ..... 176
5.12.2 Resolution versus Stability ..... 177
5.13 Welch's Method with Windows ..... 177
5.13.1 Matlab for Welch's Method ..... 177
5.14 Filtered White Noise ..... 178
5.14.1 Example: FIR-Filtered White Noise ..... 181
5.14.2 Example: Synthesis of 1/F Noise (Pink Noise) ..... 182
5.14.3 Example: Pink Noise Analysis ..... 183
5.15 Processing Gain ..... 184
5.16 The Panning Problem ..... 187

Chapter 6 Time-Frequency Displays ..... 188
6.1 The Short-Time Fourier Transform ..... 188
6.1.1 Mathematical Definition of the STFT ..... 188
6.1.2 Practical Computation of the STFT ..... 190
6.1.3 Summary of STFT Computation Using the FFT ..... 191
6.1.4 Two Dual Interpretations of the STFT ..... 193
6.1.5 STFT in Matlab ..... 193
6.2 Classic Spectrograms ..... 194
6.2.1 Spectrogram of Speech ..... 195
6.3 Audio Spectrograms ..... 196
6.3.1 Auditory Filter Banks ..... 197
6.3.2 Loudness Spectrogram ..... 197
6.3.3 Loudness Spectrogram Examples ..... 199
6.3.3.1 Multiresolution STFT ..... 199
6.3.3.2 Excitation Pattern ..... 200
6.3.3.3 Nonuniform Spectral Resampling ..... 201
6.3.3.4 Specific Loudness ..... 203
6.3.3.5 Spectrograms Compared ..... 204
6.3.3.6 Instantaneous, Short-Term, and Long-Term Loudness ..... 205
6.4 Summary ..... 206

Chapter 7 Overlap-Add (OLA) STFT Processing ..... 207
7.1 Convolution of Short Signals ..... 209
7.1.1 Cyclic FFT Convolution ..... 210
7.1.2 Acyclic FFT Convolution ..... 211
7.1.2.1 Acyclic Convolution in Matlab or Octave ..... 211
7.1.2.2 Pictorial View of Acyclic Convolution ..... 212
7.1.3 Acyclic FFT Convolution in Matlab or Octave ..... 213
7.1.4 FFT versus Direct Convolution ..... 214
7.1.4.1 Audio FIR Filters ..... 214
7.1.4.2 Example 1: Low-Pass Filtering by FFT Convolution ..... 215
7.1.4.3 Example 2: Time Domain Aliasing ..... 219
7.2 Convolving with Long Signals ..... 220
7.2.1 Overlap-Add Decomposition ..... 221
7.2.2 COLA Examples ..... 225
7.2.3 STFT of COLA Decomposition ..... 226
7.2.4 Acyclic Convolution ..... 228
7.2.5 Example of Overlap-Add Convolution ..... 229
7.2.6 Overlap-Add FFT Processing Summary ..... 232
7.2.7 The STFT as a Time-Frequency Distribution ..... 233
7.2.7.1 Time-Frequency Parameters in the STFT ..... 234
7.3 Dual of Constant Overlap-Add ..... 234
7.3.1 Poisson Summation Formula ..... 235
7.3.2 Frequency-Domain COLA Constraints ..... 237
7.3.2.1 Strong COLA ..... 237
7.3.3 PSF Dual and Graphical Equalizers ..... 238
7.3.4 PSF and Weighted Overlap Add ..... 239
7.3.5 Example COLA Windows for WOLA ..... 240
7.4 Overlap-Save Method ..... 241
7.5 Time Varying OLA Modifications ..... 241
7.5.1 Block Diagram Interpretation of Time-Varying STFT Modifications ..... 243
7.5.2 Length L FIR Frame Filters ..... 244
7.6 Nonlinear Modifications ..... 245
7.7 Weighted Overlap Add ..... 246
7.7.1 WOLA Processing Steps ..... 246
7.7.1.1 Choice of WOLA Window ..... 247
7.8 Review of Zero Padding ..... 248

Chapter 8 The Filter Bank Summation (FBS) Interpretation of the Short Time Fourier Transform (STFT) ..... 249
8.1 Dual Views of the Short Time Fourier Transform (STFT) ..... 249
8.1.1 Overlap-Add (OLA) Interpretation of the STFT ..... 249
8.1.2 Filter-Bank Summation (FBS) Interpretation of the STFT ..... 250
8.1.3 FBS and Perfect Reconstruction ..... 252
8.2 STFT Filter Bank ..... 252
8.2.1 Computational Examples in Matlab ..... 253
8.3 The DFT Filter Bank ..... 259
8.3.1 The Running-Sum Lowpass Filter ..... 260
8.3.2 Modulation by a Complex Sinusoid ..... 262
8.3.3 Making a Bandpass Filter from a Lowpass Filter ..... 263
8.3.4 Uniform Running-Sum Filter Banks ..... 264
8.3.4.1 System Diagram of the Running-Sum Filter Bank ..... 265
8.3.4.2 DFT Filter Bank ..... 266
8.3.4.3 Inverse DFT and the DFT Filter Bank Sum ..... 267
8.4 FBS Window Constraints for R=1 ..... 267
8.5 Nyquist(N) Windows ..... 269
8.6 Duality of COLA and Nyquist Conditions ..... 270
8.6.1 Specific Windows ..... 270
8.6.2 The Nyquist Property on the Unit Circle ..... 271
8.7 Portnoff Windows ..... 272
8.8 Downsampled STFT Filter Banks ..... 273
8.8.1 Downsampled STFT Filter Bank ..... 273
8.8.1.1 Filter Bank Reconstruction ..... 274
8.8.2 Downsampling with Anti-Aliasing ..... 275
8.8.2.1 Properly Anti-Aliasing Window Transforms ..... 276
8.8.2.2 Hop Sizes for WOLA ..... 277
8.8.3 Constant-Overlap-Add (COLA) Cases ..... 278
8.8.3.1 Hamming Overlap-Add Example ..... 278
8.8.3.2 Periodic-Hamming OLA from Poisson Summation Formula ..... 280
8.8.3.3 Kaiser Overlap-Add Example ..... 283
8.9 STFT with Modifications ..... 287
8.9.1 FBS Fixed Modifications ..... 287
8.9.2 Time Varying Modifications in FBS ..... 289
8.10 STFT Summary and Conclusions ..... 290
8.10.1 Overlap-Add ..... 291
8.10.2 Filter Bank Summation ..... 292

Chapter 9 Applications of the STFT ..... 294
9.1 Fundamental Frequency Estimation from Sinusoidal Peaks ..... 294
9.1.1 Useful Preprocessing ..... 295
9.1.2 Getting Closer to Maximum Likelihood ..... 295
9.1.3 References on F0 Estimation ..... 296
9.2 Cross Synthesis ..... 296
9.3 Spectral Envelope Extraction ..... 297
9.3.1 Cepstral Windowing ..... 297
9.3.2 Linear Prediction Spectral Envelope ..... 298
9.3.2.1 Linear Prediction is Peak Sensitive ..... 300
9.3.2.2 Linear Prediction Methods ..... 300
9.3.2.3 Computation of Linear Prediction Coefficients ..... 301
9.3.2.4 Linear Prediction Order Selection ..... 302
9.3.2.5 Summary of LP Spectral Envelopes ..... 302
9.3.3 Spectral Envelope Examples ..... 303
9.3.3.1 Signal Synthesis ..... 303
9.4 Sinusoidal Modeling of Sound ..... 313
9.5 Additive Synthesis Overview ..... 314
9.6 Additive Synthesis Analysis ..... 316
9.6.1 Following Spectral Peaks ..... 316
9.6.2 Sinusoidal Peak Finding ..... 317
9.6.3 Tracking Sinusoidal Peaks in a Sequence of FFTs ..... 319
9.7 Sines + Noise Modeling ..... 321