FrontMatter.pdf
Preface
Contents
List of Contributors
chapter1 Introduction.pdf
1 Introduction
1.1 Background
1.2 Effects of Reverberation
1.3 Speech Acquisition
1.4 System Description
1.5 Acoustic Impulse Responses
1.6 Literature Overview
1.6.1 Beamforming Using Microphone Arrays
1.6.2 Speech Enhancement Approaches to Dereverberation
1.6.3 Blind System Identification and Inversion
1.6.3.1 Blind System Identification
1.6.3.2 Inverse Filtering
1.7 Outline of the Book
References
chapter2 Models, Measurement and Evaluation.pdf
2 Models, Measurement and Evaluation
2.1 An Overview of Room Acoustics
2.1.1 The Wave Equation
2.1.2 Sound Field in a Reverberant Room
2.1.3 Reverberation Time
2.1.4 The Critical Distance
2.1.5 Analysis of Room Acoustics Dependent on Frequency Range
2.2 Models of Room Reverberation
2.2.1 Intuitive Model
2.2.2 Finite Element Models
2.2.3 Digital Waveguide Mesh
2.2.4 Ray-tracing
2.2.5 Source-image Model
2.2.6 Statistical Room Acoustics
2.3 Subjective Evaluation
2.4 Channel-based Objective Measures
2.4.1 Normalized Projection Misalignment
2.4.2 Direct-to-reverberant Ratio
2.4.3 Early-to-total Sound Energy Ratio
2.4.4 Early-to-late Reverberation Ratio
2.5 Signal-based Objective Measures
2.5.1 Log Spectral Distortion
2.5.2 Bark Spectral Distortion
2.5.3 Reverberation Decay Tail
2.5.4 Signal-to-reverberant Ratio
2.5.4.1 Relationship Between DRR and SRR
2.5.4.2 Level Normalization in SRR
2.5.4.3 SRR Computation Example
2.5.4.4 SRR Summary
2.5.5 Experimental Comparisons
2.6 Dereverberation Performance of the Delay-and-sum Beamformer
2.6.1 Simulation Results: DSB Performance
Experiment 1: Effect of Source-microphone Distance
Experiment 2: Effect of Number of Microphones
2.7 Summary and Discussion
References
chapter3 Speech Dereverberation Using Statistical Reverberation Models.pdf
3 Speech Dereverberation Using Statistical Reverberation Models
3.1 Introduction
3.2 Review of Dereverberation Methods
3.2.1 Reverberation Cancellation
3.2.2 Reverberation Suppression
3.3 Statistical Reverberation Models
3.3.1 Polack’s Statistical Model
3.3.2 Generalized Statistical Model
3.4 Single-microphone Spectral Enhancement
3.4.1 Problem Formulation
3.4.2 MMSE Log-spectral Amplitude Estimator
3.4.3 a priori SIR Estimator
3.5 Multi-microphone Spectral Enhancement
3.5.1 Problem Formulation
3.5.2 Two Multi-microphone Systems
3.5.2.1 MVDR Beamformer and Single-channel MMSE Estimator
3.5.2.2 Non-linear Spatial Processor
3.5.3 Speech Presence Probability Estimator
3.6 Late Reverberant Spectral Variance Estimator
3.7 Estimating Model Parameters
3.7.1 Reverberation Time
3.7.2 Direct-to-reverberant Ratio
3.8 Experimental Results
3.8.1 Using One Microphone
3.8.2 Using Multiple Microphones
3.9 Summary and Outlook
Acknowledgment
References
chapter4 Dereverberation Using LPC-based Approaches.pdf
4 Dereverberation Using LPC-based Approaches
4.1 Introduction
4.2 Linear Predictive Coding of Speech
4.3 LPC on Reverberant Speech
4.3.1 Effects of Reverberation on the LPC Coefficients
4.3.1.1 Single Microphone
4.3.1.2 Joint Multichannel Optimization
4.3.1.3 LPC at the Output of a Delay-and-sum Beamformer
4.3.2 Effects of Reverberation on the Prediction Residual
4.3.3 Simulation Examples for LPC on Reverberant Speech
4.4 Dereverberation Employing LPC
4.4.1 Regional Weighting Function
4.4.2 Weighting Function Based on Hilbert Envelopes
4.4.3 Wavelet Extrema Clustering
4.4.4 Weight Function from Coarse Channel Estimates
4.4.5 Kurtosis Maximizing Adaptive Filter
4.5 Spatiotemporal Averaging Method for Enhancement of Reverberant Speech
4.5.1 Larynx Cycle Segmentation with Multichannel DYPSA
4.5.2 Time Delay of Arrival Estimation for Spatial Averaging
4.5.3 Voiced/Unvoiced/Silence Detection
4.5.4 Weighted Inter-cycle Averaging
4.5.5 Dereverberation Results
4.6 Summary
Appendix A
References
chapter5 Multi-microphone Speech Dereverberation Using Eigen-decomposition.pdf
5 Multi-microphone Speech Dereverberation Using Eigen-decomposition
5.1 Introduction
5.2 Problem Formulation
5.3 Preliminaries
5.4 AIR Estimation – Algorithm Derivation
5.5 Extensions of the Basic Algorithm
5.5.1 Two-microphone Noisy Case
5.5.1.1 White Noise Case
5.5.1.2 Colored Noise Case
5.5.2 Multi-microphone Case (M > 2)
5.5.3 Partial Knowledge of the Null Subspace
5.6 AIR Estimation in Subbands
5.7 Signal Reconstruction
5.8 Experimental Study
5.8.1 Full-band Version – Results
5.8.2 Subband Version – Results
5.9 Limitations of the Proposed Algorithms and Possible Remedies
5.9.1 Noise Robustness
5.9.2 Computational Complexity and Memory Requirements
5.9.3 Common Zeros
5.9.4 The Demand for the Entire AIR Compensation
5.9.5 Filter-bank Design
5.9.6 Gain Ambiguity
5.10 Summary and Conclusions
References
chapter 6 Adaptive Blind Multichannel System Identification.pdf
6 Adaptive Blind Multichannel System Identification
6.1 Introduction
6.2 Problem Formulation
6.2.1 Channel Identifiability Conditions
6.3 Review of Adaptive Algorithms for Acoustic BSI Employing Cross-relations
6.3.1 The Multichannel Least Mean Squares Algorithm
6.3.2 The Normalized Multichannel Frequency Domain LMS Algorithm
6.3.3 The Improved Proportionate NMCFLMS Algorithm
6.4 Effect of Noise on the NMCFLMS Algorithm – The Misconvergence Problem
6.5 The Constraint Based ext-NMCFLMS Algorithm
6.5.1 Effect of Noise on the Cost Function
6.5.2 Penalty Term Using the Direct-path Constraint
6.5.3 Delay Estimation
6.5.4 Flattening Point Estimation
6.6 Simulation Results
6.6.1 Experimental Setup
6.6.2 Variation of Convergence Rate on β
6.6.3 Degradation Due to Direct-path Estimation
6.6.4 Comparison of Algorithm Performance Using a WGN Input Signal
6.6.5 Comparison of Algorithm Performance Using Speech Input Signals
6.7 Conclusions
References
chapter 7 Subband Inversion of Multichannel Acoustic Systems.pdf
7 Subband Inversion of Multichannel Acoustic Systems
7.1 Introduction
7.2 Multichannel Equalization
7.3 Equalization with Inexact Impulse Responses
7.3.1 Effects of System Mismatch
7.3.2 Effects of System Length
7.4 Subband Multichannel Equalization
7.4.1 Oversampled Filter-banks
7.4.2 Subband Decomposition
7.4.3 Subband Multichannel Equalization
7.5 Computational Complexity
7.6 Application to Speech Dereverberation
7.7 Simulations and Results
7.7.1 Experiment 1: Complex Subband Decomposition
7.7.2 Experiment 2: Random Channels
7.7.3 Experiment 3: Simulated Room Impulse Responses
7.7.4 Experiment 4: Speech Dereverberation
7.8 Summary
References
chapter8 Bayesian Single Channel Blind Dereverberation of Speech from a Moving Talker.pdf
8 Bayesian Single Channel Blind Dereverberation of Speech from a Moving Talker
8.1 Introduction and Overview
8.1.1 Model-based Framework
8.1.1.1 Online vs. Offline Numerical Methods
8.1.1.2 Parametric Estimation and Optimal Filtering Methods
8.1.2 Practical Blind Dereverberation Scenarios
8.1.2.1 Single-sensor Applications
8.1.2.2 Time-varying Acoustic Channels
8.1.3 Chapter Organisation
8.2 Mathematical Problem Formulation
8.2.1 Bayesian Framework for Blind Dereverberation
8.2.2 Classification of Blind Dereverberation Formulations
8.2.3 Numerical Bayesian Methods
8.2.3.1 Markov Chain Monte Carlo
8.2.3.2 Sequential Monte Carlo
8.2.3.3 General Comments
8.2.4 Identifiability
8.3 Nature of Room Acoustics
8.3.1 Regions of the Audible Spectrum
8.3.2 The Room Transfer Function
8.3.3 Issues with Modelling Room Transfer Functions
Long and Non-minimum Phase AIRs
Robustness to Estimation Error and Variation of Inverse of the AIR
Subband and Frequency-zooming Solutions
8.4 Parametric Channel Models
8.4.1 Pole-zero and All-zero Models
8.4.2 The Common-acoustical Pole and Zero Model
8.4.3 The All-pole Model
8.4.4 Subband All-pole Modelling
8.4.5 The Nature of Time-varying All-pole Models
8.4.6 Static Modelling of TVAP Parameters
8.4.7 Stochastic Modelling of Acoustic Channels
8.5 Noise and System Model
8.6 Source Model
8.6.1 Speech Production
8.6.2 Time-varying AR Modelling of Unvoiced Speech
8.6.2.1 Statistical Nature of Speech Parameter Variation
8.6.3 Static Block-based Modelling of TVAR Parameters
8.6.3.1 Basis Function Representation
8.6.3.2 Choice of Basis Functions
8.6.3.3 Block-based Time-varying Approach
8.6.4 Stochastic Modelling of TVAR Parameters
8.7 Bayesian Blind Dereverberation Algorithms
8.7.1 Offline Processing Using MCMC
8.7.1.1 Likelihood for Source Signal
8.7.1.2 Complete Likelihood for Observations
8.7.1.3 Prior Distributions of Source, Channel and Error Residual
8.7.1.4 Posterior Distribution of the Channel Parameters
8.7.1.5 Experimental Results
8.7.2 Online Processing Using Sequential Monte Carlo
8.7.2.1 Source and Channel Model
8.7.2.2 Conditionally Gaussian State Space
8.7.2.3 Methodology
8.7.2.4 Channel Estimation Using Bayesian Channel Updates
8.7.2.5 Experimental Results
8.7.3 Comparison of Offline and Online Approaches
8.8 Conclusions
References
chapter 9 Inverse Filtering for Speech Dereverberation Without the Use of Room Acoustics Information.pdf
9 Inverse Filtering for Speech Dereverberation Without the Use of Room Acoustics Information
9.1 Introduction
9.2 Inverse Filtering for Speech Dereverberation
9.2.1 Speech Capture Model with Multiple Microphones
9.2.2 Optimal Inverse Filtering
9.2.3 Unsupervised Algorithm to Approximate Optimal Processing
9.3 Approaches to Solving the Over-whitening of the Recovered Speech
9.3.1 Precise Compensation for Over-whitening of Target Speech
9.3.1.1 Principle
9.3.1.2 Close to Perfect Dereverberation
9.3.1.3 Dereverberation and Coherent Noise Reduction
9.3.1.4 Sensitivity to Incoherent Noise
9.3.2 Late Reflection Removal with Multichannel Multistep LP
9.3.2.1 Principle
9.3.2.2 Speech Dereverberation Performance in Terms of ASR Score
9.3.2.3 Speech Dereverberation in a Noisy Environment
9.3.2.4 Dereverberation of Multiple Sound Source Signals
9.3.3 Joint Estimation of Linear Predictors and Short-time Speech Characteristics
9.3.3.1 Background
9.3.3.2 Principle
9.3.3.3 Algorithms
9.3.4 Probabilistic Model Based Speech Dereverberation
9.3.4.1 Probabilistic Speech Model
9.3.4.2 Likelihood Function for Multichannel LP
9.3.4.3 Autocorrelation Codebook-based Speech Dereverberation
9.4 Concluding Remarks
Appendix A
References
chapter 10 TRINICON for Dereverberation of Speech and Audio Signals.pdf
10 TRINICON for Dereverberation of Speech and Audio Signals
10.1 Introduction
10.1.1 Generic Tasks for Blind Adaptive MIMO Filtering
10.1.2 A Compact Matrix Formulation for MIMO Filtering Problems
10.1.3 Overview of this Chapter
10.2 Ideal Inversion Solution and the Direct-inverse Approach to Blind Deconvolution
10.3 Ideal Solution of Direct Adaptive Filtering Problems and the Identification-and-inversion Approach to Blind Deconvolution
10.3.1 Ideal Separation Solution for Two Sources and Two Sensors
10.3.2 Relation to MIMO and SIMO System Identification
10.3.3 Ideal Separation Solution and Optimum Separation Filter Length for an Arbitrary Number of Sources and Sensors
10.3.4 General Scheme for Blind System Identification
10.3.5 Application of Blind System Identification to Blind Deconvolution
10.4 TRINICON – A General Framework for Adaptive MIMO Signal Processing and Application to Blind Adaptation Problems
10.4.1 Matrix Notation for Convolutive Mixtures
10.4.2 Optimization Criterion
10.4.3 Gradient-based Coefficient Update
10.4.3.1 Alternative Formulation of the Gradient-based Coefficient Update
10.4.4 Natural Gradient-based Coefficient Update
10.4.5 Incorporation of Stochastic Source Models
10.4.5.1 Spherically Invariant Random Processes as Signal Model
10.4.5.2 Multivariate Gaussians as Signal Model: Second-order Statistics
10.4.5.3 Nearly Gaussian Densities as Signal Model
10.5 Application of TRINICON to Blind System Identification and the Identification-and-inversion Approach to Blind Deconvolution
10.5.1 Generic Gradient-based Algorithm for Direct Adaptive Filtering Problems
10.5.1.1 Illustration for Second-order Statistics
10.5.2 Realizations for the SIMO Case
10.5.2.1 Coefficient Initialization
10.5.2.2 Efficient Implementation of the Sylvester Constraint for the Special Case of SIMO Models
10.5.3 Efficient Frequency-domain Realizations for the MIMO Case
10.6 Application of TRINICON to the Direct-inverse Approach to Blind Deconvolution
10.6.1 Multichannel Blind Deconvolution
10.6.2 Multichannel Blind Partial Deconvolution
10.6.3 Special Cases and Links to Known Algorithms
10.6.3.1 SIMO vs. MIMO Mixing Systems
10.6.3.2 Efficient Implementation Using the Correlation Method
10.6.3.3 Relations to Some Known HOS Approaches
10.6.3.4 Relations to Some Known SOS Approaches
10.7 Experiments
10.7.1 The SIMO Case
10.7.2 The MIMO Case
10.8 Conclusions
Appendix A: Compact Derivation of the Gradient-based Coefficient Update
Appendix B: Transformation of the Multivariate Output Signal PDF in (10.39) by Blockwise Sylvester Matrix
Appendix C: Polynomial Expansions for Nearly Gaussian Probability Densities
Appendix D: Expansion of the Sylvester Constraints in (10.83)
References