Speech Dereverberation.pdf

The document has 396 pages in total; the remainder can be viewed after download.
FrontMatter.pdf
Preface
Contents
List of Contributors
chapter1 Introduction.pdf
1 Introduction
1.1 Background
1.2 Effects of Reverberation
1.3 Speech Acquisition
1.4 System Description
1.5 Acoustic Impulse Responses
1.6 Literature Overview
1.6.1 Beamforming Using Microphone Arrays
1.6.2 Speech Enhancement Approaches to Dereverberation
1.6.3 Blind System Identification and Inversion
1.6.3.1 Blind System Identification
1.6.3.2 Inverse Filtering
1.7 Outline of the Book
References
chapter2 Models, Measurement and Evaluation.pdf
2 Models, Measurement and Evaluation
2.1 An Overview of Room Acoustics
2.1.1 The Wave Equation
2.1.2 Sound Field in a Reverberant Room
2.1.3 Reverberation Time
2.1.4 The Critical Distance
2.1.5 Analysis of Room Acoustics Dependent on Frequency Range
2.2 Models of Room Reverberation
2.2.1 Intuitive Model
2.2.2 Finite Element Models
2.2.3 Digital Waveguide Mesh
2.2.4 Ray-tracing
2.2.5 Source-image Model
2.2.6 Statistical Room Acoustics
2.3 Subjective Evaluation
2.4 Channel-based Objective Measures
2.4.1 Normalized Projection Misalignment
2.4.2 Direct-to-reverberant Ratio
2.4.3 Early-to-total Sound Energy Ratio
2.4.4 Early-to-late Reverberation Ratio
2.5 Signal-based Objective Measures
2.5.1 Log Spectral Distortion
2.5.2 Bark Spectral Distortion
2.5.3 Reverberation Decay Tail
2.5.4 Signal-to-reverberant Ratio
2.5.4.1 Relationship Between DRR and SRR
2.5.4.2 Level Normalization in SRR
2.5.4.3 SRR Computation Example
2.5.4.4 SRR Summary
2.5.5 Experimental Comparisons
2.6 Dereverberation Performance of the Delay-and-sum Beamformer
2.6.1 Simulation Results: DSB Performance
Experiment 1: Effect of Source-microphone Distance
Experiment 2: Effect of Number of Microphones
2.7 Summary and Discussion
References
chapter3 Speech Dereverberation Using Statistical Reverberation Models.pdf
3 Speech Dereverberation Using Statistical Reverberation Models
3.1 Introduction
3.2 Review of Dereverberation Methods
3.2.1 Reverberation Cancellation
3.2.2 Reverberation Suppression
3.3 Statistical Reverberation Models
3.3.1 Polack’s Statistical Model
3.3.2 Generalized Statistical Model
3.4 Single-microphone Spectral Enhancement
3.4.1 Problem Formulation
3.4.2 MMSE Log-spectral Amplitude Estimator
3.4.3 a priori SIR Estimator
3.5 Multi-microphone Spectral Enhancement
3.5.1 Problem Formulation
3.5.2 Two Multi-microphone Systems
3.5.2.1 MVDR Beamformer and Single-channel MMSE Estimator
3.5.2.2 Non-linear Spatial Processor
3.5.3 Speech Presence Probability Estimator
3.6 Late Reverberant Spectral Variance Estimator
3.7 Estimating Model Parameters
3.7.1 Reverberation Time
3.7.2 Direct-to-reverberant Ratio
3.8 Experimental Results
3.8.1 Using One Microphone
3.8.2 Using Multiple Microphones
3.9 Summary and Outlook
Acknowledgment
References
chapter4 Dereverberation Using LPC-based Approaches.pdf
4 Dereverberation Using LPC-based Approaches
4.1 Introduction
4.2 Linear Predictive Coding of Speech
4.3 LPC on Reverberant Speech
4.3.1 Effects of Reverberation on the LPC Coefficients
4.3.1.1 Single Microphone
4.3.1.2 Joint Multichannel Optimization
4.3.1.3 LPC at the Output of a Delay-and-sum Beamformer
4.3.2 Effects of Reverberation on the Prediction Residual
4.3.3 Simulation Examples for LPC on Reverberant Speech
4.4 Dereverberation Employing LPC
4.4.1 Regional Weighting Function
4.4.2 Weighting Function Based on Hilbert Envelopes
4.4.3 Wavelet Extrema Clustering
4.4.4 Weight Function from Coarse Channel Estimates
4.4.5 Kurtosis Maximizing Adaptive Filter
4.5 Spatiotemporal Averaging Method for Enhancement of Reverberant Speech
4.5.1 Larynx Cycle Segmentation with Multichannel DYPSA
4.5.2 Time Delay of Arrival Estimation for Spatial Averaging
4.5.3 Voiced/Unvoiced/Silence Detection
4.5.4 Weighted Inter-cycle Averaging
4.5.5 Dereverberation Results
4.6 Summary
Appendix A
References
chapter5 Multi-microphone Speech Dereverberation Using Eigen-decomposition.pdf
5 Multi-microphone Speech Dereverberation Using Eigen-decomposition
5.1 Introduction
5.2 Problem Formulation
5.3 Preliminaries
5.4 AIR Estimation – Algorithm Derivation
5.5 Extensions of the Basic Algorithm
5.5.1 Two-microphone Noisy Case
5.5.1.1 White Noise Case
5.5.1.2 Colored Noise Case
5.5.2 Multi-microphone Case (M > 2)
5.5.3 Partial Knowledge of the Null Subspace
5.6 AIR Estimation in Subbands
5.7 Signal Reconstruction
5.8 Experimental Study
5.8.1 Full-band Version – Results
5.8.2 Subband Version – Results
5.9 Limitations of the Proposed Algorithms and Possible Remedies
5.9.1 Noise Robustness
5.9.2 Computational Complexity and Memory Requirements
5.9.3 Common Zeros
5.9.4 The Demand for the Entire AIR Compensation
5.9.5 Filter-bank Design
5.9.6 Gain Ambiguity
5.10 Summary and Conclusions
References
chapter 6 Adaptive Blind Multichannel System Identification.pdf
6 Adaptive Blind Multichannel System Identification
6.1 Introduction
6.2 Problem Formulation
6.2.1 Channel Identifiability Conditions
6.3 Review of Adaptive Algorithms for Acoustic BSI Employing Cross-relations
6.3.1 The Multichannel Least Mean Squares Algorithm
6.3.2 The Normalized Multichannel Frequency Domain LMS Algorithm
6.3.3 The Improved Proportionate NMCFLMS Algorithm
6.4 Effect of Noise on the NMCFLMS Algorithm – The Misconvergence Problem
6.5 The Constraint Based ext-NMCFLMS Algorithm
6.5.1 Effect of Noise on the Cost Function
6.5.2 Penalty Term Using the Direct-path Constraint
6.5.3 Delay Estimation
6.5.4 Flattening Point Estimation
6.6 Simulation Results
6.6.1 Experimental Setup
6.6.2 Variation of Convergence Rate on β
6.6.3 Degradation Due to Direct-path Estimation
6.6.4 Comparison of Algorithm Performance Using a WGN Input Signal
6.6.5 Comparison of Algorithm Performance Using Speech Input Signals
6.7 Conclusions
References
chapter 7 Subband Inversion of Multichannel Acoustic Systems.pdf
7 Subband Inversion of Multichannel Acoustic Systems
7.1 Introduction
7.2 Multichannel Equalization
7.3 Equalization with Inexact Impulse Responses
7.3.1 Effects of System Mismatch
7.3.2 Effects of System Length
7.4 Subband Multichannel Equalization
7.4.1 Oversampled Filter-banks
7.4.2 Subband Decomposition
7.4.3 Subband Multichannel Equalization
7.5 Computational Complexity
7.6 Application to Speech Dereverberation
7.7 Simulations and Results
7.7.1 Experiment 1: Complex Subband Decomposition
7.7.2 Experiment 2: Random Channels
7.7.3 Experiment 3: Simulated Room Impulse Responses
7.7.4 Experiment 4: Speech Dereverberation
7.8 Summary
References
chapter8 Bayesian Single Channel Blind Dereverberation of Speech from a Moving Talker.pdf
8 Bayesian Single Channel Blind Dereverberation of Speech from a Moving Talker
8.1 Introduction and Overview
8.1.1 Model-based Framework
8.1.1.1 Online vs. Offline Numerical Methods
8.1.1.2 Parametric Estimation and Optimal Filtering Methods
8.1.2 Practical Blind Dereverberation Scenarios
8.1.2.1 Single-sensor Applications
8.1.2.2 Time-varying Acoustic Channels
8.1.3 Chapter Organisation
8.2 Mathematical Problem Formulation
8.2.1 Bayesian Framework for Blind Dereverberation
8.2.2 Classification of Blind Dereverberation Formulations
8.2.3 Numerical Bayesian Methods
8.2.3.1 Markov Chain Monte Carlo
8.2.3.2 Sequential Monte Carlo
8.2.3.3 General Comments
8.2.4 Identifiability
8.3 Nature of Room Acoustics
8.3.1 Regions of the Audible Spectrum
8.3.2 The Room Transfer Function
8.3.3 Issues with Modelling Room Transfer Functions
Long and Non-minimum Phase AIRs
Robustness to Estimation Error and Variation of Inverse of the AIR
Subband and Frequency-zooming Solu
8.4 Parametric Channel Models
8.4.1 Pole-zero and All-zero Models
8.4.2 The Common-acoustical Pole and Zero Model
8.4.3 The All-pole Model
8.4.4 Subband All-pole Modelling
8.4.5 The Nature of Time-varying All-pole Models
8.4.6 Static Modelling of TVAP Parameters
8.4.7 Stochastic Modelling of Acoustic Channels
8.5 Noise and System Model
8.6 Source Model
8.6.1 Speech Production
8.6.2 Time-varying AR Modelling of Unvoiced Speech
8.6.2.1 Statistical Nature of Speech Parameter Variation
8.6.3 Static Block-based Modelling of TVAR Parameters
8.6.3.1 Basis Function Representation
8.6.3.2 Choice of Basis Functions
8.6.3.3 Block-based Time-varying Approach
8.6.4 Stochastic Modelling of TVAR Parameters
8.7 Bayesian Blind Dereverberation Algorithms
8.7.1 Offline Processing Using MCMC
8.7.1.1 Likelihood for Source Signal
8.7.1.2 Complete Likelihood for Observations
8.7.1.3 Prior Distributions of Source, Channel and Error Residual
8.7.1.4 Posterior Distribution of the Channel Parameters
8.7.1.5 Experimental Results
8.7.2 Online Processing Using Sequential Monte Carlo
8.7.2.1 Source and Channel Model
8.7.2.2 Conditionally Gaussian State Space
8.7.2.3 Methodology
8.7.2.4 Channel Estimation Using Bayesian Channel Updates
8.7.2.5 Experimental Results
8.7.3 Comparison of Offline and Online Approaches
8.8 Conclusions
References
chapter 9 Inverse Filtering for Speech Dereverberation Without the Use of Room Acoustics Information.pdf
9 Inverse Filtering for Speech Dereverberation Without the Use of Room Acoustics Information
9.1 Introduction
9.2 Inverse Filtering for Speech Dereverberation
9.2.1 Speech Capture Model with Multiple Microphones
9.2.2 Optimal Inverse Filtering
9.2.3 Unsupervised Algorithm to Approximate Optimal Processing
9.3 Approaches to Solving the Over-whitening of the Recovered Speech
9.3.1 Precise Compensation for Over-whitening of Target Speech
9.3.1.1 Principle
9.3.1.2 Close to Perfect Dereverberation
9.3.1.3 Dereverberation and Coherent Noise Reduction
9.3.1.4 Sensitivity to Incoherent N
9.3.2 Late Reflection Removal with Multichannel Multistep LP
9.3.2.1 Principle
9.3.2.2 Speech Dereverberation Performance in Terms of ASR Score
9.3.2.3 Speech Dereverberation in a Noisy Environment
9.3.2.4 Dereverberation of Multiple Sound Source Signals
9.3.3 Joint Estimation of Linear Predictors and Short-time Speech Characteristics
9.3.3.1 Background
9.3.3.2 Principle
9.3.3.3 Algorithms
9.3.4 Probabilistic Model Based Speech Dereverberation
9.3.4.1 Probabilistic Speech Model
9.3.4.2 Likelihood Function for Multichannel LP
9.3.4.3 Autocorrelation Codebook-based Speech Dereverberation
9.4 Concluding Remarks
Appendix A
References
chapter 10 TRINICON for Dereverberation of Speech and Audio Signals.pdf
10 TRINICON for Dereverberation of Speech and Audio Signals
10.1 Introduction
10.1.1 Generic Tasks for Blind Adaptive MIMO Filtering
10.1.2 A Compact Matrix Formulation for MIMO Filtering Problems
10.1.3 Overview of this Chapter
10.2 Ideal Inversion Solution and the Direct-inverse Approach to Blind Deconvolution
10.3 Ideal Solution of Direct Adaptive Filtering Problems and the Identification-and-inversion Approach to Blind Deconvolution
10.3.1 Ideal Separation Solution for Two Sources and Two Sensors
10.3.2 Relation to MIMO and SIMO System Identification
10.3.3 Ideal Separation Solution and Optimum Separation Filter Length for an Arbitrary Number of Sources and Sensors
10.3.4 General Scheme for Blind System Identification
10.3.5 Application of Blind System Identification to Blind Deconvolution
10.4 TRINICON – A General Framework for Adaptive MIMO Signal Processing and Application to Blind Adaptation Problems
10.4.1 Matrix Notation for Convolutive Mixtures
10.4.2 Optimization Criterion
10.4.3 Gradient-based Coefficient Update
10.4.3.1 Alternative Formulation of the Gradient-based Coefficient Update
10.4.4 Natural Gradient-based Coefficient Update
10.4.5 Incorporation of Stochastic Source Models
10.4.5.1 Spherically Invariant Random Processes as Signal Model
10.4.5.2 Multivariate Gaussians as Signal Model: Second-order Statistics
10.4.5.3 Nearly Gaussian Densities as Signal Model
10.5 Application of TRINICON to Blind System Identification and the Identification-and-inversion Approach to Blind Deconvolution
10.5.1 Generic Gradient-based Algorithm for Direct Adaptive Filtering Problems
10.5.1.1 Illustration for Second-order Statistics
10.5.2 Realizations for the SIMO Case
10.5.2.1 Coefficient Initialization
10.5.2.2 Efficient Implementation of the Sylvester Constraint for the Special Case of SIMO Models
10.5.3 Efficient Frequency-domain Realizations for the MIMO Case
10.6 Application of TRINICON to the Direct-inverse Approach to Blind Deconvolution
10.6.1 Multichannel Blind Deconvolution
10.6.2 Multichannel Blind Partial Deconvolution
10.6.3 Special Cases and Links to Known Algorithms
10.6.3.1 SIMO vs. MIMO Mixing Systems
10.6.3.2 Efficient Implementation Using the Correlation Method
10.6.3.3 Relations to Some Known HOS Approaches
10.6.3.4 Relations to Some Known SOS Approaches
10.7 Experiments
10.7.1 The SIMO Case
10.7.2 The MIMO Case
10.8 Conclusions
Appendix A: Compact Derivation of the Gradient-based Coefficient Update
Appendix B: Transformation of the Multivariate Output Signal PDF in (10.39) by Blockwise Sylvester Matrix
Appendix C: Polynomial Expansions for Nearly Gaussian Probability Densities
Appendix D: Expansion of the Sylvester Constraints in (10.83)
References
Signals and Communication Technology
Patrick A. Naylor · Nikolay D. Gaubitch, Editors
Speech Dereverberation
Patrick A. Naylor, PhD
Nikolay D. Gaubitch, PhD
Department of Electrical and Electronic Engineering
Imperial College London
Exhibition Road
London SW7 2AZ
United Kingdom
p.naylor@imperial.ac.uk
ndg@imperial.ac.uk
ISSN 1860-4862
ISBN 978-1-84996-055-7
e-ISBN 978-1-84996-056-4
DOI 10.1007/978-1-84996-056-4
Springer London Dordrecht Heidelberg New York
British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library
Library of Congress Control Number: 2010930018
© Springer-Verlag London Limited 2010
Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms of licences issued by the Copyright Licensing Agency. Enquiries concerning reproduction outside those terms should be sent to the publishers.
The use of registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant laws and regulations and therefore free for general use.
The publisher makes no representation, express or implied, with regard to the accuracy of the information contained in this book and cannot accept any legal responsibility or liability for any errors or omissions that may be made.
Cover design: WMXDesign, Heidelberg, Germany
Printed on acid-free paper
Springer is part of Springer Science+Business Media (www.springer.com)
This book owes its existence to the numerous students and co-workers who, over the years, have worked with me and inspired me. I gratefully acknowledge their contributions. I dedicate this book to my wife Catharine who has taught me so much.
To my mother.
Patrick A. Naylor
Nikolay D. Gaubitch
Preface
Speech dereverberation has been on the agenda of the signal processing community for several years. It is only in the last decade, however, that the topic has really taken off, as seen from the growing number of publications appearing in journals and at conferences. One reason the topic has become more popular is the rapidly growing availability in the marketplace of computationally capable mobile devices, such as phones, PDAs and laptop computers, for which hands-free (distant talking) operation is desirable. This is all the more significant when seen in the context of the confluence of computing and communication terminals exploiting low-cost VoIP-enabled telephony applications. Additionally, user expectations of computing and communication devices are a strongly increasing function of time, perhaps only moderated by considerations of value versus cost: people are more forgiving of technology limitations if they are not paying (much) for the service they are employing. Factors such as these have combined to motivate the signal processing community to provide robust solutions for speech enhancement in general and to work, in particular, on what for many is a new task: dereverberation.
Since we began our research in this field, we have been receiving inquiries from curious researchers seeking a digestible review of the state of the art in the field of speech dereverberation. Until now, the answer has always been that, although there have been several books that treat the subjects of speech processing, microphone array processing, and audio processing, which have included chapters on speech dereverberation, there has not been a publication that gives a comprehensive overview of the topic. We believe that the field has now reached a maturity that allows the compilation of such a book, solely dedicated to the topic of speech dereverberation.
It was this belief and the context of the situation that motivated our initiative in this writing project.
Before you decide to skip the rest of this Preface on the grounds that its authors have lost their grip on reality, let us briefly clarify the level of maturity to which we are referring. The three main axes of speech enhancement were highlighted by Walter Kellermann at the 1999 International Workshop on Acoustic Echo and Noise Control to be echo cancellation, noise reduction and dereverberation. Of these three, the likely consensus is that dereverberation is the most difficult task. Modelling of room acoustics is more complicated than either the modelling of speech production or of noise generation processes and their additive combination with speech signals, at least in the manner in which such models are currently applied in DSP algorithm development. Dereverberation is also normally formulated as a blind (or unsupervised) problem, somewhat related to, but nevertheless distinct from, blind source separation. Computational limitations both in power and precision also present real challenges in this field. So, it is inevitable that, given the difficulty of the problem and the fact that attention on this problem has not been strongly focused for as long as it has on either echo cancellation or noise reduction, the level of maturity in the understanding of the dereverberation problem and its solutions is far below that of the other related problems. At this stage of dereverberation technology, we could argue that there are more open questions than solutions; those solutions that are available strive towards, but do not always achieve, the levels of robustness found in many of the more mature technologies.
This book, therefore, by no means offers any ultimate solutions to the speech dereverberation problem. Nonetheless, it aims to provide an in-depth overview of the state of the art in speech dereverberation methods, with contributing chapters from some of the key researchers in the field. It also gives what we believe to be a valuable introduction to some topics relevant to dereverberation, such as room acoustics and psychoacoustics, though we have not aimed to cover these subjects in detail.
The book is aimed at researchers and graduate students who would like to pursue research in this field, giving an accessible introduction to the topic including numerous references to other publications. It could make an excellent complementary text for postgraduate courses in speech processing.
However, it is not exclusively limited to this group of readers. The attempt to solve the very difficult problem of speech dereverberation has involved a large variety of signal processing tools. Such tools include multirate signal processing, adaptive filtering, Bayesian inference, and linear prediction, to mention but a few. These techniques can also be applied usefully in other fields of engineering.
We would like to thank all the contributing authors for their excellent chapters. We would also like to express our special thanks to Dr. Emanuël Habets for his careful reading of our drafts and for his helpful discussions and contributions throughout this project.
London, February 2009
Patrick A. Naylor
Nikolay D. Gaubitch
Contents
1 Introduction
Patrick A. Naylor and Nikolay D. Gaubitch
1.1 Background
1.2 Effects of Reverberation
1.3 Speech Acquisition
1.4 System Description
1.5 Acoustic Impulse Responses
1.6 Literature Overview
1.6.1 Beamforming Using Microphone Arrays
1.6.2 Speech Enhancement Approaches to Dereverberation
1.6.3 Blind System Identification and Inversion
1.7 Outline of the Book
References
2 Models, Measurement and Evaluation
Patrick A. Naylor, Emanuël A.P. Habets, Jimi Y.-C. Wen, and Nikolay D. Gaubitch
2.1 An Overview of Room Acoustics
2.1.1 The Wave Equation
2.1.2 Sound Field in a Reverberant Room
2.1.3 Reverberation Time
2.1.4 The Critical Distance
2.1.5 Analysis of Room Acoustics Dependent on Frequency Range
2.2 Models of Room Reverberation
2.2.1 Intuitive Model
2.2.2 Finite Element Models
2.2.3 Digital Waveguide Mesh
2.2.4 Ray-tracing
2.2.5 Source-image Model
2.2.6 Statistical Room Acoustics
2.3 Subjective Evaluation
2.4 Channel-based Objective Measures
2.4.1 Normalized Projection Misalignment
2.4.2 Direct-to-reverberant Ratio
2.4.3 Early-to-total Sound Energy Ratio
2.4.4 Early-to-late Reverberation Ratio
2.5 Signal-based Objective Measures
2.5.1 Log Spectral Distortion
2.5.2 Bark Spectral Distortion
2.5.3 Reverberation Decay Tail
2.5.4 Signal-to-reverberant Ratio
2.5.5 Experimental Comparisons
2.6 Dereverberation Performance of the Delay-and-sum Beamformer
2.6.1 Simulation Results: DSB Performance
2.7 Summary and Discussion
References
3 Speech Dereverberation Using Statistical Reverberation Models
Emanuël A.P. Habets
3.1 Introduction
3.2 Review of Dereverberation Methods
3.2.1 Reverberation Cancellation
3.2.2 Reverberation Suppression
3.3 Statistical Reverberation Models
3.3.1 Polack’s Statistical Model
3.3.2 Generalized Statistical Model
3.4 Single-microphone Spectral Enhancement
3.4.1 Problem Formulation
3.4.2 MMSE Log-spectral Amplitude Estimator
3.4.3 a priori SIR Estimator
3.5 Multi-microphone Spectral Enhancement
3.5.1 Problem Formulation
3.5.2 Two Multi-microphone Systems
3.5.3 Speech Presence Probability Estimator
3.6 Late Reverberant Spectral Variance Estimator
3.7 Estimating Model Parameters
3.7.1 Reverberation Time
3.7.2 Direct-to-reverberant Ratio
3.8 Experimental Results
3.8.1 Using One Microphone
3.8.2 Using Multiple Microphones
3.9 Summary and Outlook
References