Speech Dereverberation.pdf

Published: 2022-06-08 | Uploaded by: admin | Category: Manuals | File size: 11.05 MB | Format: PDF

FrontMatter.pdf

Preface

Contents

List of Contributors

chapter1 Introduction.pdf

1 Introduction

1.1 Background

1.2 Effects of Reverberation

1.3 Speech Acquisition

1.4 System Description

1.5 Acoustic Impulse Responses

1.6 Literature Overview

1.6.1 Beamforming Using Microphone Arrays

1.6.2 Speech Enhancement Approaches to Dereverberation

1.6.3 Blind System Identification and Inversion

1.6.3.1 Blind System Identification

1.6.3.2 Inverse Filtering

1.7 Outline of the Book

References

chapter2 Models, Measurement and Evaluation.pdf

2 Models, Measurement and Evaluation

2.1 An Overview of Room Acoustics

2.1.1 The Wave Equation

2.1.2 Sound Field in a Reverberant Room

2.1.3 Reverberation Time

2.1.4 The Critical Distance

2.1.5 Analysis of Room Acoustics Dependent on Frequency Range

2.2 Models of Room Reverberation

2.2.1 Intuitive Model

2.2.2 Finite Element Models

2.2.3 Digital Waveguide Mesh

2.2.4 Ray-tracing

2.2.5 Source-image Model

2.2.6 Statistical Room Acoustics

2.3 Subjective Evaluation

2.4 Channel-based Objective Measures

2.4.1 Normalized Projection Misalignment

2.4.2 Direct-to-reverberant Ratio

2.4.3 Early-to-total Sound Energy Ratio

2.4.4 Early-to-late Reverberation Ratio

2.5 Signal-based Objective Measures

2.5.1 Log Spectral Distortion

2.5.2 Bark Spectral Distortion

2.5.3 Reverberation Decay Tail

2.5.4 Signal-to-reverberant Ratio

2.5.4.1 Relationship Between DRR and SRR

2.5.4.2 Level Normalization in SRR

2.5.4.3 SRR Computation Example

2.5.4.4 SRR Summary

2.5.5 Experimental Comparisons

2.6 Dereverberation Performance of the Delay-and-sum Beamformer

2.6.1 Simulation Results: DSB Performance

Experiment 1: Effect of Source-microphone Distance

Experiment 2: Effect of Number of Microphones

2.7 Summary and Discussion

References

chapter3 Speech Dereverberation Using Statistical Reverberation Models.pdf

3 Speech Dereverberation Using Statistical Reverberation Models

3.1 Introduction

3.2 Review of Dereverberation Methods

3.2.1 Reverberation Cancellation

3.2.2 Reverberation Suppression

3.3 Statistical Reverberation Models

3.3.1 Polack’s Statistical Model

3.3.2 Generalized Statistical Model

3.4 Single-microphone Spectral Enhancement

3.4.1 Problem Formulation

3.4.2 MMSE Log-spectral Amplitude Estimator

3.4.3 a priori SIR Estimator

3.5 Multi-microphone Spectral Enhancement

3.5.1 Problem Formulation

3.5.2 Two Multi-microphone Systems

3.5.2.1 MVDR Beamformer and Single-channel MMSE Estimator

3.5.2.2 Non-linear Spatial Processor

3.5.3 Speech Presence Probability Estimator

3.6 Late Reverberant Spectral Variance Estimator

3.7 Estimating Model Parameters

3.7.1 Reverberation Time

3.7.2 Direct-to-reverberant Ratio

3.8 Experimental Results

3.8.1 Using One Microphone

3.8.2 Using Multiple Microphones

3.9 Summary and Outlook

Acknowledgment

References

chapter4 Dereverberation Using LPC-based Approaches.pdf

4 Dereverberation Using LPC-based Approaches

4.1 Introduction

4.2 Linear Predictive Coding of Speech

4.3 LPC on Reverberant Speech

4.3.1 Effects of Reverberation on the LPC Coefficients

4.3.1.1 Single Microphone

4.3.1.2 Joint Multichannel Optimization

4.3.1.3 LPC at the Output of a Delay-and-sum Beamformer

4.3.2 Effects of Reverberation on the Prediction Residual

4.3.3 Simulation Examples for LPC on Reverberant Speech

4.4 Dereverberation Employing LPC

4.4.1 Regional Weighting Function

4.4.2 Weighting Function Based on Hilbert Envelopes

4.4.3 Wavelet Extrema Clustering

4.4.4 Weight Function from Coarse Channel Estimates

4.4.5 Kurtosis Maximizing Adaptive Filter

4.5 Spatiotemporal Averaging Method for Enhancement of Reverberant Speech

4.5.1 Larynx Cycle Segmentation with Multichannel DYPSA

4.5.2 Time Delay of Arrival Estimation for Spatial Averaging

4.5.3 Voiced/Unvoiced/Silence Detection

4.5.4 Weighted Inter-cycle Averaging

4.5.5 Dereverberation Results

4.6 Summary

Appendix A

References

chapter5 Multi-microphone Speech Dereverberation Using Eigen-decomposition.pdf

5 Multi-microphone Speech Dereverberation Using Eigen-decomposition

5.1 Introduction

5.2 Problem Formulation

5.3 Preliminaries

5.4 AIR Estimation – Algorithm Derivation

5.5 Extensions of the Basic Algorithm

5.5.1 Two-microphone Noisy Case

5.5.1.1 White Noise Case

5.5.1.2 Colored Noise Case

5.5.2 Multi-microphone Case (M > 2)

5.5.3 Partial Knowledge of the Null Subspace

5.6 AIR Estimation in Subbands

5.7 Signal Reconstruction

5.8 Experimental Study

5.8.1 Full-band Version – Results

5.8.2 Subband Version – Results

5.9 Limitations of the Proposed Algorithms and Possible Remedies

5.9.1 Noise Robustness

5.9.2 Computational Complexity and Memory Requirements

5.9.3 Common Zeros

5.9.4 The Demand for the Entire AIR Compensation

5.9.5 Filter-bank Design

5.9.6 Gain Ambiguity

5.10 Summary and Conclusions

References

chapter 6 Adaptive Blind Multichannel System Identification.pdf

6 Adaptive Blind Multichannel System Identification

6.1 Introduction

6.2 Problem Formulation

6.2.1 Channel Identifiability Conditions

6.3 Review of Adaptive Algorithms for Acoustic BSI Employing Cross-relations

6.3.1 The Multichannel Least Mean Squares Algorithm

6.3.2 The Normalized Multichannel Frequency Domain LMS Algorithm

6.3.3 The Improved Proportionate NMCFLMS Algorithm

6.4 Effect of Noise on the NMCFLMS Algorithm – The Misconvergence Problem

6.5 The Constraint Based ext-NMCFLMS Algorithm

6.5.1 Effect of Noise on the Cost Function

6.5.2 Penalty Term Using the Direct-path Constraint

6.5.3 Delay Estimation

6.5.4 Flattening Point Estimation

6.6 Simulation Results

6.6.1 Experimental Setup

6.6.2 Variation of Convergence rate on β

6.6.3 Degradation Due to Direct-path Estimation

6.6.4 Comparison of Algorithm Performance Using a WGN Input Signal

6.6.5 Comparison of Algorithm Performance Using Speech Input Signals

6.7 Conclusions

References

chapter 7 Subband Inversion of Multichannel Acoustic Systems.pdf

7 Subband Inversion of Multichannel Acoustic Systems

7.1 Introduction

7.2 Multichannel Equalization

7.3 Equalization with Inexact Impulse Responses

7.3.1 Effects of System Mismatch

7.3.2 Effects of System Length

7.4 Subband Multichannel Equalization

7.4.1 Oversampled Filter-banks

7.4.2 Subband Decomposition

7.4.3 Subband Multichannel Equalization

7.5 Computational Complexity

7.6 Application to Speech Dereverberation

7.7 Simulations and Results

7.7.1 Experiment 1: Complex Subband Decomposition

7.7.2 Experiment 2: Random Channels

7.7.3 Experiment 3: Simulated Room Impulse Responses

7.7.4 Experiment 4: Speech Dereverberation

7.8 Summary

References

chapter8 Bayesian Single Channel Blind Dereverberation of Speech from a Moving Talker.pdf

8 Bayesian Single Channel Blind Dereverberation of Speech from a Moving Talker

8.1 Introduction and Overview

8.1.1 Model-based Framework

8.1.1.1 Online vs. Offline Numerical Methods

8.1.1.2 Parametric Estimation and Optimal Filtering methods

8.1.2 Practical Blind Dereverberation Scenarios

8.1.2.1 Single-sensor Applications

8.1.2.2 Time-varying Acoustic Channels

8.1.3 Chapter Organisation

8.2 Mathematical Problem Formulation

8.2.1 Bayesian Framework for Blind Dereverberation

8.2.2 Classification of Blind Dereverberation Formulations

8.2.3 Numerical Bayesian Methods

8.2.3.1 Markov Chain Monte Carlo

8.2.3.2 Sequential Monte Carlo

8.2.3.3 General Comments

8.2.4 Identifiability

8.3 Nature of Room Acoustics

8.3.1 Regions of the Audible Spectrum

8.3.2 The Room Transfer Function

8.3.3 Issues with Modelling Room Transfer Functions

Long and Non-minimum Phase AIRs

Robustness to Estimation Error and Variation of Inverse of the AIR

Subband and Frequency-zooming Solu

8.4 Parametric Channel Models

8.4.1 Pole-zero and All-zero Models

8.4.2 The Common-acoustical Pole and Zero Model

8.4.3 The All-pole Model

8.4.4 Subband All-pole Modelling

8.4.5 The Nature of Time-varying All-pole Models

8.4.6 Static Modelling of TVAP Parameters

8.4.7 Stochastic Modelling of Acoustic Channels

8.5 Noise and System Model

8.6 Source Model

8.6.1 Speech Production

8.6.2 Time-varying AR Modelling of Unvoiced Speech

8.6.2.1 Statistical Nature of Speech Parameter Variation

8.6.3 Static Block-based Modelling of TVAR Parameters

8.6.3.1 Basis Function Representation

8.6.3.2 Choice of Basis Functions

8.6.3.3 Block-based Time-varying Approach

8.6.4 Stochastic Modelling of TVAR Parameters

8.7 Bayesian Blind Dereverberation Algorithms

8.7.1 Offline Processing Using MCMC

8.7.1.1 Likelihood for Source Signal

8.7.1.2 Complete Likelihood for Observations

8.7.1.3 Prior Distributions of Source, Channel and Error Residual

8.7.1.4 Posterior Distribution of the Channel Parameters

8.7.1.5 Experimental Results

8.7.2 Online Processing Using Sequential Monte Carlo

8.7.2.1 Source and Channel Model

8.7.2.2 Conditionally Gaussian State Space

8.7.2.3 Methodology

8.7.2.4 Channel Estimation Using Bayesian Channel Updates

8.7.2.5 Experimental Results

8.7.3 Comparison of Offline and Online Approaches

8.8 Conclusions

References

chapter 9 Inverse Filtering for Speech Dereverberation Without the Use of Room Acoustics Information.pdf

9 Inverse Filtering for Speech Dereverberation Without the Use of Room Acoustics Information

9.1 Introduction

9.2 Inverse Filtering for Speech Dereverberation

9.2.1 Speech Capture Model with Multiple Microphones

9.2.2 Optimal Inverse Filtering

9.2.3 Unsupervised Algorithm to Approximate Optimal Processing

9.3 Approaches to Solving the Over-whitening of the Recovered Speech

9.3.1 Precise Compensation for Over-whitening of Target Speech

9.3.1.1 Principle

9.3.1.2 Close to Perfect Dereverberation

9.3.1.3 Dereverberation and Coherent Noise Reduction

9.3.1.4 Sensitivity to Incoherent N

9.3.2 Late Reflection Removal with Multichannel Multistep LP

9.3.2.1 Principle

9.3.2.2 Speech Dereverberation Performance in Terms of ASR Score

9.3.2.3 Speech Dereverberation in a Noisy Environment

9.3.2.4 Dereverberation of Multiple Sound Source Signals

9.3.3 Joint Estimation of Linear Predictors and Short-time Speech Characteristics

9.3.3.1 Background

9.3.3.2 Principle

9.3.3.3 Algorithms

9.3.4 Probabilistic Model Based Speech Dereverberation

9.3.4.1 Probabilistic Speech Model

9.3.4.2 Likelihood Function for Multichannel LP

9.3.4.3 Autocorrelation Codebook-based Speech Dereverberation

9.4 Concluding Remarks

Appendix A

References

chapter 10 TRINICON for Dereverberation of Speech and Audio Signals.pdf

10 TRINICON for Dereverberation of Speech and Audio Signals

10.1 Introduction

10.1.1 Generic Tasks for Blind Adaptive MIMO Filtering

10.1.2 A Compact Matrix Formulation for MIMO Filtering Problems

10.1.3 Overview of this Chapter

10.2 Ideal Inversion Solution and the Direct-inverse Approach to Blind Deconvolution

10.3 Ideal Solution of Direct Adaptive Filtering Problems and the Identification-and-inversion Approach to Blind Deconvolution

10.3.1 Ideal Separation Solution for Two Sources and Two Sensors

10.3.2 Relation to MIMO and SIMO System Identification

10.3.3 Ideal Separation Solution and Optimum Separation Filter Length for an Arbitrary Number of Sources and Sensors

10.3.4 General Scheme for Blind System Identification

10.3.5 Application of Blind System Identification to Blind Deconvolution

10.4 TRINICON – A General Framework for Adaptive MIMO Signal Processing and Application to Blind Adaptation Problems

10.4.1 Matrix Notation for Convolutive Mixtures

10.4.2 Optimization Criterion

10.4.3 Gradient-based Coefficient Update

10.4.3.1 Alternative Formulation of the Gradient-based Coefficient Update

10.4.4 Natural Gradient-based Coefficient Update

10.4.5 Incorporation of Stochastic Source Models

10.4.5.1 Spherically Invariant Random Processes as Signal Model

10.4.5.2 Multivariate Gaussians as Signal Model: Second-order Statistics

10.4.5.3 Nearly Gaussian Densities as Signal Model

10.5 Application of TRINICON to Blind System Identification and the Identification-and-inversion Approach to Blind Deconvolution

10.5.1 Generic Gradient-based Algorithm for Direct Adaptive Filtering Problems

10.5.1.1 Illustration for Second-order Statistics

10.5.2 Realizations for the SIMO Case

10.5.2.1 Coefficient Initialization

10.5.2.2 Efficient Implementation of the Sylvester Constraint for the Special Case of SIMO Models

10.5.3 Efficient Frequency-domain Realizations for the MIMO Case

10.6 Application of TRINICON to the Direct-inverse Approach to Blind Deconvolution

10.6.1 Multichannel Blind Deconvolution

10.6.2 Multichannel Blind Partial Deconvolution

10.6.3 Special Cases and Links to Known Algorithms

10.6.3.1 SIMO vs. MIMO Mixing Systems

10.6.3.2 Efficient Implementation Using the Correlation Method

10.6.3.3 Relations to Some Known HOS Approaches

10.6.3.4 Relations to Some Known SOS Approaches

10.7 Experiments

10.7.1 The SIMO Case

10.7.2 The MIMO Case

10.8 Conclusions

Appendix A: Compact Derivation of the Gradient-based Coefficient Update

Appendix B: Transformation of the Multivariate Output Signal PDF in (10.39) by Blockwise Sylvester Matrix

Appendix C: Polynomial Expansions for Nearly Gaussian Probability Densities

Appendix D: Expansion of the Sylvester Constraints in (10.83)

References

Signals and Communication Technology

Patrick A. Naylor · Nikolay D. Gaubitch
Editors

Speech Dereverberation

Patrick A. Naylor, PhD
Nikolay D. Gaubitch, PhD
Department of Electrical and Electronic Engineering
Imperial College London
Exhibition Road
London SW7 2AZ
United Kingdom
p.naylor@imperial.ac.uk
ndg@imperial.ac.uk

e-ISBN 978-1-84996-056-4
ISSN 1860-4862
ISBN 978-1-84996-055-7
DOI 10.1007/978-1-84996-056-4

Springer London Dordrecht Heidelberg New York

British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library

Library of Congress Control Number: 2010930018

© Springer-Verlag London Limited 2010

Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms of licences issued by the Copyright Licensing Agency. Enquiries concerning reproduction outside those terms should be sent to the publishers.

The use of registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant laws and regulations and therefore free for general use.

The publisher makes no representation, express or implied, with regard to the accuracy of the information contained in this book and cannot accept any legal responsibility or liability for any errors or omissions that may be made.

Cover design: WMXDesign, Heidelberg, Germany

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)

This book owes its existence to the numerous students and co-workers who, over the years, have worked with me and inspired me. I gratefully acknowledge their contributions. I dedicate this book to my wife Catharine who has taught me so much.

To my mother.

Patrick A. Naylor
Nikolay D. Gaubitch

Preface

Speech dereverberation has been on the agenda of the signal processing community for several years. It is only in the last decade, however, that the topic has really taken off, as seen from the growing number of publications appearing in the journals and at conferences. One of the reasons that the topic has become more popular is the rapidly growing availability in the marketplace of computationally capable mobile devices, such as phones, PDAs and laptop computers, for which hands-free (distant talking) operation is desirable. This is all the more significant when seen in the context of the confluence of computing and communication terminals exploiting low-cost VoIP-enabled telephony applications. Additionally, it is also true to say that user expectations of computing and communication devices are a strongly increasing function of time, perhaps only moderated by considerations of value versus cost; people are more forgiving of technology limitations if they are not paying (much) for the service they are employing. Factors such as these have combined to motivate the signal processing community to provide robust solutions for speech enhancement in general and to work in particular on what for many is a new task: dereverberation.

Since we began our research in this field, we have been receiving inquiries from curious researchers seeking a digestible review on the state-of-the-art in the field of speech dereverberation. Until now, the answer has always been that, although there have been several books that treat the subject of speech processing, microphone array processing, and audio processing, which have included chapters on speech dereverberation, there has not been a publication that gives a comprehensive overview of the topic. We believe that the field has now reached a maturity that allows the compilation of such a book, solely dedicated to the topic of speech dereverberation. It was this belief and the context of the situation that motivated our initiative in this writing project.

Before you decide to skip the rest of this Preface on the grounds that its authors have lost their grip on reality, let us momentarily clarify the level of maturity to which we are referring. The three main axes of speech enhancement were highlighted by Walter Kellermann at the 1999 International Workshop on Acoustic Echo and Noise Control to be echo cancellation, noise reduction and dereverberation. Of these three it is a likely consensus that dereverberation is the most difficult task. Modelling of room acoustics is more complicated than either the modelling of speech production or of noise generation processes and their additive combination with speech signals, at least in the manner in which such models are currently applied in DSP algorithm development. Dereverberation is also normally formulated as a blind (or unsupervised) problem, somewhat related to, but nevertheless distinct from, blind source separation. Computational limitations both in power and precision also present real challenges in this field. So, it is inevitable that, given the difficulty of the problem and the fact that attention on this problem has not been strongly focused for as long as it has on either echo cancellation or noise reduction, the level of maturity in the understanding of the dereverberation problem and its solutions is far below that of the other related problems. At this stage of dereverberation technology, we could argue that there are more open questions than solutions; those solutions that are available strive towards, but do not always achieve, the levels of robustness found in many of the more mature technologies.

This book, therefore, by no means offers any ultimate solutions to the speech dereverberation problem. Nonetheless, it aims to provide an in-depth overview of the state-of-the-art in speech dereverberation methods with contributing chapters from some of the key researchers in the field. It also gives what we believe to be a valuable introduction to some topics relevant to dereverberation, such as room acoustics and psychoacoustics, though we have not aimed to cover these subjects in detail.

The book is aimed at researchers and graduate students who would like to pursue research in this field, giving an accessible introduction to the topic including numerous references to other publications. It could make an excellent complementary text for postgraduate courses in speech processing. However, it is not exclusively limited to this group of readers. The attempt to solve the very difficult problem of speech dereverberation has involved a large variety of signal processing tools. Such tools include multirate signal processing, adaptive filtering, Bayesian inference, and linear prediction, to mention but a few. The applications of these techniques can be found useful in other fields of engineering.

We would like to thank all the contributing authors for their excellent chapters. We would also like to express our special thanks to Dr. Emanuël Habets for his careful reading of our drafts and for his helpful discussions and contributions throughout this project.

London, February 2009
Patrick A. Naylor
Nikolay D. Gaubitch

Contents

1 Introduction
Patrick A. Naylor and Nikolay D. Gaubitch
1.1 Background
1.2 Effects of Reverberation
1.3 Speech Acquisition
1.4 System Description
1.5 Acoustic Impulse Responses
1.6 Literature Overview
1.6.1 Beamforming Using Microphone Arrays
1.6.2 Speech Enhancement Approaches to Dereverberation
1.6.3 Blind System Identification and Inversion
1.7 Outline of the Book
References

2 Models, Measurement and Evaluation
Patrick A. Naylor, Emanuël A.P. Habets, Jimi Y.-C. Wen, and Nikolay D. Gaubitch
2.1 An Overview of Room Acoustics
2.1.1 The Wave Equation
2.1.2 Sound Field in a Reverberant Room
2.1.3 Reverberation Time
2.1.4 The Critical Distance
2.1.5 Analysis of Room Acoustics Dependent on Frequency Range
2.2 Models of Room Reverberation
2.2.1 Intuitive Model
2.2.2 Finite Element Models
2.2.3 Digital Waveguide Mesh
2.2.4 Ray-tracing
2.2.5 Source-image Model
2.2.6 Statistical Room Acoustics
2.3 Subjective Evaluation
2.4 Channel-based Objective Measures
2.4.1 Normalized Projection Misalignment
2.4.2 Direct-to-reverberant Ratio
2.4.3 Early-to-total Sound Energy Ratio
2.4.4 Early-to-late Reverberation Ratio
2.5 Signal-based Objective Measures
2.5.1 Log Spectral Distortion
2.5.2 Bark Spectral Distortion
2.5.3 Reverberation Decay Tail
2.5.4 Signal-to-reverberant Ratio
2.5.5 Experimental Comparisons
2.6 Dereverberation Performance of the Delay-and-sum Beamformer
2.6.1 Simulation Results: DSB Performance
2.7 Summary and Discussion
References

3 Speech Dereverberation Using Statistical Reverberation Models
Emanuël A.P. Habets
3.1 Introduction
3.2 Review of Dereverberation Methods
3.2.1 Reverberation Cancellation
3.2.2 Reverberation Suppression
3.3 Statistical Reverberation Models
3.3.1 Polack’s Statistical Model
3.3.2 Generalized Statistical Model
3.4 Single-microphone Spectral Enhancement
3.4.1 Problem Formulation
3.4.2 MMSE Log-spectral Amplitude Estimator
3.4.3 a priori SIR Estimator
3.5 Multi-microphone Spectral Enhancement
3.5.1 Problem Formulation
3.5.2 Two Multi-microphone Systems
3.5.3 Speech Presence Probability Estimator
3.6 Late Reverberant Spectral Variance Estimator
3.7 Estimating Model Parameters
3.7.1 Reverberation Time
3.7.2 Direct-to-reverberant Ratio
3.8 Experimental Results
3.8.1 Using One Microphone
3.8.2 Using Multiple Microphones
3.9 Summary and Outlook
References
