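Research on Pitch Detection Algorithms for Speech Signals
Thesis Submitted to Shanghai Jiao Tong University for the Degree of Master of Engineering
University code: 10248
Author: 陈栋
Student ID: 1070379357
First supervisor: 吴刚
Second supervisor: 周松柏
Discipline: Software Engineering
Defense date: January 28, 2011
School of Software, Shanghai Jiao Tong University
November 2010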
A Dissertation Submitted to Shanghai Jiao Tong University
for the Degree of Master of Engineering
RESEARCH ON PITCH DETECTION ALGORITHM OF SPEECH
University Code: 10248
Author: 陈栋
Student ID: 1070379357
Mentor 1: 吴刚
Mentor 2: 周松柏
Field: Software Engineering
Date of Oral Defense: 2011-1-28
School of Software
Shanghai Jiao Tong University
Nov, 2010
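Shanghai Jiao Tong University
Declaration of Originality of the Thesis
I solemnly declare that the thesis submitted is the result of research I carried out
independently under the guidance of my supervisors. Except for content already cited in
the text, this thesis contains no work that has been published or written by any other
individual or group. All individuals and groups who made important contributions to the
research in this thesis are clearly identified in the text. I am fully aware that I bear
the legal consequences of this declaration.
Signature of the author:
Date (year/month/day):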
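Shanghai Jiao Tong University
Authorization for Use of the Thesis Copyright
The author of this thesis fully understands the university's regulations on retaining and
using theses, and agrees that the university may retain copies and electronic versions of
the thesis and submit them to the relevant national departments or institutions, and that
the thesis may be consulted and borrowed. I authorize Shanghai Jiao Tong University to
include all or part of this thesis in relevant databases for retrieval, and to preserve and
compile it by photocopying, reduced-size printing, scanning, or other means of reproduction.
This thesis is:
Confidential □; this authorization applies after declassification in year ____.
Not confidential □.
(Please mark "√" in the appropriate box above.)
Signature of the author:          Signature of the supervisor:
Date (year/month/day):            Date (year/month/day):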
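Research on Pitch Detection Algorithms for Speech Signals
Abstract
The pitch period is one of the important parameters of a speech signal, and accurate, fast
pitch period extraction is of great significance for speech synthesis, coding, and
recognition. Many pitch detection algorithms have been proposed based on the time-domain,
frequency-domain, and joint time-frequency characteristics of speech. Classical pitch
detection algorithms include the autocorrelation function (ACF) method, the average
magnitude difference function (AMDF) method, the cepstrum method, and the wavelet transform
method. However, because of the inherent properties of the pitch period, no pitch detection
algorithm yet suits every speaker, every application, and every environment. This thesis
aims to find a pitch detection algorithm with relatively good accuracy and robustness.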
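First, the thesis introduces the production of speech, its acoustic features, the
mathematical model of speech production, and the spectral and short-time characteristics of
speech. Second, it systematically analyzes, discusses, and compares several typical pitch
detection algorithms. The cepstrum method is computationally expensive, its cepstral peak
depends on many factors, and it is often limited by the bandwidth of the model. The wavelet
transform method has good local time-frequency analysis capability and is well suited to
detecting abrupt changes in an otherwise regular signal, but it is strongly affected by the
vocal tract response, juncture, coarticulation, and tone sandhi rules, and it is
computationally expensive. The AMDF method requires no multiplications and therefore has low
complexity, but its estimation accuracy drops markedly when the speech amplitude changes
rapidly. Classical time-domain autocorrelation pitch detection is one of the
better-performing algorithms; however, ACF sometimes produces pitch-doubling and
pitch-halving errors even in noise-free conditions, and the rate of such errors rises
significantly in noise.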
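The two time-domain measures compared above can be illustrated with a minimal sketch. It is not the thesis's MATLAB implementation: the function names, the 8 kHz sampling rate, the 60 to 400 Hz search range, and the synthetic test frame are all assumptions made for illustration.

```python
import math

def acf(frame, lag):
    """Short-time autocorrelation of one frame at the given lag."""
    return sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))

def amdf(frame, lag):
    """Average magnitude difference at the given lag; needs no multiplications."""
    count = len(frame) - lag
    return sum(abs(frame[n] - frame[n + lag]) for n in range(count)) / count

def pitch_lag_by_acf(frame, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the largest autocorrelation."""
    return max(range(min_lag, max_lag + 1), key=lambda lag: acf(frame, lag))

# Hypothetical test frame: 40 ms at 8 kHz, 200 Hz fundamental plus one harmonic.
SAMPLE_RATE = 8000
F0 = 200.0
frame = [
    math.sin(2 * math.pi * F0 * n / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 2 * F0 * n / SAMPLE_RATE)
    for n in range(320)
]

# Search lags corresponding to roughly 400 Hz down to 60 Hz (assumed range).
lag = pitch_lag_by_acf(frame, SAMPLE_RATE // 400, SAMPLE_RATE // 60)
print(lag, SAMPLE_RATE / lag)                    # period in samples, pitch in Hz
print(amdf(frame, lag), amdf(frame, lag // 2))   # deep valley at the period, none at half
```

On this clean frame the autocorrelation peak falls at the true period of 40 samples. The AMDF is also close to zero at every integer multiple of the period, which hints at why multiples and submultiples of the true pitch are easy to confuse once noise or amplitude changes distort the peaks and valleys.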
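Building on an in-depth study of wavelet analysis theory and existing pitch detection
algorithms, and combining pre-processing, dynamic-programming smoothing, and the Teager
energy operator (TEO), the thesis derives a new pitch period detection algorithm. Its basic
idea is as follows. First, the "energy" of the wavelet coefficients is computed with the
TEO, voiced and unvoiced speech are distinguished against a decision threshold, and the
pitch period is extracted only from voiced segments. Before pitch extraction, the signal is
pre-processed with low-pass and numerical filtering to remove power-line interference and
the influence of formants and high-frequency noise. After pitch extraction, the results are
post-processed with a combination of median and linear smoothing to remove half-frequency
points, double-frequency points, and random errors.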
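The voiced/unvoiced decision described above rests on the discrete Teager energy operator, psi[n] = x[n]^2 - x[n-1]*x[n+1]. The sketch below is a simplification under stated assumptions: it applies the operator directly to signal samples rather than to wavelet coefficients as the thesis does, and the frame length, threshold, and function names are hypothetical.

```python
import math

def teager_energy(x):
    """Discrete Teager energy psi[n] = x[n]^2 - x[n-1]*x[n+1] for interior samples."""
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

def frame_teager_energy(x, frame_len):
    """Mean Teager energy of each non-overlapping frame of frame_len samples."""
    energies = []
    for start in range(0, len(x) - frame_len + 1, frame_len):
        psi = teager_energy(x[start:start + frame_len])
        energies.append(sum(psi) / len(psi))
    return energies

def voiced_flags(energies, threshold):
    """Mark a frame as voiced when its mean Teager energy exceeds the threshold."""
    return [e > threshold for e in energies]

# Hypothetical input: a very quiet frame followed by a full-scale 200 Hz tone at 8 kHz.
OMEGA = 2 * math.pi * 200 / 8000
quiet = [0.01 * math.sin(OMEGA * n) for n in range(160)]
tone = [math.sin(OMEGA * n) for n in range(160)]
energies = frame_teager_energy(quiet + tone, frame_len=160)
flags = voiced_flags(energies, threshold=1e-3)   # threshold chosen for this toy signal
print(energies, flags)
```

For a sinusoid A·sin(Ωn + φ) the operator evaluates to A²·sin²(Ω), so it responds to both amplitude and frequency. A real detector would need a threshold tuned to the data rather than the fixed value used here.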
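The thesis presents the implementation flow of the new pitch period algorithm and describes
each stage in detail. The algorithm was programmed in MATLAB and evaluated in simulation
experiments. The results show that the new algorithm estimates the pitch period accurately,
runs fairly quickly, is stable, and is robust to noise. It draws on the strengths of both
the autocorrelation and wavelet transform algorithms, effectively overcomes the
frequency-halving and frequency-doubling errors of the autocorrelation algorithm, and
clearly outperforms the traditional methods. Finally, the thesis gives two application
examples of the new algorithm: a dialect identification system based on the improved pitch
detection algorithm, and PSOLA speech synthesis. These show that the new algorithm has good
application prospects.
Keywords: autocorrelation function, smoothing, wavelet transform, Teager energy operator,
numerical filtering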
RESEARCH ON PITCH DETECTION ALGORITHM OF SPEECH
ABSTRACT
Pitch period is an important parameter in the analysis and synthesis of speech signals.
Pitch period information is used in applications such as speech synthesis, speech coding,
and speech recognition. Many pitch detectors have been developed from three perspectives:
the time domain, the frequency domain, and the joint time-frequency domain. Classical
examples include the autocorrelation function (ACF), the average magnitude difference
function (AMDF), the cepstrum (CEP), and the wavelet transform. However, estimating pitch is
difficult because of the inherent properties of the pitch period, and no algorithm developed
so far performs perfectly across different speakers, applications, and environmental
conditions. This thesis seeks a pitch detection algorithm with relatively good accuracy and
robustness.
First, the thesis introduces the generation of speech, its mathematical model, its
characteristics, and its analysis methods. It then presents a systematic analysis and
comparison of several typical pitch detection methods. Autocorrelation detection is a
time-domain algorithm that detects pitch efficiently. However, ACF sometimes produces
pitch-doubling and pitch-halving errors even in noise-free conditions, and the rate of these
errors rises markedly in background noise.
This thesis proposes a robust pitch detection method based on the traditional ACF. A voiced
region detection (VRD) algorithm based on the wavelet transform and the Teager energy
operator is applied first, and the pitch period is then extracted only in voiced regions. To
remove the influence of formants and high-frequency noise, the signal is pre-processed with
low-pass and numerical filtering before pitch detection. After pitch detection, linear
smoothing and median smoothing are combined to remove half-frequency, double-frequency, and
random error points.
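The post-processing step can be sketched as a median filter followed by a short weighted average over the frame-by-frame pitch track. The abstract does not say how the two smoothers are combined, so the order used here, the window width of 5, the weights (0.25, 0.5, 0.25), and the example track are all assumptions.

```python
from statistics import median

def median_smooth(values, width=5):
    """Sliding median; edge values are repeated so every window has `width` points."""
    half = width // 2
    last = len(values) - 1
    return [
        median(values[min(max(j, 0), last)] for j in range(i - half, i + half + 1))
        for i in range(len(values))
    ]

def linear_smooth(values, weights=(0.25, 0.5, 0.25)):
    """Weighted moving average; edge values are repeated at the boundaries."""
    half = len(weights) // 2
    last = len(values) - 1
    smoothed = []
    for i in range(len(values)):
        acc = 0.0
        for k, w in enumerate(weights):
            acc += w * values[min(max(i + k - half, 0), last)]
        smoothed.append(acc)
    return smoothed

# Hypothetical pitch track in samples per period: 80 is a doubled period
# (half-frequency error) and 20 is a halved period (double-frequency error).
raw_track = [40, 41, 40, 80, 40, 39, 20, 40, 41, 40]
after_median = median_smooth(raw_track)
after_both = linear_smooth(after_median)
print(after_median)
print(after_both)
```

The median stage removes isolated outliers without blurring them into neighbouring frames, and the linear stage then evens out the small remaining jitter. Runs of consecutive errors longer than half the median window would survive this simple scheme.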
The thesis presents the implementation flow of the new pitch algorithm and describes each
stage in detail. The algorithm was implemented in MATLAB and evaluated in simulation
experiments. The results show that the new algorithm estimates pitch accurately, runs
quickly, is stable, and is robust to noise. It combines the advantages of the
autocorrelation and wavelet algorithms, overcomes the frequency-halving and
frequency-doubling errors of the autocorrelation algorithm, and resists noise better than
the wavelet algorithm. Its performance is clearly superior to that of the traditional
algorithms. Finally, the thesis gives two application examples of the new pitch algorithm: a
dialect identification system and PSOLA speech synthesis, both based on the improved pitch
detection algorithm. These show that the new algorithm has good application prospects.
Keywords: autocorrelation function, smoothing, wavelet transform, Teager energy operator,
numerical filtering