Speaker Recognition Based on Handheld PC Recordings

Source: Yunnan University
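This thesis carries out text-independent speaker recognition experiments on a speech database recorded with a handheld PC. By improving the recognition algorithms, it explores ways to remove the effects of strong noise and to improve the robustness, stability, and accuracy of a speaker recognition system.
1. In the speaker identification system, LPC cepstral parameters serve as the feature vectors that describe each speaker's speech and individual characteristics. A Gaussian mixture model (GMM) describes the distribution of the feature vectors, and the EM algorithm estimates the GMM parameters. The work compares how the type of covariance matrix chosen for the feature parameters affects the effectiveness of EM estimation and the recognition rate of the system. Experiments show that with a full covariance matrix the model parameters are estimated more effectively, which offsets the influence of noise and raises the correct identification rate.
2. In the speaker verification system, LPC cepstral parameters again serve as the feature vectors, and two recognition algorithms are tested.
1) Because speech features are somewhat non-stationary across environments and vocal tract characteristics, the first algorithm uses impostor speaker set normalization. It first computes the normalized likelihood score of the speaker under test, then the verification system makes an accept-or-reject decision against a given threshold. This is used to study the effectiveness of a text-independent speaker verification system.
2) The second algorithm dispenses with the large impostor background set. Based on hypothesis testing theory, the verification system determines the optimal decision threshold on its own from the training set during the training phase. On this basis, the work studies how a text-independent speaker verification system can achieve both a low false rejection ratio (FRR) and a low false acceptance ratio (FAR) against a strong noise background.
The experimental results show that the recognition algorithm based on hypothesis testing is more effective than the one based on impostor speaker set normalization. With the hypothesis testing algorithm, the verification system obtains a low FRR and a low FAR at the same time, which improves its noise resistance.
All methods described in this thesis were implemented in Matlab 6.1 on Microsoft Windows 2000 and Microsoft Windows ME. Except for the EM algorithm that estimates the GMM parameters, all algorithms were written by the author in Matlab 6.1 under the supervisor's guidance.
Keywords: speaker recognition, linear predictive analysis, Gaussian mixture model, EM algorithm, impostor speaker set, hypothesis testing
Abstract
Speaker recognition (SR) is a method of identifying or verifying a person's identity from his or her speech. It is an important part of speech signal processing. SR is applied in many areas, including voice dialing, telephone banking, telephone shopping, speech access to databases, information services, voice e-mail, security control, and remote computer login.
The hand-held PC (PDA, Personal Digital Assistant) is a convenient, portable electronic device. Its functions include entering, accessing, managing, and delivering information. It also offers general computing, business task handling, entertainment, and mobile communication. Because of the limits on PDA size and hardware configuration, speech recorded on a PDA contains much stronger and more varied noise than speech recorded on a general-purpose computer or a dedicated recording device. A challenging problem is therefore how to remove this strong noise when processing speech recorded by a PDA.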
In this paper, text-independent speaker recognition experiments are performed on a PDA speech database. By improving the recognition algorithms, we try to find a way to eliminate the effects of strong noise and to improve the robustness, stability, and accuracy of the speaker recognition system.
First, LPC cepstrum feature vectors and a GMM are adopted in the speaker identification system. The GMM parameters are estimated by the EM algorithm, and recognition results are compared across different types of feature-parameter covariance matrix. The results indicate that when the covariance matrix is full, GMM parameter estimation is more effective; it offsets more of the noise influence and improves the speaker recognition rate.
Second, LPC cepstrum feature vectors and a GMM are adopted in the speaker verification system, and two recognition algorithms are used in the experiments. The first method uses impostor speaker set normalization, taking into account that speech features are non-stationary across environments and vocal tract characteristics. The tester's normalized likelihood score is computed first, and the speaker verification system then accepts or rejects the claimed identity according to a given threshold. The second method is based on hypothesis testing theory. In the training phase, the system produces an optimal threshold value automatically. We then study how to achieve a low false rejection ratio and a low false acceptance ratio against a strong noise background. The experiments indicate that the second recognition method is more effective than the first, and that it achieves a low false rejection ratio and a low false acceptance ratio at the same time.
All experiments discussed in this paper were performed in Matlab 6.1 under Microsoft Windows 2000 and Microsoft Windows ME.
Except for the EM algorithm, all algorithms in the paper were programmed in Matlab 6.1 by the author under the supervisor's guidance.
Keywords: Speaker recognition, Linear predictive analysis, GMM, EM algorithm, Impostor speaker sets, Hypothesis test
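The impostor speaker set normalization described in the abstract can be illustrated with a small sketch. The thesis code was written in Matlab 6.1 and is not reproduced here; the Python below is only an illustration under stated assumptions. The function names, the hand-set GMM parameters, the diagonal covariances (chosen for brevity, although the abstract reports that full covariance worked better for identification), the use of the mean cohort log-likelihood as the normalizer, and the threshold of 0.0 are all hypothetical and are not taken from the thesis.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at vector x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def gmm_avg_loglik(frames, gmm):
    """Average per-frame log-likelihood of frames under a GMM.

    gmm is a list of (weight, mean, var) components (assumed layout).
    """
    total = 0.0
    for x in frames:
        # log-sum-exp over the mixture components for numerical stability
        terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
        top = max(terms)
        total += top + math.log(sum(math.exp(t - top) for t in terms))
    return total / len(frames)

def cohort_normalized_score(frames, claimed, cohort):
    """Claimed-model log-likelihood minus the mean cohort log-likelihood."""
    target = gmm_avg_loglik(frames, claimed)
    background = sum(gmm_avg_loglik(frames, g) for g in cohort) / len(cohort)
    return target - background

def verify(frames, claimed, cohort, threshold=0.0):
    """Accept the identity claim when the normalized score reaches the threshold."""
    return cohort_normalized_score(frames, claimed, cohort) >= threshold

# Toy 2-D "feature vectors" standing in for LPC cepstral frames (made-up values).
claimed_model = [(1.0, [0.0, 0.0], [1.0, 1.0])]
cohort_models = [
    [(1.0, [3.0, 3.0], [1.0, 1.0])],
    [(1.0, [4.0, 2.0], [1.0, 1.0])],
]
genuine_frames = [[0.1, -0.2], [0.0, 0.3]]
impostor_frames = [[3.1, 2.9], [2.8, 3.2]]

print(verify(genuine_frames, claimed_model, cohort_models))
print(verify(impostor_frames, claimed_model, cohort_models))
```

In this toy setup the genuine frames lie near the claimed model and far from the cohort, so the normalized score is positive, while the impostor frames lie near the cohort and score negative. In a real system the models would be trained with EM on cepstral features and the threshold would be tuned to trade off FRR against FAR.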