Overlapping speech is one of the main factors that degrade speaker segmentation performance. This paper proposes an overlapping speech detection method based on high-level information features of speech to improve speaker segmentation. First, a universal background model (UBM) is used to extract linguistic high-level information features from the speech, and these features are fused with Mel frequency cepstral coefficient (MFCC) features to build a hidden Markov model (HMM) that detects overlapping speech. Speaker segmentation is then performed on the processed speech. Experimental results show that, on a data set generated from the TIMIT speech corpus, the method gives a significantly lower overlapping speech detection error rate than MFCC features alone, and that speaker segmentation performance improves markedly.
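The detection stage outlined in the abstract (UBM-derived features, fusion with MFCC, HMM decoding) can be illustrated with a minimal sketch. The abstract does not say what form the high-level features take, so this sketch assumes they are the per-frame posterior probabilities of the UBM components, with the UBM taken to be a diagonal-covariance Gaussian mixture. The function names `ubm_posteriors`, `fuse_features`, and `viterbi` are hypothetical, and the HMM emission scores are assumed to be supplied by an already trained model. It is not the authors' implementation.

```python
import numpy as np


def ubm_posteriors(frames, weights, means, variances):
    """Per-frame posteriors over the components of a diagonal-covariance GMM.

    Assumption: these posteriors stand in for the paper's high-level features.
    frames: (T, D); weights: (K,); means, variances: (K, D).
    """
    diff = frames[:, None, :] - means[None, :, :]                  # (T, K, D)
    log_gauss = -0.5 * (
        np.sum(diff ** 2 / variances, axis=2)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)
    )                                                              # (T, K)
    log_joint = np.log(weights) + log_gauss
    log_joint -= log_joint.max(axis=1, keepdims=True)              # numerical stability
    post = np.exp(log_joint)
    return post / post.sum(axis=1, keepdims=True)


def fuse_features(mfcc, high_level):
    """Frame-level fusion by concatenation (an assumed fusion scheme)."""
    return np.hstack([mfcc, high_level])


def viterbi(log_emis, log_trans, log_init):
    """Most likely state path of an HMM, e.g. 0 = single speaker, 1 = overlap.

    log_emis: (T, S) per-frame log emission scores for each state.
    log_trans: (S, S), log_trans[i, j] = log P(state j | state i).
    log_init: (S,) log initial state probabilities.
    """
    n_frames, n_states = log_emis.shape
    score = log_init + log_emis[0]
    back = np.zeros((n_frames, n_states), dtype=int)
    for t in range(1, n_frames):
        cand = score[:, None] + log_trans      # cand[i, j]: arrive in j from i
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + log_emis[t]
    path = np.empty(n_frames, dtype=int)
    path[-1] = score.argmax()
    for t in range(n_frames - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path
```

In use, the MFCC frames of a recording would be passed through `ubm_posteriors`, joined to the MFCCs with `fuse_features`, scored under each HMM state's emission model, and decoded with `viterbi`. Frames labelled as overlap could then be handled separately before speaker segmentation.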