In real-world conditions, the performance of a speaker recognition system is affected by many factors. One important factor is the mismatch between training and test speech caused by changes in the speaker's own speaking style. Using a database that contains multiple kinds of speaking-style variation, this paper proposes a series of speaking-style score normalization (S-Norm) methods that operate in the score domain and do not restrict the type of speaking-style variation. Experimental results show that SZ-Norm, ST-Norm, and SZT-Norm all clearly improve overall system performance over the baseline. With SZT-Norm in particular, the equal error rate (EER) drops by about 27%, which indicates that the score-normalization approach is effective.
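The abstract does not say how the S-Norm variants are computed. As background only, the conventional Z-norm and T-norm, which the names SZ-Norm and ST-Norm appear to build on, can be sketched as below. The function names, the example score lists, and the choice of population standard deviation are assumptions made for illustration and are not taken from the paper.

```python
from statistics import mean, pstdev


def _normalize(raw_score, reference_scores):
    """Shift and scale a raw score by the mean and std of reference scores."""
    mu = mean(reference_scores)
    sigma = pstdev(reference_scores)  # assumption: population std
    if sigma == 0:
        raise ValueError("reference scores have zero variance")
    return (raw_score - mu) / sigma


def z_norm(raw_score, impostor_utterance_scores):
    """Conventional Z-norm (illustrative).

    impostor_utterance_scores: scores of the claimed speaker's model
    against a set of impostor utterances, computed offline.
    """
    return _normalize(raw_score, impostor_utterance_scores)


def t_norm(raw_score, cohort_model_scores):
    """Conventional T-norm (illustrative).

    cohort_model_scores: scores of the test utterance against a set
    of cohort (impostor) speaker models, computed at test time.
    """
    return _normalize(raw_score, cohort_model_scores)


if __name__ == "__main__":
    # Hypothetical scores, chosen only to make the arithmetic easy to follow.
    print(z_norm(3.0, [-1.0, 1.0]))
    print(t_norm(0.5, [0.0, 2.0]))
```

Both functions apply the same shift-and-scale formula and differ only in where the reference scores come from. A speaking-style variant would presumably draw those reference scores from speech that covers the style variations, but that detail is not given in the abstract.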