This paper studies the application of minimum variance distortionless response (MVDR) perceptual cepstral coefficients to speaker recognition. MVDR perceptual cepstral coefficients are extracted and modeled with Gaussian mixture models (GMMs), and joint factor analysis (JFA) is used to fit the speaker and channel variability in the GMMs. The MVDR perceptual cepstral coefficients and the conventional Mel-frequency cepstral coefficients (MFCCs) are each tested on the core test set of the 2008 Speaker Recognition Evaluation run by the U.S. National Institute of Standards and Technology (NIST). The results show that the systems built on the two features perform comparably. After linear fusion of the two systems, the equal error rate (EER) falls by 7.6% to 30.5% in relative terms across the different test sets, and the minimum detection cost falls by 3.2% to 21.2% in relative terms. The experiments indicate that MVDR perceptual cepstral coefficients can be applied effectively to speaker recognition and are complementary, to a certain degree, to conventional MFCCs.
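To make the reported quantities concrete, the following is a minimal sketch of linear score fusion, an approximate equal error rate, and a relative reduction. It is an illustration under stated assumptions, not the authors' implementation: the function names, the fixed fusion weight of 0.5, and the synthetic scores are all hypothetical, and the abstract does not say how the fusion weights were chosen or how the error rates were computed.

```python
import numpy as np


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores."""
    target_scores = np.asarray(target_scores, dtype=float)
    nontarget_scores = np.asarray(nontarget_scores, dtype=float)
    thresholds = np.sort(np.concatenate([target_scores, nontarget_scores]))
    # Miss rate: fraction of target trials scoring below the threshold.
    miss = np.array([(target_scores < t).mean() for t in thresholds])
    # False-alarm rate: fraction of non-target trials at or above the threshold.
    false_alarm = np.array([(nontarget_scores >= t).mean() for t in thresholds])
    # Take the threshold where the two rates are closest and average them.
    i = np.argmin(np.abs(miss - false_alarm))
    return (miss[i] + false_alarm[i]) / 2.0


def linear_fusion(scores_a, scores_b, weight=0.5):
    """Weighted sum of two systems' scores (weight is an assumed value)."""
    a = np.asarray(scores_a, dtype=float)
    b = np.asarray(scores_b, dtype=float)
    return weight * a + (1.0 - weight) * b


def relative_reduction(baseline, improved):
    """Relative decrease of a metric, e.g. 0.10 -> 0.08 gives 0.2 (20%)."""
    return (baseline - improved) / baseline


if __name__ == "__main__":
    # Synthetic scores standing in for two systems with partly independent
    # errors; these are NOT results from the paper.
    rng = np.random.default_rng(0)
    n = 2000
    tgt_a, non_a = rng.normal(1.0, 1.0, n), rng.normal(-1.0, 1.0, n)
    tgt_b, non_b = rng.normal(1.0, 1.0, n), rng.normal(-1.0, 1.0, n)

    eer_a = equal_error_rate(tgt_a, non_a)
    eer_fused = equal_error_rate(
        linear_fusion(tgt_a, tgt_b), linear_fusion(non_a, non_b)
    )
    print(f"EER system A: {eer_a:.4f}")
    print(f"EER fused:    {eer_fused:.4f}")
    print(f"Relative reduction: {relative_reduction(eer_a, eer_fused):.1%}")
```

In practice the fusion weight would usually be tuned on held-out development data rather than fixed, and the two systems' scores would be normalized to comparable ranges before they are combined.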