The Tandem feature extraction method, which combines the strengths of the Gaussian mixture model and artificial neural network frameworks commonly used in speech recognition, is applied to acoustic model training for Uyghur. After a series of post-processing steps, the original MFCC features are converted into Tandem features, which serve as the input to a speech recognition system based on hidden Markov models. The acoustic model is then trained with the minimum phone error (MPE) discriminative training criterion, and recognition experiments are carried out on the test set. The results show that Tandem discriminative training reduces the word error rate of the recognition system by 13% relative to the original system trained with the maximum likelihood estimation criterion.
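The abstract does not say which post-processing steps turn MFCC features into Tandem features. As a rough illustration only, the sketch below follows one common Tandem recipe: take the logarithm of the neural network's phone posteriors, decorrelate and reduce them with PCA, and append the result to the MFCC frames. The function names, the number of retained components, and the concatenation step are all assumptions rather than details from the paper. A small helper also shows how a relative word error rate reduction is computed; the WER values used with it are hypothetical, not the paper's results.

```python
import numpy as np


def tandem_features(mfcc, posteriors, n_components=25, eps=1e-10):
    """Build Tandem-style features (illustrative recipe, not the paper's).

    mfcc:       (T, D) array of MFCC frames.
    posteriors: (T, K) array of per-frame phone posteriors from a neural net.
    """
    # The log compresses the skewed posterior distribution so that it is
    # better suited to Gaussian mixture modelling.
    log_post = np.log(posteriors + eps)

    # PCA via SVD decorrelates the dimensions. For brevity the transform is
    # estimated on the input itself; a real system would estimate it once
    # on training data and reuse it.
    centered = log_post - log_post.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    reduced = centered @ vt[:n_components].T

    # Append the reduced posteriors to the original MFCC frames (assumed step).
    return np.hstack([mfcc, reduced])


def relative_reduction(baseline_wer, new_wer):
    """Relative error reduction: (baseline - new) / baseline."""
    return (baseline_wer - new_wer) / baseline_wer


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mfcc = rng.normal(size=(50, 39))          # 50 frames, 39-dim MFCC
    logits = rng.normal(size=(50, 40))        # stand-in for network outputs
    post = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    print(tandem_features(mfcc, post).shape)  # (50, 64)
    # Hypothetical WERs chosen only to illustrate a 13% relative reduction.
    print(relative_reduction(30.0, 26.1))
```

The distinction between relative and absolute reduction matters when reading the result: with the hypothetical numbers above, a 13% relative reduction corresponds to an absolute drop of only 3.9 points.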