To help people with speech disorders and foreign-language learners correct Mandarin pronunciation errors, this paper proposes a Mandarin phoneme scoring model based on Mel-frequency cepstral coefficient (MFCC) feature comparison and a simulated annealing genetic algorithm (SAGA). The model uses the dynamic time warping (DTW) algorithm to measure the similarity between Mandarin phonemes, and scores pronunciation automatically through a SAGA-based scoring mechanism. The paper compares the effect on pronunciation scoring of different optimization algorithms (SAGA versus a local optimization algorithm) and of different DTW variants. The results show that phoneme scoring accuracy under the SAGA scoring model exceeds 94%, far better than that of the local optimization algorithm. In addition, under the SAGA scoring model, the improved DTW algorithm whose search region is a parallelogram gives the best scoring results. The scoring model based on MFCC and SAGA is therefore suitable for scoring Mandarin phonemes.
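The DTW comparison described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names `dtw_distance` and `in_parallelogram` are hypothetical, the MFCC frames are stand-in lists of floats, and the slope limits of 1/2 and 2 are an assumption borrowed from the commonly used parallelogram constraint, since the abstract does not give the details of the improved DTW variant. The SAGA scoring stage is not shown.

```python
import math


def in_parallelogram(i, j, n, m, slope=2.0):
    """Return True if cell (i, j) lies inside a parallelogram anchored at
    (0, 0) and (n - 1, m - 1) whose sides have slopes `slope` and 1/slope.
    Bounds are rounded outward so short sequences keep admissible cells.
    The slope value is an assumption, not taken from the paper."""
    lo = max(i / slope, (m - 1) - slope * (n - 1 - i))
    hi = min(i * slope, (m - 1) - (n - 1 - i) / slope)
    return math.floor(lo) <= j <= math.ceil(hi)


def dtw_distance(ref, test, constrain=False):
    """Accumulated DTW cost between two sequences of feature frames.

    ref, test: lists of equal-dimension float vectors (e.g. MFCC frames).
    constrain: if True, only cells inside the parallelogram are visited.
    Returns math.inf when no admissible warping path exists.
    """
    n, m = len(ref), len(test)
    # cost[i][j] = best accumulated cost aligning ref[:i] with test[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if constrain and not in_parallelogram(i - 1, j - 1, n, m):
                continue  # cell lies outside the search region
            d = math.dist(ref[i - 1], test[j - 1])  # Euclidean frame distance
            best = min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
            cost[i][j] = d + best
    return cost[n][m]


# Toy one-dimensional "frames" standing in for MFCC vectors.
reference = [[0.0], [1.0], [2.0], [3.0]]
utterance = [[0.0], [1.0], [1.0], [2.0], [3.0]]  # same shape, one frame held longer
print(dtw_distance(reference, utterance))
print(dtw_distance(reference, utterance, constrain=True))
```

A smaller accumulated cost means the test utterance is closer to the reference. Restricting the search to a parallelogram reduces the number of cells evaluated and rules out extreme warping, so the constrained cost is never lower than the unconstrained one.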