In speech recognition, Dynamic Time Warping (DTW) and the Hidden Markov Model (HMM) are the two most effective recognition algorithms, and the two are consistent in essence [1]. Building on this underlying connection and on the acoustic model that corresponds to each method, earlier work established a generalized acoustic model, the General Model (GM) [2][3], and showed that DTW and HMM are only special cases of GM and that both can be converted into GM. On this basis, the Fisher algorithm [4] is introduced into the GM learning algorithm for the first time. This guarantees that the GM state segmentation converges, and the resulting segmentation is exactly and globally optimal in the minimum-dispersion sense. Finally, the convergence of the GM algorithm is analyzed from the standpoint of the law of large numbers: the algorithm is shown theoretically to converge in probability, which provides a theoretical basis for its effectiveness in practical applications.
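The abstract does not give the details of the segmentation step, so the following is only a minimal sketch of what a Fisher-style segmentation could look like. It assumes the Fisher algorithm refers to Fisher's optimal partition of an ordered sequence, in which the sequence is cut into k contiguous segments so that the total within-segment sum of squared deviations is minimal, solved exactly by dynamic programming. The function name `fisher_segment`, the use of scalar features instead of acoustic feature vectors, and the squared-deviation cost are all assumptions made for illustration and are not taken from the paper.

```python
def fisher_segment(xs, k):
    """Split the ordered sequence xs into k contiguous segments.

    Returns (bounds, total), where bounds is a list of half-open index
    pairs (start, end) and total is the minimal sum of within-segment
    squared deviations. Hypothetical helper, not the paper's code.
    """
    n = len(xs)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(xs)")

    # Prefix sums of x and x^2 give each segment's dispersion in O(1).
    s1, s2 = [0.0], [0.0]
    for x in xs:
        s1.append(s1[-1] + x)
        s2.append(s2[-1] + x * x)

    def cost(i, j):
        # Sum of squared deviations from the mean over xs[i:j], j > i.
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # best[c][j]: minimal cost of splitting xs[:j] into c segments.
    # cut[c][j]: start index of the last segment in that optimal split.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                value = best[c - 1][i] + cost(i, j)
                if value < best[c][j]:
                    best[c][j] = value
                    cut[c][j] = i

    # Walk back through the stored cut points to recover the segments.
    bounds = []
    j = n
    for c in range(k, 0, -1):
        i = cut[c][j]
        bounds.append((i, j))
        j = i
    bounds.reverse()
    return bounds, best[k][n]


if __name__ == "__main__":
    frames = [1.0, 1.0, 1.0, 5.0, 5.0, 9.0, 9.0, 9.0]
    print(fisher_segment(frames, 3))
```

Because the dynamic program examines every admissible cut point, the partition it returns is a global minimum of the dispersion criterion rather than a local one, which is the property the abstract attributes to the Fisher-based state segmentation. The sketch runs in O(k n^2) time for a sequence of length n.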