This paper presents a speaker-dependent scheme for recognizing the full set of Chinese syllables and describes a method for extracting speech parameters based on the characteristics of human hearing. The log-energy outputs of a 16-channel filter bank with 1/3-octave spacing are normalized to a fixed length by a nonlinear time-normalization method. The spectral change between adjacent channels is then computed, which yields a new set of feature parameters: the frequency-variation parameters. These parameters reflect perceptually relevant properties of speech, such as pitch, intensity, and tone, reasonably well. The syllable is chosen as the basic unit of recognition, with 400 toneless Chinese syllables forming the vocabulary. Recognition results are given at the end.
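The feature pipeline summarized in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation. The abstract does not name the nonlinear time-normalization method, so trace segmentation (resampling at equal steps of cumulative spectral change) is used here as a stand-in. The lowest center frequency of 200 Hz, the target length, and all function names are hypothetical.

```python
import numpy as np


def third_octave_centers(f_low=200.0, n_channels=16):
    """Center frequencies spaced 1/3 octave apart.

    f_low is an assumed value; the abstract does not give the band edges.
    """
    return f_low * 2.0 ** (np.arange(n_channels) / 3.0)


def time_normalize(log_energy, target_len):
    """Nonlinearly resample a (frames, channels) array to target_len frames.

    Assumption: trace segmentation stands in for the unspecified method.
    Frames are re-sampled at equal increments of cumulative spectral change,
    so steady regions are compressed and fast transitions are kept.
    """
    log_energy = np.asarray(log_energy, dtype=float)
    # Euclidean distance between consecutive frames.
    steps = np.linalg.norm(np.diff(log_energy, axis=0), axis=1)
    path = np.concatenate(([0.0], np.cumsum(steps)))
    if path[-1] == 0.0:
        # No spectral change at all: fall back to uniform resampling.
        path = np.arange(len(log_energy), dtype=float)
    targets = np.linspace(0.0, path[-1], target_len)
    # Interpolate each channel at the equally spaced trajectory points.
    return np.stack(
        [np.interp(targets, path, log_energy[:, c])
         for c in range(log_energy.shape[1])],
        axis=1,
    )


def frequency_variation_features(log_energy, target_len=20):
    """Fixed-length differences between adjacent channels.

    Input shape (frames, 16) gives output shape (target_len, 15).
    """
    normalized = time_normalize(log_energy, target_len)
    return np.diff(normalized, axis=1)
```

With a 16-channel input, each normalized frame yields 15 adjacent-channel differences, so a syllable of any duration maps to a feature matrix of the same size, which is what allows syllable templates to be compared directly.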