An improved Parzen-window method is proposed for estimating the pitch probability density, in which the window width adapts to the frequency resolution of pitch extraction; this preserves high resolution in the low-frequency band of the pitch statistical distribution model while smoothing the high-frequency band. A gender-discrimination algorithm exploiting the different pitch distributions of male and female speakers achieves a recognition rate of 98% on long sentences. After analysing the gender differences in pitch mean, pitch variance, and the pitch statistical distribution model, the pitch parameters are normalized for gender. The gender-normalized pitch mean and variance, together with the distance between pitch statistical distribution models, are then introduced as emotional feature parameters, and a K-nearest-neighbour classifier is applied to a Chinese emotional speech corpus. Features extracted by the conventional method yield a recognition rate of 73.8%, whereas the gender-normalized pitch parameters and the pitch-distribution distance raise it to 81%.
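The adaptive-window idea can be sketched as follows: a Gaussian Parzen (kernel) density estimate whose bandwidth grows with the evaluation frequency, so the estimate stays sharp at low F0 and smooth at high F0. The linear bandwidth rule and the constants `h0` and `f_ref` are illustrative assumptions, not the paper's actual formula.

```python
import numpy as np

def adaptive_parzen_density(pitches, grid, h0=10.0, f_ref=100.0):
    """Estimate the pitch probability density on `grid` (Hz) with a
    frequency-dependent Gaussian window: narrow at low F0 (high
    resolution), wide at high F0 (smoother).

    h0 and f_ref are illustrative constants, not values from the paper.
    """
    pitches = np.asarray(pitches, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Bandwidth grows linearly with the evaluation frequency, mimicking
    # the coarser frequency resolution of pitch extraction at high F0.
    h = h0 * np.maximum(grid, 1.0) / f_ref                  # shape (G,)
    diff = (grid[:, None] - pitches[None, :]) / h[:, None]  # (G, N)
    kern = np.exp(-0.5 * diff**2) / (np.sqrt(2.0 * np.pi) * h[:, None])
    return kern.mean(axis=1)                                # density on grid
```

Because the bandwidth varies with the evaluation point (a "balloon" estimator), the result integrates only approximately to one; for a statistical distribution model used via distances between densities, that is usually acceptable.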
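The final classification stage can be illustrated with a minimal sketch: per-gender z-score normalization of the pitch features (one plausible form of the gender-difference warping; the paper's exact warping is not specified here) followed by a plain Euclidean K-nearest-neighbour vote.

```python
import numpy as np

def gender_normalize(feats, genders):
    """Per-gender z-score normalization of pitch features (rows = samples).
    This is an assumed, generic form of gender-based warping."""
    feats = np.asarray(feats, dtype=float)
    genders = np.asarray(genders)
    out = np.empty_like(feats)
    for g in np.unique(genders):
        m = genders == g
        out[m] = (feats[m] - feats[m].mean(axis=0)) / feats[m].std(axis=0)
    return out

def knn_classify(train_x, train_y, query, k=5):
    """Generic K-nearest-neighbour majority vote with Euclidean distance;
    a sketch of the final recognition step, not the paper's exact setup."""
    d = np.linalg.norm(np.asarray(train_x, float) - np.asarray(query, float),
                       axis=1)
    idx = np.argsort(d)[:k]                      # k closest training samples
    labels, counts = np.unique(np.asarray(train_y)[idx], return_counts=True)
    return labels[np.argmax(counts)]             # majority label
```

In practice the feature vectors here would be the gender-normalized pitch mean and variance plus the distribution-model distances described above.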