Timbre (voice) conversion algorithms based on the Gaussian mixture model (GMM) suffer from over-smoothing when predicting the target speaker's spectrum, which degrades the quality of the converted speech. This paper analyzes the causes of over-smoothing and proposes an improved timbre conversion algorithm that takes inter-frame dynamic features into account. The objective function used to estimate the parameters incorporates the effects of continuity and variance, which improves the inter-frame continuity of the mapped result and maximizes its variance, thereby overcoming over-smoothing. Experiments show that, while preserving the similarity of the converted speech to the target speaker, the algorithm raises the subjective opinion score for converted speech quality from 3.11 to 3.89, demonstrating that dynamic features are important for improving the quality of timbre conversion.
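To make the over-smoothing problem concrete, the following is a minimal sketch of the conventional frame-wise GMM mapping, which predicts each target frame as the conditional mean E[y | x] under a joint source-target GMM. It is not the improved algorithm proposed in the paper, whose objective function is not given in the abstract. The function names `gmm_convert` and `oversmoothing_demo`, the one-dimensional synthetic features, and the correlation of 0.8 are all assumptions chosen for illustration.

```python
import numpy as np


def gmm_convert(x, weights, mu_x, mu_y, cov_xx, cov_yx):
    """Frame-wise conditional-mean mapping E[y | x] under a joint GMM.

    x: (T, D) source frames; weights: (M,); mu_x, mu_y: (M, D);
    cov_xx, cov_yx: (M, D, D). All parameter names are illustrative.
    """
    T, D = x.shape
    M = weights.shape[0]
    log_post = np.empty((T, M))
    cond_mean = np.empty((T, M, D))
    for m in range(M):
        diff = x - mu_x[m]
        inv_xx = np.linalg.inv(cov_xx[m])
        _, logdet = np.linalg.slogdet(cov_xx[m])
        # Squared Mahalanobis distance of every frame to component m.
        maha = np.einsum("td,de,te->t", diff, inv_xx, diff)
        log_post[:, m] = np.log(weights[m]) - 0.5 * (
            maha + logdet + D * np.log(2.0 * np.pi)
        )
        # Per-component regression: mu_y + Sigma_yx Sigma_xx^{-1} (x - mu_x).
        cond_mean[:, m] = mu_y[m] + diff @ (cov_yx[m] @ inv_xx).T
    # Normalise the posteriors p(m | x) in a numerically stable way.
    log_post -= log_post.max(axis=1, keepdims=True)
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)
    return np.einsum("tm,tmd->td", post, cond_mean)


def oversmoothing_demo(n_frames=5000, seed=0):
    """Return (target variance, converted variance) on synthetic 1-D data."""
    rng = np.random.default_rng(seed)
    # Assumed joint distribution: unit variances, correlation 0.8.
    joint_cov = np.array([[1.0, 0.8], [0.8, 1.0]])
    z = rng.multivariate_normal(np.zeros(2), joint_cov, size=n_frames)
    x, y = z[:, :1], z[:, 1:]
    # Fit a single-component "GMM" from sample statistics.
    mu = z.mean(axis=0)
    cov = np.cov(z, rowvar=False)
    y_hat = gmm_convert(
        x,
        weights=np.array([1.0]),
        mu_x=mu[None, :1],
        mu_y=mu[None, 1:],
        cov_xx=cov[None, :1, :1],
        cov_yx=cov[None, 1:, :1],
    )
    return float(y.var()), float(y_hat.var())


if __name__ == "__main__":
    target_var, converted_var = oversmoothing_demo()
    print("target variance:   ", target_var)
    print("converted variance:", converted_var)
```

In this single-component case the converted variance equals the target variance multiplied by the squared source-target correlation (close to 0.64 here), so the mapped trajectory is always flatter than the real one. This loss of variance is the over-smoothing effect that the continuity and variance terms described in the abstract are meant to counteract.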