High-quality voice conversion system based on GMM statistical parameters and RBF neural network

Source: The Journal of China Universities of Posts and Telecommunica
A voice conversion (VC) system was designed based on a Gaussian mixture model (GMM) and a radial basis function (RBF) neural network. As a voice conversion model, an RBF network needs a large amount of training data to perform well, and for a given utterance, networks trained on different segments of the data produce different conversion results. Because searching segment by segment for the best conversion result is complex, a conversion method is proposed that uses a GMM to summarize the training data statistically before the RBF network is trained. The speech transformation and representation using adaptive interpolation of weighted spectrum (STRAIGHT) model is used to extract the vocal tract spectrum accurately. The GMM then classifies the large number of spectral parameters, and the resulting mean parameters are used to train the RBF network. Experiments show that the soft classification ability of the GMM quickly reduces and classifies the training data while preserving the training effect, which in turn lowers the complexity of selecting training data. Compared with conventional RBF network training methods, this method transforms the spectral parameters more effectively and improves the quality of the converted speech.
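The data-reduction idea described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation. The abstract does not say how source and target frames are paired or how the RBF network is configured. The sketch assumes time-aligned source/target frames, a diagonal-covariance GMM fitted to the concatenated (joint) vectors, Gaussian basis functions of a fixed width centred on the source halves of the component means, and output weights found by least squares. Synthetic vectors stand in for STRAIGHT spectral parameters, and all function names are hypothetical.

```python
import numpy as np

def fit_gmm_diag(data, n_components, n_iter=50, seed=0):
    """Fit a diagonal-covariance GMM by EM; returns (weights, means, variances)."""
    rng = np.random.default_rng(seed)
    n, _ = data.shape
    means = data[rng.choice(n, n_components, replace=False)].copy()
    variances = np.tile(data.var(axis=0) + 1e-6, (n_components, 1))
    weights = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: soft assignment (responsibility) of every frame to every component.
        diff = data[:, None, :] - means[None, :, :]
        log_p = -0.5 * np.sum(diff**2 / variances + np.log(2 * np.pi * variances), axis=2)
        log_p += np.log(weights)
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means and variances from the soft assignments.
        nk = resp.sum(axis=0) + 1e-10
        weights = nk / n
        means = resp.T @ data / nk[:, None]
        diff = data[:, None, :] - means[None, :, :]
        variances = np.einsum("nk,nkd->kd", resp, diff**2) / nk[:, None] + 1e-6
    return weights, means, variances

def rbf_design(x, centers, width):
    """Gaussian basis activations plus a constant bias column."""
    d2 = np.sum((x[:, None, :] - centers[None, :, :]) ** 2, axis=2)
    return np.hstack([np.exp(-d2 / (2.0 * width**2)), np.ones((len(x), 1))])

def train_rbf(x, y, centers, width):
    """Solve for the output-layer weights by least squares."""
    return np.linalg.lstsq(rbf_design(x, centers, width), y, rcond=None)[0]

def rbf_predict(x, centers, width, out_w):
    return rbf_design(x, centers, width) @ out_w

# Synthetic stand-in for aligned source/target spectral frames (assumption).
rng = np.random.default_rng(0)
n_frames, dim, n_mix = 400, 3, 8
src = rng.normal(size=(n_frames, dim))
tgt = np.tanh(src) + 0.5 + 0.05 * rng.normal(size=(n_frames, dim))

# Reduce 400 frame pairs to 8 mean pairs with a joint GMM.
weights, means, _ = fit_gmm_diag(np.hstack([src, tgt]), n_mix)
src_means, tgt_means = means[:, :dim], means[:, dim:]

# Train the RBF network on the mean pairs only, then convert all source frames.
width = 1.0  # basis width chosen by hand for this toy data (assumption)
out_w = train_rbf(src_means, tgt_means, src_means, width)
converted = rbf_predict(src, src_means, width, out_w)
```

Here the RBF network sees only `n_mix` training pairs instead of all `n_frames`, which is the reduction the abstract attributes to the GMM. The number of mixtures and the basis width are free parameters in this sketch and would need tuning on real spectral data.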