Speaker conversion using kernel non-negative matrix factorization

Source: The Journal of China Universities of Posts and Telecommunications (English Edition)
Voice conversion (VC) based on the Gaussian mixture model (GMM) is the classic and most common method for converting a source spectrum to a target spectrum. However, this method is prone to over-fitting because of its frame-by-frame conversion. This paper presents VC with non-negative matrix factorization (NMF), which can keep the spectrum from over-fitting by adjusting the size of the set of basis vectors (the dictionary). To better realize the non-linear mapping, kernel NMF (KNMF) is adopted for spectrum mapping. In addition, to increase conversion accuracy, KNMF combined with GMM (GKNMF) is also introduced into VC. Finally, KNMF, GKNMF, GMM, principal component regression (PCR), PCR combined with GMM (GPCR), partial least squares regression (PLSR), NMF correlation-based frequency warping (NMF-CFW), and deep neural network (DNN) methods are compared with each other. The proposed GKNMF achieves better performance in both objective and subjective evaluations.
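The dictionary-based mapping mentioned in the abstract can be illustrated with a minimal sketch of plain (linear) NMF conversion. It is not the paper's KNMF or GKNMF method: it assumes a pair of parallel source and target dictionaries that share one activation matrix, a Euclidean cost, and standard multiplicative updates. All names (`estimate_activations`, `convert`, `W_src`, `W_tgt`) and the synthetic data are hypothetical.

```python
import numpy as np

def estimate_activations(X, W, n_iter=200, eps=1e-12, seed=0):
    """Find H >= 0 such that X ~ W @ H, keeping the dictionary W fixed.

    Uses multiplicative updates for the Euclidean cost ||X - W H||^2.
    """
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], X.shape[1])) + eps  # positive initialization
    WtX = W.T @ X
    WtW = W.T @ W
    for _ in range(n_iter):
        H *= WtX / (WtW @ H + eps)  # update keeps H non-negative
    return H

def convert(X_src, W_src, W_tgt, n_iter=200):
    """Map source spectra to target spectra through shared activations."""
    H = estimate_activations(X_src, W_src, n_iter=n_iter)
    return W_tgt @ H

# Synthetic example (assumed sizes): 20 frequency bins, 5 atoms, 30 frames.
rng = np.random.default_rng(1)
n_bins, n_atoms, n_frames = 20, 5, 30
W_src = rng.random((n_bins, n_atoms))   # hypothetical source dictionary
W_tgt = rng.random((n_bins, n_atoms))   # hypothetical parallel target dictionary
H_true = rng.random((n_atoms, n_frames))
X_src = W_src @ H_true                  # non-negative "source spectrogram"

Y_hat = convert(X_src, W_src, W_tgt)

# Source reconstruction error after 1 update versus after 200 updates.
err_first = np.linalg.norm(X_src - W_src @ estimate_activations(X_src, W_src, n_iter=1))
err_final = np.linalg.norm(X_src - W_src @ estimate_activations(X_src, W_src, n_iter=200))
```

In this sketch the number of dictionary columns (`n_atoms`) plays the role of the dictionary size that the abstract describes as the knob for controlling over-fitting. A kernel variant would replace the inner products `W.T @ X` and `W.T @ W` with kernel evaluations, which is not shown here.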