A text-to-speech system with high intelligibility and naturalness for Chinese

Source: Chinese Journal of Acoustics
A Chinese text-to-speech system based on the time-domain Pitch-Synchronous Overlap-Add (PSOLA) method, with a Chinese syllable dictionary and a prosodic rule dictionary, can produce very clear and natural Chinese speech. Research on the naturalness of synthetic Chinese shows that pitch, energy, syllable duration, and coarticulation between syllables are the main factors affecting naturalness; among them, pitch and duration play the most important roles. The time-domain PSOLA scheme provides a way to modify the pitch and duration of a speech segment in the time domain, which makes it possible to adjust prosody at the word level and the sentence level when synthesizing Chinese by waveform concatenation. Acoustic analysis of news broadcast speech provides the theoretical basis for the prosodic rules in this system. This paper presents the flowchart of the new Chinese text-to-speech system, the results of the acoustic analysis of news broadcast speech, the prosodic rules of the new system, and the evaluation results of its speech quality.
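As an illustration of how time-domain PSOLA can change pitch and duration independently, the following is a minimal sketch and not the system described in the paper. It assumes the pitch marks of the segment are already known, and the names `td_psola`, `marks`, `pitch_scale` and `time_scale` are hypothetical. Two-period Hann-windowed grains are cut around each pitch mark and overlap-added at new positions: the spacing between grains sets the pitch, and repeating or skipping grains sets the duration.

```python
import math


def hann(n):
    """Hann window with n points (n >= 3), zero at both ends."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def td_psola(signal, marks, pitch_scale=1.0, time_scale=1.0):
    """Overlap-add pitch-synchronous grains at new positions.

    signal      -- list of samples (floats)
    marks       -- sorted sample indices of pitch marks (assumed given)
    pitch_scale -- > 1 raises the pitch, < 1 lowers it
    time_scale  -- > 1 lengthens the segment, < 1 shortens it
    """
    periods = [b - a for a, b in zip(marks, marks[1:])]
    out_len = int(len(signal) * time_scale)
    out = [0.0] * out_len
    t_out = marks[0] * time_scale  # position of the first synthesis mark
    while t_out < out_len:
        # Map the synthesis time back to the input and take the nearest mark.
        t_in = t_out / time_scale
        k = min(range(len(marks)), key=lambda i: abs(marks[i] - t_in))
        period = periods[min(k, len(periods) - 1)]
        window = hann(2 * period + 1)  # grain spans two local pitch periods
        centre = int(round(t_out))
        for j in range(-period, period + 1):
            src, dst = marks[k] + j, centre + j
            if 0 <= src < len(signal) and 0 <= dst < out_len:
                out[dst] += signal[src] * window[j + period]
        # Grain spacing controls the output pitch.
        t_out += period / pitch_scale
    return out


if __name__ == "__main__":
    # Toy input: a sine with a 50-sample period and marks once per period.
    sig = [math.sin(2 * math.pi * n / 50) for n in range(1000)]
    marks = list(range(50, 1000, 50))
    longer = td_psola(sig, marks, time_scale=2.0)
    print(len(sig), len(longer))
```

In a concatenative system, a routine of this kind would be applied to each stored syllable waveform, with the scale factors supplied by the prosodic rules. Real speech also needs pitch-mark detection and handling of unvoiced segments, which this sketch leaves out.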