【Affiliation】:
The School of Information Engineering and Automation, Kunming University of Science and Technology, Co
Excerpt from the paper
A phrase treebank is an important resource for natural language processing research and practical applications, but no such resource is available for Vietnamese. This paper presents a method for constructing a Vietnamese phrase treebank by fusing Vietnamese grammatical features with an improved probabilistic context-free grammar (PCFG). The method automatically analyzes Vietnamese phrase structure trees, which addresses the problem of constructing the treebank. First, a Vietnamese grammatical feature set is established by analyzing Vietnamese grammatical features. Then, the grammar rule set of the PCFG model is obtained from manually annotated Vietnamese phrase trees. Finally, the grammatical feature set is fused into the improved PCFG model as a supplement, completing the construction of the Vietnamese phrase treebank. Experimental results show that the proposed PCFG model reaches an accuracy of 89.12% on Vietnamese phrase treebank construction, a clear improvement over the conventional PCFG model and the maximum entropy method.
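The second step, obtaining a PCFG rule set from manually annotated phrase trees, can be illustrated with a minimal sketch. It assumes bracketed trees and plain relative-frequency (maximum likelihood) estimation; the paper's improved PCFG and its fusion with grammatical features are not reproduced here. The toy trees, the labels (`S`, `NP`, `VP`, `N`, `V`), and all function names are hypothetical and chosen only for illustration.

```python
from collections import Counter


def parse_tree(text):
    """Parse a bracketed tree string into nested (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        assert tokens[pos] == "(", "expected an opening bracket"
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())      # nested constituent
            else:
                children.append(tokens[pos])  # terminal word
                pos += 1
        pos += 1  # consume ")"
        return (label, children)

    return read()


def count_rules(tree, counts):
    """Count every rule LHS -> RHS that occurs in one tree."""
    label, children = tree
    rhs = tuple(c[0] if isinstance(c, tuple) else c for c in children)
    counts[(label, rhs)] += 1
    for child in children:
        if isinstance(child, tuple):
            count_rules(child, counts)


def estimate_pcfg(tree_strings):
    """Relative-frequency estimate: P(A -> b) = count(A -> b) / count(A)."""
    counts = Counter()
    for text in tree_strings:
        count_rules(parse_tree(text), counts)
    lhs_totals = Counter()
    for (lhs, _), n in counts.items():
        lhs_totals[lhs] += n
    return {rule: n / lhs_totals[rule[0]] for rule, n in counts.items()}


# Hypothetical toy annotations, not taken from the paper's data.
TOY_TREES = [
    "(S (NP (N tôi)) (VP (V ăn) (NP (N cơm))))",
    "(S (NP (N mèo)) (VP (V ngủ)))",
]

probs = estimate_pcfg(TOY_TREES)
for (lhs, rhs), p in sorted(probs.items()):
    print(f"{lhs} -> {' '.join(rhs)}  [{p:.3f}]")
```

For each left-hand side, the estimated probabilities sum to 1, which is the defining constraint of a PCFG. A grammatical feature set would enter as an additional source of evidence on top of these rule probabilities; how the authors combine the two is not specified in this excerpt.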