The phrases used in phrase-based statistical machine translation are not constrained by syntactic boundaries, so the resulting translations contain many expressions that violate syntactic structure. To address this, this paper analyzes the respective strengths and weaknesses of the phrase-based model and the syntax-based model, and proposes integrating bilingual syntactic phrase pairs, obtained from syntactic parsing results, into the phrase-based model to improve translation quality. First, the two sides of the parallel corpus are parsed separately, and monolingual Chinese and English syntactic phrases are extracted from the parses. Next, bilingual syntactic phrase pairs are obtained using the phrase table and the consistent-phrase principle. Finally, these pairs are combined with the original phrase table of the phrase translation model. Experimental results show that bilingual syntactic phrases improve the quality of phrase-based statistical machine translation: with bilingual parallel training corpora of 100,000 and 1,000,000 sentence pairs, the BLEU score increases by 0.56% and 0.62%, respectively.
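The extraction step summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each parse tree has already been reduced to a set of constituent spans (inclusive token indices), that a word alignment is available as a set of (source index, target index) links, and that the "consistent-phrase principle" is the standard alignment-consistency condition used in phrase extraction. The function names `is_consistent`, `bilingual_syntactic_pairs`, and `mark_syntactic`, and the idea of storing a boolean flag on phrase-table entries, are hypothetical and chosen only for illustration.

```python
from itertools import product


def is_consistent(src_span, tgt_span, alignment):
    """Check the alignment-consistency condition for one span pair.

    A pair is consistent if no alignment link crosses the boundary of the
    pair (one end inside, the other outside) and at least one link lies
    inside it. Spans are (start, end) with inclusive token indices.
    """
    s_lo, s_hi = src_span
    t_lo, t_hi = tgt_span
    has_link = False
    for s, t in alignment:
        src_inside = s_lo <= s <= s_hi
        tgt_inside = t_lo <= t <= t_hi
        if src_inside != tgt_inside:
            return False  # a link leaves the candidate pair
        if src_inside:
            has_link = True
    return has_link


def bilingual_syntactic_pairs(src_tokens, tgt_tokens, src_spans, tgt_spans, alignment):
    """Pair up constituents from both parses that are alignment-consistent.

    src_spans / tgt_spans are assumed to be the constituent spans read off
    the source and target parse trees.
    """
    pairs = []
    for s_span, t_span in product(sorted(src_spans), sorted(tgt_spans)):
        if is_consistent(s_span, t_span, alignment):
            src = " ".join(src_tokens[s_span[0]:s_span[1] + 1])
            tgt = " ".join(tgt_tokens[t_span[0]:t_span[1] + 1])
            pairs.append((src, tgt))
    return pairs


def mark_syntactic(phrase_table, syntactic_pairs):
    """Hypothetical merge step: flag phrase-table entries that are syntactic.

    phrase_table maps (source phrase, target phrase) to a score. The source
    text does not say how the merge is done; a flag is only one possibility.
    """
    syntactic = set(syntactic_pairs)
    return {
        pair: {"score": score, "syntactic": pair in syntactic}
        for pair, score in phrase_table.items()
    }


# Toy example with a monotone one-to-one alignment (illustrative data only).
src_tokens = ["我", "喜欢", "红", "苹果"]
tgt_tokens = ["I", "like", "red", "apples"]
alignment = {(0, 0), (1, 1), (2, 2), (3, 3)}
src_spans = {(0, 0), (2, 3), (1, 3), (0, 3)}
tgt_spans = {(0, 0), (2, 3), (1, 3), (0, 3)}

for pair in bilingual_syntactic_pairs(src_tokens, tgt_tokens, src_spans, tgt_spans, alignment):
    print(pair)
```

In this toy case only span pairs covering the same aligned words survive, so a candidate such as "我" paired with "I like red apples" is rejected because the link for "喜欢" would cross the pair boundary. How the surviving pairs are weighted once merged into the phrase table is left open here, since the abstract does not specify it.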