To meet the need for interchangeable display of Mongolian text across its multiple scripts (Traditional Mongolian, TM; New Mongolian, NM; and the Todo script), this paper studies an automatic transliteration method based on phrase-based statistical machine translation. First, a parallel corpus of 60,000 sentence pairs covering the three scripts was built manually. Second, using the inter-word space information in TM-Todo (or TM-NM) sentence pairs, TM function words are forcibly attached to the preceding word; this yields a sentence-aligned and word-aligned bi-script corpus, from which a statistical translation model and a language model are trained. Finally, the Moses decoder is used to perform automatic transliteration between the two scripts. In bidirectional TM-Todo conversion experiments using 300 development sentences and test sentences, the BLEU scores reached 57.82% and 58.03%, respectively, nearly double the best BLEU score previously reported for Chinese-Mongolian machine translation (29.86%).
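The preprocessing step that attaches TM function words to the preceding word can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: the function name `attach_function_words`, the romanized function-word list, and the `_` joiner are all hypothetical placeholders, since the abstract does not specify them.

```python
from typing import Iterable, List, Set


def attach_function_words(
    tokens: Iterable[str],
    function_words: Set[str],
    joiner: str = "_",  # hypothetical joiner; the paper does not name one
) -> List[str]:
    """Merge each function word into the token that precedes it.

    A function word at the very start of the sentence has no preceding
    word, so it is kept as a separate token.
    """
    merged: List[str] = []
    for tok in tokens:
        if tok in function_words and merged:
            # Glue the function word onto the previous token.
            merged[-1] = merged[-1] + joiner + tok
        else:
            merged.append(tok)
    return merged


# Illustrative romanized placeholders, not the list used in the paper.
FUNCTION_WORDS = {"yin", "dur", "iyar"}

sentence = "mongGol yin bicig dur".split()
print(attach_function_words(sentence, FUNCTION_WORDS))
# -> ['mongGol_yin', 'bicig_dur']
```

After merging, the TM side has one token per stem-plus-function-word unit, which is what makes it plausible to align words one-to-one with the other script before training the translation model.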