In this talk, we will first give an overview of research on data selection for machine translation domain adaptation. Then, we will introduce a recently proposed method that uses semi-supervised convolutional neural networks (CNNs) to select in-domain training data. This approach is particularly effective when only tiny amounts of in-domain data are available, which makes fine-grained, topic-dependent translation adaptation possible. The method performs significantly better than several state-of-the-art data selection methods on several public domain test sets. Finally, we will discuss ongoing work that extends the CNN-based method to select in-domain data that also has good translation quality.