Learning multilingual sentence embeddings usually requires large-scale parallel corpora, which are difficult to obtain. We propose a novel self-learning approach that learns multilingual sentence embeddings from monolingual corpora alone. Our assumption is that, irrespective of language, sentences appearing in similar contexts are similar. We therefore first train monolingual sentence embeddings for different languages with shared parameters as initialization. Then we iteratively extract similar sentence pairs and exchange their positions regardless of language. Through their relations to their new contexts, we predict the similarity between the sentences of each extracted pair. Our experiments show that the proposed approach outperforms existing unsupervised approaches and is competitive with supervised ones.
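The iterative extract-and-exchange loop described above can be sketched as follows. This is a minimal toy illustration, not the paper's implementation: the mean-pooled shared embedding table standing in for the trained encoder, the corpora, the similarity threshold, and all function names are assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8

# Toy monolingual corpora: each "sentence" is a sequence of token ids.
corpus_en = [rng.integers(0, 100, size=5) for _ in range(20)]
corpus_fr = [rng.integers(0, 100, size=5) for _ in range(20)]

# A single embedding table shared across languages stands in for the
# shared-parameter initialization of the monolingual encoders.
shared_emb = rng.normal(size=(100, dim))

def encode(sent):
    # Mean-pool shared token embeddings into a unit-norm sentence vector.
    v = shared_emb[sent].mean(axis=0)
    return v / np.linalg.norm(v)

def extract_similar_pairs(src, tgt, threshold=0.0):
    # Match each source sentence to its most similar target sentence
    # by cosine similarity; keep matches above a (hypothetical) threshold.
    src_vecs = np.stack([encode(s) for s in src])
    tgt_vecs = np.stack([encode(t) for t in tgt])
    sims = src_vecs @ tgt_vecs.T
    pairs = []
    for i, row in enumerate(sims):
        j = int(row.argmax())
        if row[j] >= threshold:
            pairs.append((i, j, float(row[j])))
    return pairs

for step in range(3):
    pairs = extract_similar_pairs(corpus_en, corpus_fr)
    # Exchange the positions of matched sentences regardless of language,
    # so each sentence is subsequently trained against the other's context.
    for i, j, _ in pairs:
        corpus_en[i], corpus_fr[j] = corpus_fr[j], corpus_en[i]
```

In the actual approach, the exchanged sentences would then be fed back into context-prediction training so that their relations to the new contexts supervise the cross-lingual similarity; the sketch only shows the pairing and exchange mechanics.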