Closed-Set Chinese Word Segmentation Based on Convolutional Neural Network Model

Source: The 16th China National Conference on Computational Linguistics and the 5th International Symposium on Natural Language Processing Based on Naturally Annotated Big Data
Abstract
This paper proposes a neural model for closed-set Chinese word segmentation. The model follows the character-based approach, which assigns a class label to each character indicating its relative position within the word it belongs to. To do so, it first constructs shallow representations of characters by fusing unigram and bigram information within a limited context window via an element-wise maximum operator, and then builds up deep representations from wider contextual information with a deep convolutional network. Experimental results show that our method achieves better closed-set performance than several state-of-the-art systems.
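The two-stage encoder described in the abstract can be illustrated with a short sketch. The code below is not the authors' released implementation; it assumes PyTorch, a standard four-way positional tag set (B/M/E/S), and illustrative hyper-parameters (embedding size, kernel width, number of convolutional layers), and it simplifies the limited-context-window fusion to an element-wise maximum over the unigram and bigram embeddings at each character position.

```python
# A minimal sketch (not the paper's official code) of the two-stage encoder:
# shallow fusion of unigram/bigram embeddings via element-wise maximum,
# followed by a deep convolutional stack for per-character tag prediction.
import torch
import torch.nn as nn

class ClosedSetSegmenter(nn.Module):
    def __init__(self, n_unigrams, n_bigrams, emb_dim=128,
                 conv_channels=128, n_conv_layers=4, n_tags=4):
        super().__init__()
        # Character unigram and bigram embedding tables.
        self.uni_emb = nn.Embedding(n_unigrams, emb_dim)
        self.bi_emb = nn.Embedding(n_bigrams, emb_dim)
        # Deep convolutional stack; stacking kernel-3 layers widens the
        # receptive field to capture broader context for each position.
        layers, in_ch = [], emb_dim
        for _ in range(n_conv_layers):
            layers += [nn.Conv1d(in_ch, conv_channels, kernel_size=3, padding=1),
                       nn.ReLU()]
            in_ch = conv_channels
        self.convs = nn.Sequential(*layers)
        # Per-character classifier over positional tags (B/M/E/S).
        self.out = nn.Linear(conv_channels, n_tags)

    def forward(self, uni_ids, bi_ids):
        # uni_ids, bi_ids: (batch, seq_len) index tensors.
        u = self.uni_emb(uni_ids)            # (batch, seq_len, emb_dim)
        b = self.bi_emb(bi_ids)              # (batch, seq_len, emb_dim)
        # Shallow representation: element-wise maximum fusion.
        shallow = torch.max(u, b)            # (batch, seq_len, emb_dim)
        # Conv1d expects (batch, channels, seq_len).
        deep = self.convs(shallow.transpose(1, 2)).transpose(1, 2)
        return self.out(deep)                # (batch, seq_len, n_tags)

# Usage sketch: score tags for a batch of two 10-character sentences.
model = ClosedSetSegmenter(n_unigrams=5000, n_bigrams=200000)
uni = torch.randint(0, 5000, (2, 10))
bi = torch.randint(0, 200000, (2, 10))
print(model(uni, bi).shape)  # torch.Size([2, 10, 4])
```

In a closed-set setting the vocabulary sizes above would come from the training data only; decoding the per-character scores into a segmentation (e.g. with greedy or Viterbi search over the tag sequence) is omitted here.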
Other publications
Most state-of-the-art models for named entity recognition (NER) rely on recurrent neural networks (RNNs), in particular long short-term memory (LSTM). Those models learn local and global features automatic…
Conference
Word deletion (WD) errors can lead to poor comprehension of the meaning of source translated sentences in phrase-based statistical machine translation (SMT), and have a critical impact on the adequacy of…
Answer selection is a crucial subtask of the open-domain question answering problem. In this paper, we introduce the Bi-directional Gated Memory Network (BGMN) to model the interactions between question a…
In the last decades, named entity recognition has been extensively studied with various supervised learning approaches that depend on massive labeled data. In this paper, we focus on person name recognition in…
Enabling a computer to understand a document so that it can answer comprehension questions is a central, yet unsolved goal of Natural Language Processing, so reading comprehension of text is an important…
Generating textual entailment (GTE) is a recently proposed task to study how to infer a sentence from a given premise. Current sequence-to-sequence GTE models are prone to produce invalid sentences when…
Recently, the long short-term memory language model (LSTM LM) has received tremendous interest from both the language and speech communities, due to its superiority in modelling long-term dependency. Moreover, integ…
Tibetan syntactic functional chunk parsing is aimed at identifying syntactic constituents of Tibetan sentences. In this paper, based on the Tibetan syntactic functional chunk description system, we propos…
We consider the task of entity linking over question answering pairs (QA-pairs). In conventional approaches to entity linking, all the entities, whether in one sentence or not, are treated the same. We foc…
Obtaining bilingual parallel data from multilingual websites is a long-standing research problem, which is very beneficial for resource-scarce languages. In this paper, we present an approach for obtain…