Enhancing Chinese Word Embeddings from Relevant Derivative Meanings of Main-Components in Characters

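Source: The 18th China National Conference on Computational Linguistics and the 2019 Annual Conference of the Chinese Information Processing Society of China | Citations: 0 | Uploaded by: oskarguan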


Authors: Xinyu Su, Wei Yang, Junyi Wang

Affiliation: School of Computer Science and Technology, University of Science and Technology of China, Hefei 230027

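Venue: The 18th China National Conference on Computational Linguistics and the 2019 Annual Conference of the Chinese Information Processing Society of China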

Publication date: 2019, Issue 8

Keywords: Relevant derivative meaning; Component level; Enhanced word embedding


Excerpt from the paper

Word embeddings have a significant impact on natural language processing. In morpheme writing systems, most Chinese word embedding models take the word as the basic unit or directly use the internal structure of words. However, these models still neglect the rich relevant derivative meanings in the internal structure of Chinese characters. Based on our observations, the relevant derivative meanings of the main-components in Chinese characters are very helpful for learning better Chinese word embeddings. In this paper, we focus on employing the relevant derivative meanings of the main-components in Chinese characters to train and enhance Chinese word embeddings. To this end, we propose two main-component enhanced word embedding models, MCWE-SA and MCWE-HA, which use an attention mechanism to incorporate the relevant derivative meanings of the main-components during training. Our models enhance the precision of word embeddings at a fine-grained level without generating additional vectors. Experiments on word similarity and syntactic analogy tasks are conducted to validate the feasibility of our models. The results show that our models improve on most baselines in the similarity task and achieve an improvement of nearly 3% on a Chinese analogical reasoning dataset compared with the state-of-the-art model.
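The abstract says the models weight the derivative meanings of main-components with an attention mechanism, but this excerpt does not give the formulation. The following is only a minimal sketch of generic dot-product attention over a set of meaning vectors, not the MCWE-SA or MCWE-HA definition. The function name `attend`, the toy vectors, and the averaging step that combines the word vector with the attended context are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attend(word_vec, meaning_vecs):
    """Hypothetical helper: weight each meaning vector by its similarity
    to the word vector, then average the attended context with the word
    vector. The combination rule is an assumption, not the paper's."""
    weights = softmax([dot(word_vec, m) for m in meaning_vecs])
    dim = len(word_vec)
    context = [
        sum(w * m[i] for w, m in zip(weights, meaning_vecs))
        for i in range(dim)
    ]
    enhanced = [(a + b) / 2 for a, b in zip(word_vec, context)]
    return weights, enhanced


# Toy example: the first meaning vector is aligned with the word vector,
# so it receives the larger attention weight.
word_vec = [1.0, 0.0]
meaning_vecs = [[1.0, 0.0], [0.0, 1.0]]
weights, enhanced = attend(word_vec, meaning_vecs)
```

In this sketch the output keeps the dimensionality of the input word vector, which loosely mirrors the abstract's claim that no additional vectors are generated. How the actual models achieve that is not stated in this excerpt.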

