To address the low efficiency of sentiment word recognition and sentiment lexicon construction, this paper proposes a method for automatically extracting a set of seed (benchmark) sentiment words. Seed words are selected on three criteria: word frequency, the domain-specific sentiment orientation of the word, and the sentiment intensity of the word. Sentiment words are then recognized, and their sentiment orientation judged, from the difference in semantic similarity between a target word and the positive and negative seed word sets. This lets the machine complete most of the work automatically, which improves efficiency and lowers the cost of building sentiment lexicons for different domains. In experiments on a data set of 71,061 reviews from Jingdong Mall and 1,736 reviews from Joyo.com, the method achieved a recall of 76.36% and a precision of 76.94% for sentiment word recognition, and an accuracy of 62.70% for sentiment orientation judgment.
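The similarity-based step can be illustrated with a minimal sketch. The abstract does not say which semantic similarity measure, word representation, or decision threshold the authors use, so everything below is an assumption made for illustration: cosine similarity over hand-made two-dimensional vectors, the mean similarity to each seed set, and a hypothetical `threshold` of 0.2 on the gap between the two means. The names `cosine`, `mean_similarity`, `classify`, and `VECTORS` are likewise invented here and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mean_similarity(word, seeds, vectors):
    """Average similarity between `word` and every word in a seed set."""
    return sum(cosine(vectors[word], vectors[s]) for s in seeds) / len(seeds)


def classify(word, pos_seeds, neg_seeds, vectors, threshold=0.2):
    """Label a target word by comparing its similarity to the two seed sets.

    Assumption: a word counts as a sentiment word only when the gap between
    its positive and negative similarity reaches `threshold`; otherwise it is
    treated as neutral (not a sentiment word).
    """
    pos = mean_similarity(word, pos_seeds, vectors)
    neg = mean_similarity(word, neg_seeds, vectors)
    if abs(pos - neg) < threshold:
        return "neutral"
    return "positive" if pos > neg else "negative"


# Toy vectors invented for this example; a real system would derive them
# from the review corpus or from a semantic resource.
VECTORS = {
    "good": (1.0, 0.1),
    "great": (0.9, 0.2),
    "bad": (-1.0, 0.1),
    "poor": (-0.9, 0.2),
    "excellent": (0.8, 0.1),
    "terrible": (-0.8, 0.1),
    "screen": (0.0, 1.0),
}
POS_SEEDS = ["good", "great"]
NEG_SEEDS = ["bad", "poor"]

if __name__ == "__main__":
    for target in ("excellent", "terrible", "screen"):
        print(target, classify(target, POS_SEEDS, NEG_SEEDS, VECTORS))
```

In this toy setup a word that is equally close to both seed sets, such as "screen", is not treated as a sentiment word, which mirrors the idea of using the difference between the two similarities for both recognition and orientation judgment.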