Extracting subtopic features for a specific event helps reveal the focal points and details that users currently care about, and allows a deeper exploration of the semantic features of the event's topics. This paper uses the LDA topic model to model the topics of Weibo (microblog) posts about a specific event. It designs a topic top-level difference degree and a fusion degree to merge similar subtopics, and uses prior knowledge in a principled way to determine the number of subtopics, avoiding the bias of earlier approaches that set the number of topics from expert knowledge. It also designs a selection algorithm that labels each subtopic with candidate keywords and representative microblogs, so that the type and content of each subtopic are described more accurately.
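The abstract does not define the difference degree or the fusion degree, so the following is only a minimal sketch of the general idea of merging similar subtopics after LDA. It assumes each subtopic is already available as a probability distribution over a shared vocabulary, and it substitutes the Jensen-Shannon divergence for the paper's measures. The function names, the threshold value, and the toy vocabulary are all hypothetical and do not come from the paper.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions.

    Assumption: used here as a stand-in for the paper's difference degree,
    which the abstract does not define.
    """
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def merge_similar_topics(topics, threshold=0.1):
    """Greedily merge topic-word distributions that are closer than `threshold`.

    topics: list of probability lists over the same vocabulary.
    Returns a list of (merged_distribution, member_indices) pairs.
    The threshold is a hypothetical tuning parameter, not a value from the paper.
    """
    clusters = [(list(t), [i]) for i, t in enumerate(topics)]
    merged = True
    while merged:
        merged = False
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                if js_divergence(clusters[a][0], clusters[b][0]) < threshold:
                    dist_a, idx_a = clusters[a]
                    dist_b, idx_b = clusters[b]
                    wa, wb = len(idx_a), len(idx_b)
                    # Size-weighted average keeps the result a valid distribution.
                    avg = [(wa * x + wb * y) / (wa + wb)
                           for x, y in zip(dist_a, dist_b)]
                    clusters[a] = (avg, idx_a + idx_b)
                    del clusters[b]
                    merged = True
                    break
            if merged:
                break
    return clusters


def top_keywords(dist, vocab, k=3):
    """Return the k highest-probability words as candidate keywords."""
    ranked = sorted(range(len(vocab)), key=lambda i: dist[i], reverse=True)
    return [vocab[i] for i in ranked[:k]]


# Toy example: topics 0 and 1 are near-duplicates, topic 2 is distinct.
vocab = ["rescue", "donation", "rumor", "police"]
topics = [
    [0.60, 0.30, 0.05, 0.05],
    [0.55, 0.35, 0.05, 0.05],
    [0.05, 0.05, 0.50, 0.40],
]
for dist, members in merge_similar_topics(topics, threshold=0.1):
    print(members, top_keywords(dist, vocab, k=2))
```

In this sketch the number of subtopics falls out of the merging step rather than being fixed in advance, which is one plausible reading of how fusion could reduce reliance on a hand-picked topic count. The paper's actual procedure for choosing that number from prior knowledge may differ.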