In traditional citation analysis, the relationship between documents is usually determined by citation links: if document A cites document B, B is taken to have contributed something to A. However, the size of that contribution and the reason for the citation are hard to pin down. This paper uses a topic-model approach that treats the citing paper, the citation text, and the cited document as probability distributions over topics, and uses full-text extraction to analyze both the reasons for citation and the contribution of each citation. The paper first introduces the research background and significance and explains the basic concepts. It then describes the citation extraction method and the construction of the topic model with Labeled-LDA. Finally, the experimental section builds document citation networks for different topics and visualizes them with a tool.
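The idea of treating documents as probability distributions over topics, and then building one citation network per topic, can be illustrated with a small sketch. Everything here is an assumption made for illustration: the topic names, the hand-written distributions (standing in for the output of a Labeled-LDA model), the `topic_overlap` and `topic_networks` helpers, and the use of per-topic shared probability mass as a stand-in "contribution" score. The abstract does not say which measure the paper actually uses.

```python
# Hypothetical topic labels and per-document topic distributions.
# In practice these would come from a trained Labeled-LDA model.
TOPICS = ["retrieval", "topic models", "networks"]
DOCS = {
    "A": [0.6, 0.3, 0.1],
    "B": [0.5, 0.1, 0.4],
    "C": [0.0, 0.8, 0.2],
}
# (citing, cited) pairs, assumed to come from a citation extraction step.
CITATIONS = [("A", "B"), ("A", "C")]


def topic_overlap(p, q):
    """Per-topic shared probability mass between two distributions.

    The sum of the result equals 1 minus the total variation distance,
    so it ranges from 0 (disjoint topics) to 1 (identical distributions).
    """
    return [min(a, b) for a, b in zip(p, q)]


def topic_networks(docs, citations, topics, threshold=0.2):
    """Build one edge list per topic.

    An edge (citing, cited, weight) is kept for a topic only when the
    two documents share at least `threshold` probability mass on it.
    """
    networks = {topic: [] for topic in topics}
    for citing, cited in citations:
        overlap = topic_overlap(docs[citing], docs[cited])
        for topic, weight in zip(topics, overlap):
            if weight >= threshold:
                networks[topic].append((citing, cited, weight))
    return networks


NETWORKS = topic_networks(DOCS, CITATIONS, TOPICS)
```

With these placeholder values, the citation from A to B appears only in the "retrieval" network and the citation from A to C only in the "topic models" network, which is one way a single citation link can be attributed to a specific topic. The resulting edge lists could then be passed to a graph visualization tool.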