With the rapid expansion of academic research, the number of articles published in academic journals has increased sharply, which makes it difficult to improve retrieval effectiveness in Chinese journal databases. This paper introduces a method for clustering the documents retrieved from a Chinese database. The method is based on keyword co-occurrence analysis: it extracts the keywords of each document, derives a keyword co-occurrence matrix by statistical methods, and clusters the keywords with a hierarchical clustering algorithm to form a hierarchical tree. The documents are then classified according to the clustering results. The method can classify the search results of Chinese journal databases so that users can accurately locate the articles they are interested in.
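The pipeline described in the abstract (keyword extraction, co-occurrence matrix, hierarchical clustering, document classification) can be illustrated with a minimal sketch. The abstract does not state which similarity measure, linkage rule, or stopping criterion the authors use, so the Jaccard-style similarity, average linkage, and the `threshold` value below are assumptions, and all function names are hypothetical. Keywords are assumed to be already extracted for each document.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(docs):
    """Count keyword frequencies and keyword-pair co-occurrences over documents."""
    freq, pair = Counter(), Counter()
    for keywords in docs:
        unique = sorted(set(keywords))
        freq.update(unique)
        pair.update(combinations(unique, 2))  # keys are sorted tuples
    return freq, pair


def similarity(a, b, freq, pair):
    """Jaccard-style similarity (an assumption, not taken from the paper)."""
    if a == b:
        return 1.0
    both = pair[tuple(sorted((a, b)))]
    return both / (freq[a] + freq[b] - both)


def cluster_keywords(docs, threshold=0.3):
    """Agglomerative clustering with average linkage.

    Merging stops once the best average similarity drops below `threshold`.
    Returns the final keyword clusters and the merge history (the tree).
    """
    freq, pair = cooccurrence(docs)
    clusters = [[k] for k in sorted(freq)]
    merges = []
    while len(clusters) > 1:
        best, best_ij = -1.0, None
        for i, j in combinations(range(len(clusters)), 2):
            total = sum(similarity(a, b, freq, pair)
                        for a in clusters[i] for b in clusters[j])
            score = total / (len(clusters[i]) * len(clusters[j]))
            if score > best:
                best, best_ij = score, (i, j)
        if best < threshold:
            break
        i, j = best_ij
        merges.append((clusters[i], clusters[j], best))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters, merges


def assign_documents(docs, clusters):
    """Label each document with the cluster that shares most of its keywords."""
    labels = []
    for keywords in docs:
        overlaps = [len(set(keywords) & set(c)) for c in clusters]
        labels.append(overlaps.index(max(overlaps)))
    return labels


# Toy search result: each document is represented by its keyword list.
docs = [
    ["information retrieval", "clustering", "co-occurrence"],
    ["clustering", "co-occurrence", "hierarchical tree"],
    ["digital library", "metadata"],
    ["metadata", "digital library", "cataloguing"],
]
clusters, merges = cluster_keywords(docs)
labels = assign_documents(docs, clusters)
print(clusters)
print(labels)
```

On this toy input the first two documents end up in one group and the last two in another. The `merges` list records each merge and its similarity, and is one simple way to represent the hierarchical tree. For real search results, a library routine such as SciPy's hierarchical clustering would likely be a better fit than this quadratic pure-Python loop.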