One important approach to Web mining is to combine data mining techniques with Web technologies to perform Web data mining. Traditional hyperlink-based ranking algorithms for web search rely purely on link analysis and suffer from problems such as "topic drift". This study addresses two shortcomings of the HITS algorithm: because HITS considers only the hyperlink structure among Web pages and ignores page content, its results drift away from the query topic, and it is also affected by multiple mutually reinforcing relationships among topics. To address these, we improve HITS and propose the V-HITS algorithm, which adds an analysis of Web page content and assigns different weights to different links. Finally, experiments confirm the effectiveness of V-HITS: it remedies these shortcomings of HITS and makes authoritative pages easier to find.
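The abstract does not state the V-HITS update rules, so the following is only a minimal sketch of the general idea it describes: a HITS-style iteration in which each link carries its own weight. The function names `weighted_hits` and `normalize`, the toy graph, and the assumption that link weights come from some content-relevance score are all illustrative and are not taken from the paper.

```python
import math


def normalize(scores):
    """Scale a score dict to unit Euclidean length (no-op if all zero)."""
    norm = math.sqrt(sum(v * v for v in scores.values()))
    if norm == 0:
        return dict(scores)
    return {k: v / norm for k, v in scores.items()}


def weighted_hits(edges, iterations=50):
    """HITS-style iteration over weighted links.

    edges maps (source, target) -> non-negative weight. In this sketch the
    weight is ASSUMED to reflect how relevant the link is to the query topic;
    the paper's actual weighting scheme is not given in the abstract.
    """
    nodes = sorted({n for edge in edges for n in edge})
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority score: weighted sum of hub scores of pages linking in.
        new_auth = {n: 0.0 for n in nodes}
        for (src, dst), w in edges.items():
            new_auth[dst] += w * hub[src]
        # Hub score: weighted sum of authority scores of pages linked to.
        new_hub = {n: 0.0 for n in nodes}
        for (src, dst), w in edges.items():
            new_hub[src] += w * new_auth[dst]
        auth = normalize(new_auth)
        hub = normalize(new_hub)
    return hub, auth


# Toy graph (hypothetical): pages "a" and "b" both link to "c" and "d",
# but the links to "d" are treated as off-topic and get a low weight.
edges = {
    ("a", "c"): 1.0,
    ("b", "c"): 1.0,
    ("a", "d"): 0.1,
    ("b", "d"): 0.1,
}
hub, auth = weighted_hits(edges)
```

With uniform weights, "c" and "d" would receive identical authority scores because their link structure is the same. Lowering the weight of the off-topic links is what lets the ranking favor "c", which is the kind of effect the abstract attributes to combining content analysis with link weighting.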