As the Internet continues to grow, search engines have become an effective tool for finding information online. Behind every search engine lies information retrieval technology, in particular web page ranking algorithms. Web page ranking comprises importance ranking and relevance ranking. Our research shows that, although the two kinds of ranking rest on different criteria, both can be studied by building a suitable stochastic process model. For importance ranking, we analyze how users browse web pages and establish a framework based on Markov skeleton processes. Within this framework we assess how reasonably three different stochastic process models simulate user behavior, and we design a new family of algorithms, named BrowseRank, that compute page importance from users' browsing behavior. For relevance ranking, we focus on the rank aggregation problem and establish a supervised learning framework based on Markov chains. Making the traditional methods supervised turns a problem that was hard to solve into one that is easy to learn: the original NP-hard problem is converted into a semidefinite programming problem, which improves efficiency.
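The importance-ranking idea in the abstract, scoring pages from observed browsing behavior through a stochastic process, can be illustrated with a deliberately simplified sketch. This is not the BrowseRank algorithm or the Markov skeleton process framework itself: it assumes a plain discrete-time Markov chain, ignores how long users stay on each page, and adds a uniform reset probability so that the chain has a unique stationary distribution. The function names, the `reset` parameter, and the session format (a list of page identifiers in visit order) are all hypothetical choices made for this illustration.

```python
from collections import defaultdict


def estimate_transitions(sessions):
    """Count observed page-to-page moves and normalise each row to probabilities."""
    counts = defaultdict(lambda: defaultdict(int))
    pages = set()
    for session in sessions:
        pages.update(session)
        for src, dst in zip(session, session[1:]):
            counts[src][dst] += 1
    pages = sorted(pages)
    probs = {}
    for src in pages:
        total = sum(counts[src].values())
        # Pages that users never left in the logs get an empty row.
        probs[src] = {dst: c / total for dst, c in counts[src].items()} if total else {}
    return pages, probs


def stationary_importance(sessions, reset=0.15, iters=200, tol=1e-12):
    """Approximate the stationary distribution by power iteration.

    `reset` is an assumed probability of jumping to a uniformly chosen page.
    """
    pages, probs = estimate_transitions(sessions)
    n = len(pages)
    score = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        new = {p: 0.0 for p in pages}
        leaked = 0.0  # probability mass to spread uniformly over all pages
        for src in pages:
            mass = score[src]
            row = probs[src]
            if row:
                for dst, pr in row.items():
                    new[dst] += (1.0 - reset) * mass * pr
                leaked += reset * mass
            else:
                leaked += mass  # no observed exit: redistribute all of it
        for p in pages:
            new[p] += leaked / n
        delta = sum(abs(new[p] - score[p]) for p in pages)
        score = new
        if delta < tol:
            break
    return score


if __name__ == "__main__":
    # Hypothetical browsing sessions over three pages.
    logs = [["a", "b", "c"], ["a", "c"], ["b", "c"], ["c", "a"]]
    for page, value in sorted(stationary_importance(logs).items()):
        print(page, round(value, 4))
```

A continuous-time model would also weight each page by how long users stay on it, which is one way browsing logs carry more information than the link graph alone; that refinement is left out here to keep the sketch short.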