Research on the Potential for Personalizing Web Search

Source: Harbin Institute of Technology
People use Web search engines to look for information on the Web. However, current Web search engines do not fully satisfy the needs of different individuals who have unique search goals for the same query. We examine the variation in the Web search results that individuals find relevant for the same query, using explicit relevance judgments and clicks as evidence of relevance. We also identify truly ambiguous queries and examine how both explicit relevance judgments and implicit measures of relevance vary for these queries.

In this thesis, we propose an enhanced framework for personalized search. It contains a new component that can assist users in deciding whether a query is a potential candidate for personalization. The component considers the potential category of a query and then applies personalization techniques according to the category to which the query belongs. The new framework also considers the differences in interest among individuals in the results of the same query, which is useful for identifying the type of individuals who benefit from personalization.

We then investigate the variation in explicit relevance judgments for the Web search results of the same query from the viewpoint of different query categories. The results show great variation in judgment among the individuals who evaluate the search results of the same query. For queries that have relevance judgments from multiple people, it is also possible to find the best possible result ranking for an individual and for groups of different sizes. As the number of people in a group increases, the gap between user satisfaction with the individual search result rankings and with the group search result ranking increases. This observed gap is the potential for personalizing search. It is quantified using the normalized discounted cumulative gain (NDCG), and it confirms the potential benefit of personalizing search.
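The gap described above can be illustrated with a small sketch. This is not the thesis's implementation: the function names, the toy judgments, and the rule for building the group ranking (sort results by total gain across the group, which maximizes the group's summed DCG) are all assumptions made for illustration.

```python
import math
from itertools import combinations

def dcg(gains):
    """Discounted cumulative gain of gains listed in ranked order."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))

def ndcg(ranking, judgments):
    """NDCG of a ranking (list of result ids) under one user's gains."""
    ideal = dcg(sorted(judgments.values(), reverse=True))
    if ideal == 0:
        return 0.0
    return dcg([judgments.get(r, 0) for r in ranking]) / ideal

def best_group_ranking(group):
    """Assumed heuristic: order results by their total gain over the group."""
    results = sorted({r for judgments in group for r in judgments})
    return sorted(results, key=lambda r: -sum(j.get(r, 0) for j in group))

def potential_curve(users, max_size):
    """Mean NDCG of the group ranking, averaged over all groups of each size."""
    curve = {}
    for k in range(1, max_size + 1):
        scores = []
        for group in combinations(users, k):
            ranking = best_group_ranking(group)
            scores.append(sum(ndcg(ranking, j) for j in group) / k)
        curve[k] = sum(scores) / len(scores)
    return curve

# Hypothetical judgments: three users grade results a, b, c for one query
# (2 = highly relevant, 1 = relevant, 0 = not relevant).
users = [
    {"a": 2, "b": 0, "c": 1},
    {"a": 0, "b": 2, "c": 1},
    {"a": 1, "b": 2, "c": 0},
]
curve = potential_curve(users, 3)
for size, score in curve.items():
    print(size, round(score, 3))
```

A group of size one always gets its own ideal ranking, so the curve starts at 1.0. The distance between 1.0 and the value at a larger group size is the room that a single shared ranking leaves for personalization.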
We also examine the potential for personalizing search from the perspective of different query categories. First, we classify queries as clear or unclear and investigate their potential for personalization. We found that unclear queries have greater potential for personalization than clear queries. This may result from the various search intents people have for the same unclear query when they look for information on the Web. Second, we further classify unclear queries as broad or ambiguous and examine their potential for personalization separately. We found that ambiguous queries generally have greater potential for personalization than broad queries, while clear queries have the least. The results of our experiment reveal the need to examine the potential for personalization from the perspective of different query categories. This opens up a new research dimension in which personalization algorithms are applied to different query categories in different ways.

We then explore how different people perceive the search results for the same query by mining clicks, which act as a proxy for relevance, from a large query log. To investigate the variability in what people are searching for when they issue the same query, we select from the query log only those queries for which we have clicks from at least eight individuals. Extensive analysis of the clicks on the search results for the same query reveals variation in implicit judgments across individuals, as shown by the potential for personalization curve constructed for different group sizes. The curve shows an observable gap between individual preferences and the best group preferences for the results of the same query, and this gap increases as the number of people in the group increases.
This tells us how much room there is to improve search results through personalization.

Finally, we examine the potential for personalization of truly ambiguous queries by extracting them from the query log. First, we use user entropy and its derivatives as input features for classifiers that distinguish informational queries from ambiguous queries, which are characterized by similar click distributions. We then use the potential for personalization curve to investigate the potential of ambiguous queries for personalization at different group sizes, as measured by the average normalized discounted cumulative gain (NDCG). The results show that ambiguous queries have greater potential for personalization in all of the low, medium, and high frequency classes. We therefore suggest that identifying query ambiguity is the way forward for personalization. If we can identify truly ambiguous queries beforehand, we can devote our resources to personalizing only those queries rather than applying personalization uniformly to all queries.
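The abstract does not define user entropy, so the following is only a hedged sketch of how entropy features of this kind could be computed from a click log. It assumes that the overall click entropy of a query is the Shannon entropy of its clicked URLs, and that user entropy is the same quantity computed per user and then averaged. The function names and the toy click data are hypothetical.

```python
import math
from collections import Counter, defaultdict

def entropy(counts):
    """Shannon entropy (in bits) of a collection of non-negative counts."""
    counts = list(counts)
    total = sum(counts)
    return sum(-(c / total) * math.log2(c / total) for c in counts if c > 0)

def click_entropy(clicks):
    """Entropy of clicked URLs for one query; clicks is a list of (user, url)."""
    return entropy(Counter(url for _, url in clicks).values())

def mean_user_entropy(clicks):
    """Assumed definition: click entropy per user, averaged over users."""
    per_user = defaultdict(Counter)
    for user, url in clicks:
        per_user[user][url] += 1
    values = [entropy(c.values()) for c in per_user.values()]
    return sum(values) / len(values)

# Hypothetical logs for two queries with the same overall click distribution.
ambiguous = [("u1", "a"), ("u1", "a"), ("u2", "b"), ("u2", "b")]
informational = [("u1", "a"), ("u1", "b"), ("u2", "a"), ("u2", "b")]

for name, log in [("ambiguous", ambiguous), ("informational", informational)]:
    print(name, click_entropy(log), mean_user_entropy(log))
```

In this toy data both queries have the same overall click entropy, but the per-user value separates them: each user of the first query sticks to one result, while each user of the second spreads clicks over several. Features like these could then be passed to any standard classifier.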