Identifying user intent through query refinements

Source: Chinese Journal of Library and Information Science
Purpose: In this paper, we attempt to use query refinements to identify users’ search intents and seek a method for intent clustering based on real-world query data.
Design/methodology/approach: An experiment was conducted to analyze selected search sessions from the America Online (AOL) query logs with a two-stage approach. The first stage identifies the underlying intent by combining query co-occurrence information with query expression similarity. The second stage clusters the identified results by constructing query vectors through random walks on a Markov graph.
Findings: Average correctness for identifying search intent is 0.74. Precision, recall and F-score values for intent clustering are 0.73, 0.72 and 0.71, respectively. The results indicate that combining session co-occurrence information with query expression similarity can further filter noise, and that our clustering method is more suitable for sparse data.
Research limitations: We use the time-out threshold (15-minute) method to group queries into sessions, but a user may have multiple search goals at the same time, and such multi-task behavior is hard to capture in a session defined by time alone.
Practical implications: This study provides insights into ways of understanding users’ search intents by analyzing their queries and refinements from a new perspective. The results will help search engine developers to identify user intents.
Originality/value: We propose a new method to identify users’ search intents by combining session co-occurrence information and query expression similarity, and a new method for clustering sparse data.
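
The 15-minute time-out rule mentioned under the research limitations can be illustrated with a minimal sketch. It is not the authors' implementation: the record layout `(user_id, timestamp, query)`, the function name `split_sessions`, and the choice to treat a gap of exactly 15 minutes as still belonging to the same session are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Time-out threshold named in the abstract; everything else here is assumed.
SESSION_TIMEOUT = timedelta(minutes=15)


def split_sessions(records, timeout=SESSION_TIMEOUT):
    """Group (user_id, timestamp, query) records into time-out sessions.

    A new session starts when the user changes or when the gap since that
    user's previous query is longer than `timeout`.
    Returns a list of (user_id, [query, ...]) tuples.
    """
    sessions = []
    prev_user, prev_time = None, None
    # Sort by user, then by time, so each user's queries are contiguous.
    for user_id, ts, query in sorted(records, key=lambda r: (r[0], r[1])):
        if user_id != prev_user or ts - prev_time > timeout:
            sessions.append((user_id, []))
        sessions[-1][1].append(query)
        prev_user, prev_time = user_id, ts
    return sessions


if __name__ == "__main__":
    # Hypothetical sample records, not taken from the AOL logs.
    sample = [
        ("u1", datetime(2006, 3, 1, 10, 0), "jaguar"),
        ("u1", datetime(2006, 3, 1, 10, 5), "jaguar car price"),
        ("u1", datetime(2006, 3, 1, 10, 40), "weather boston"),
    ]
    for user, queries in split_sessions(sample):
        print(user, queries)
```

Because the split depends only on elapsed time, two interleaved search goals issued within 15 minutes of each other end up in the same session, which is the limitation the abstract points out.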