Semantic composition of distributed representations for query subtopic mining

Source: Frontiers of Information Technology & Electronic Engineering (English edition)
Inferring query intent is important in information retrieval tasks. Query subtopic mining aims to find possible subtopics for a given query in order to represent its potential intents. The task is challenging because queries are short. Learning distributed representations of words or sequences of words has developed rapidly in recent years and has had a great impact on many fields. It is still not clear, however, whether distributed representations are effective in alleviating the challenges of query subtopic mining. In this paper, we examine and compare the main semantic composition strategies for distributed representations in query subtopic mining. Specifically, we focus on two types of distributed representations: the paragraph vector, which directly represents word sequences of arbitrary length, and word vector composition. We thoroughly investigate the impact of the semantic composition strategy and of the type of data used for learning distributed representations. Experiments were conducted on a public dataset offered by the National Institute of Informatics Testbeds and Community for Information Access Research. The empirical results show that distributed semantic representations can achieve outstanding performance for query subtopic mining compared with traditional semantic representations. Further insights are reported as well.
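The abstract does not say which composition strategies the paper evaluates, so the following is only a minimal sketch of one common form of word vector composition: averaging the word vectors of a short text and comparing texts by cosine similarity. The toy three-dimensional vectors, the example query, and the function names `compose_average`, `cosine`, and `rank_candidates` are all assumptions made for illustration; they are not taken from the paper.

```python
import math

# Hypothetical 3-dimensional word vectors (hand-picked toy values, not trained).
TOY_VECTORS = {
    "apple": [0.9, 0.1, 0.0],
    "iphone": [0.8, 0.2, 0.1],
    "price": [0.1, 0.9, 0.0],
    "fruit": [0.0, 0.1, 0.9],
    "nutrition": [0.1, 0.2, 0.8],
}


def compose_average(text, vectors):
    """Average the vectors of in-vocabulary words; None if no word is known."""
    known = [vectors[w] for w in text.lower().split() if w in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(query, candidates, vectors):
    """Sort candidate subtopic strings by similarity to the query, best first."""
    q = compose_average(query, vectors)
    scored = []
    for cand in candidates:
        c = compose_average(cand, vectors)
        score = cosine(q, c) if q is not None and c is not None else 0.0
        scored.append((cand, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    ranking = rank_candidates(
        "apple fruit", ["iphone price", "fruit nutrition"], TOY_VECTORS
    )
    for candidate, score in ranking:
        print(candidate, round(score, 3))
```

Averaging ignores word order, which is one reason a paragraph vector learned directly for the whole sequence is a natural alternative to compare against. In practice the toy dictionary would be replaced by vectors learned from a corpus.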