Automatic Multi-Document Summarization Based on Keyword Density and Sentence-Word Graphs

Source: Journal of Shanghai Jiaotong University (English Edition)
As a fundamental and effective tool for document understanding and organization, multi-document summarization enables better information services by creating concise and informative reports for large collections of documents. In this paper, we propose a sentence-word two-layer graph algorithm combined with keyword density to generate multi-document summaries, known as Graph & Keyword(p). Traditional graph methods for multi-document summarization consider the influence of sentences and words only across all documents rather than within individual documents. Therefore, we construct multiple word graphs and extract the right keywords from each document to modify the sentence graph and to improve the significance and richness of the summary. Meanwhile, because words differ in importance across documents, we propose using keyword density so that summaries provide rich content while using a small number of words. The experimental results show that the Graph & Keyword(p) method outperforms state-of-the-art systems when tested on the DUC 2004 data set.
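
The abstract does not give the exact definition of keyword density or the construction of the sentence-word graph, so the following Python sketch is only an illustration of the general idea under stated assumptions. It assumes that keyword density means the fraction of a sentence's tokens that are keywords, that per-document keywords can be approximated by the most frequent non-stopwords, and that sentences are selected greedily under a word budget. The function names (`top_keywords`, `keyword_density`, `summarize`) and the stopword list are hypothetical and do not come from the paper; the graph-ranking layer is not modeled at all.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on", "by", "with"}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def top_keywords(document, k=3):
    """Hypothetical per-document keyword extractor: the k most frequent non-stopwords."""
    counts = Counter(t for t in tokenize(document) if t not in STOPWORDS)
    return {word for word, _ in counts.most_common(k)}


def keyword_density(sentence, keywords):
    """Assumed definition: share of the sentence's tokens that are keywords."""
    tokens = tokenize(sentence)
    if not tokens:
        return 0.0
    return sum(t in keywords for t in tokens) / len(tokens)


def summarize(documents, max_words=20, k=3):
    """Greedily pick the densest sentences that fit within max_words."""
    # Keywords are extracted per document, then pooled.
    keywords = set()
    for doc in documents:
        keywords |= top_keywords(doc, k)

    # Naive sentence splitting on terminal punctuation.
    sentences = [
        s.strip()
        for doc in documents
        for s in re.split(r"(?<=[.!?])\s+", doc)
        if s.strip()
    ]

    ranked = sorted(sentences, key=lambda s: keyword_density(s, keywords), reverse=True)
    summary, used = [], 0
    for sentence in ranked:
        length = len(tokenize(sentence))
        if used + length <= max_words:
            summary.append(sentence)
            used += length
    return summary
```

Ranking by density rather than by raw keyword count favors short sentences that carry many keywords, which matches the stated goal of rich content in few words. A fuller treatment would combine this score with the sentence-graph ranking that the abstract describes.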