This paper proposes a method for automatically mining traffic information from the Internet using information retrieval and information extraction techniques. First, several search engines are queried simultaneously and the traffic reports they return are downloaded. The reports are then preprocessed, including redundancy removal and text classification. Next, information is extracted on the basis of word segmentation and part-of-speech tagging. Finally, the extracted information is stored in a database and analyzed statistically. As an example, the method is applied to the analysis of information on road traffic accidents caused by driver fatigue, with fairly satisfactory results.
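The abstract gives no implementation details, so the following is only a minimal sketch of the stages after downloading, under stated assumptions. Redundancy removal is approximated by hashing whitespace-normalized text, classification by a hypothetical keyword list, and extraction by regular expressions standing in for word segmentation and part-of-speech tagging. All function names, the table layout, and the sample reports are invented for illustration and are not taken from the paper.

```python
import hashlib
import re
import sqlite3

# Hypothetical keyword list; the paper's actual classifier is not described.
FATIGUE_KEYWORDS = ("疲劳驾驶", "驾驶疲劳", "fatigue")

# Made-up toy reports, not real accident data.
SAMPLE_REPORTS = [
    "3月2日4时,一货车司机疲劳驾驶追尾,2人死亡。",
    "3月2日4时,一货车司机 疲劳驾驶追尾,2人死亡。",  # repost with extra whitespace
    "3月5日14时,客车驾驶员疲劳驾驶冲出护栏,1人死亡。",
    "3月6日9时,某路段因大雾发生追尾,3人死亡。",  # not fatigue-related
]


def fingerprint(text):
    """Hash the text with whitespace removed so trivial reposts collapse."""
    normalized = re.sub(r"\s+", "", text).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def deduplicate(reports):
    """Keep only the first report seen for each fingerprint."""
    seen, unique = set(), []
    for report in reports:
        key = fingerprint(report)
        if key not in seen:
            seen.add(key)
            unique.append(report)
    return unique


def is_fatigue_report(text):
    """Keyword-based stand-in for the text classification step."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in FATIGUE_KEYWORDS)


def extract_record(text):
    """Regex stand-in for segmentation + POS tagging: hour and death count."""
    hour = re.search(r"(\d{1,2})时", text)
    deaths = re.search(r"(\d+)人死亡", text)
    return (
        int(hour.group(1)) if hour else None,
        int(deaths.group(1)) if deaths else None,
    )


def build_database(reports):
    """Deduplicate, classify, extract, and load records into in-memory SQLite."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accident (hour INTEGER, deaths INTEGER)")
    rows = [extract_record(r) for r in deduplicate(reports) if is_fatigue_report(r)]
    conn.executemany("INSERT INTO accident VALUES (?, ?)", rows)
    return conn


def accidents_by_hour(conn):
    """Return (hour, accident count, total deaths) per hour of day."""
    query = (
        "SELECT hour, COUNT(*), SUM(deaths) FROM accident "
        "GROUP BY hour ORDER BY hour"
    )
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    print(accidents_by_hour(build_database(SAMPLE_REPORTS)))
```

On the toy input, the repost and the fog-related report are dropped, and the script prints `[(4, 1, 2), (14, 1, 1)]`. A real system would replace the regular expressions with an actual segmenter and tagger, which the abstract names but does not specify.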