Continuous Outlier Monitoring on Uncertain Data Streams

Source: Journal of Computer Science and Technology (English edition) | Citations: 0
Outlier detection on data streams is an important task in data mining, and it becomes more challenging when the data are uncertain. This paper studies outlier detection on uncertain data streams. We propose Continuous Uncertain Outlier Detection (CUOD), which uses pruning to decide quickly whether an uncertain element is an outlier and thereby improves efficiency. We further propose a pruning approach, Probability Pruning for Continuous Uncertain Outlier Detection (PCUOD), to reduce the detection cost. PCUOD is a method for estimating the outlier probability that effectively reduces the amount of calculation, and the cost of its incremental algorithm meets the demands of uncertain data streams. Finally, we propose a new method for handling parameter-variable queries in CUOD, which enables different queries to execute concurrently. To the best of our knowledge, this is the first work on outlier detection over uncertain data streams that can handle parameter-variable queries simultaneously. We verify our methods on both real and synthetic data. The results show that they reduce both the required storage and the running time.
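The abstract does not spell out how an outlier probability is computed or pruned, so the following is only a minimal sketch under stated assumptions, not the paper's CUOD or PCUOD algorithm. It assumes a common distance-based model: each neighbor of an element (within some radius) exists independently with a known probability, and the element is reported as an outlier when the probability that fewer than `k` neighbors exist is at least a threshold `theta`. The names `outlier_probability`, `is_outlier`, `k`, and `theta` are hypothetical. The sketch shows the general idea of deciding from cheap bounds before, or instead of, finishing the exact calculation.

```python
from typing import Sequence


def outlier_probability(neighbor_probs: Sequence[float], k: int) -> float:
    """Exact Pr[fewer than k neighbors exist] under independence (assumed model)."""
    # dist[j] = probability that exactly j of the neighbors seen so far exist,
    # tracked only for j < k because larger counts cannot make an outlier.
    dist = [1.0] + [0.0] * (k - 1)
    for p in neighbor_probs:
        for j in range(k - 1, 0, -1):
            dist[j] = dist[j] * (1.0 - p) + dist[j - 1] * p
        dist[0] *= 1.0 - p
    return sum(dist)


def is_outlier(neighbor_probs: Sequence[float], k: int, theta: float) -> bool:
    """Decide Pr[fewer than k neighbors exist] >= theta, pruning where possible."""
    # Prune 1: with fewer than k candidate neighbors the probability is 1.
    if len(neighbor_probs) < k:
        return True
    # Prune 2: Markov's inequality gives Pr[N >= k] <= E[N] / k,
    # hence Pr[N < k] >= 1 - E[N] / k, a cheap lower bound.
    if 1.0 - sum(neighbor_probs) / k >= theta:
        return True
    # Otherwise run the exact recurrence, stopping early: each added neighbor
    # can only lower Pr[N < k], so once it drops below theta the answer is fixed.
    dist = [1.0] + [0.0] * (k - 1)
    for p in neighbor_probs:
        for j in range(k - 1, 0, -1):
            dist[j] = dist[j] * (1.0 - p) + dist[j - 1] * p
        dist[0] *= 1.0 - p
        if sum(dist) < theta:
            return False
    return True


if __name__ == "__main__":
    print(outlier_probability([0.5, 0.5], k=2))
    print(is_outlier([0.9, 0.9, 0.9], k=2, theta=0.5))
```

In this sketch the early exit means a densely surrounded element is often classified after inspecting only a few neighbors, which is one simple way an estimate or bound can reduce the amount of calculation the abstract refers to.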