Paper excerpt
Most existing clustering algorithms assume that the dataset remains unchanged. In many applications, however, the dataset changes over time. To address this, a new three-way decision soft incremental clustering algorithm is proposed. Each cluster is represented as an interval set, whose upper bound, boundary, and lower bound correspond to the positive, boundary, and negative regions produced by three-way decisions, and an initial clustering algorithm based on representative points is proposed. The newly added data are pre-clustered in the same way, which eliminates the influence of the data processing order on the final clustering result. To quickly find the regions similar to the new data, a search tree of representative points is built, and strategies for searching and updating the tree are given. Incremental clustering is then completed using the three-way decision strategy. Experimental results show that the proposed incremental clustering algorithm is effective.
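The abstract does not state the decision rule itself, so the following is only a minimal sketch of how a three-way decision could assign one new point to existing clusters. It assumes a distance-based rule with two hypothetical thresholds, `alpha` and `beta` (with `alpha < beta`), and represents each cluster by a plain list of representative points. The function name, the thresholds, and the linear scan (used here in place of the representative-point search tree) are illustrative assumptions, not the authors' method.

```python
from math import dist


def three_way_assign(point, representatives, alpha, beta):
    """Assign `point` to clusters with a three-way decision.

    representatives: dict mapping a cluster id to its list of
        representative points (tuples of coordinates).
    alpha, beta: hypothetical distance thresholds, alpha < beta.

    Returns (positive, boundary): the ids of clusters whose positive
    region or boundary region receives the point. Clusters in neither
    list treat the point as belonging to their negative region.
    """
    positive, boundary = [], []
    for cluster_id, reps in representatives.items():
        # Distance from the point to the nearest representative.
        nearest = min(dist(point, rep) for rep in reps)
        if nearest <= alpha:
            positive.append(cluster_id)   # close enough: accept
        elif nearest <= beta:
            boundary.append(cluster_id)   # uncertain: defer
        # otherwise: reject (negative region), nothing recorded
    return positive, boundary


# Two toy clusters, "A" and "B", in the plane.
reps = {"A": [(0.0, 0.0), (1.0, 0.0)], "B": [(5.0, 0.0)]}

near_a = three_way_assign((0.5, 0.5), reps, alpha=1.0, beta=3.0)
between = three_way_assign((3.0, 0.0), reps, alpha=1.0, beta=3.0)
far = three_way_assign((10.0, 10.0), reps, alpha=1.0, beta=3.0)
```

In this toy run, the first point falls in the positive region of "A" only, the second lands in the boundary regions of both clusters (the "soft" case, where one object may relate to several clusters), and the third is rejected by every cluster, which in an incremental setting could be a cue to start a new cluster.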