The neighborhood system is a granular computing model for numerical information. It can analyze numerical data directly, which extends the range of application of classical rough set theory. Existing incremental algorithms for neighborhood systems mostly analyze changes from the algebraic view. Taking the information view instead, this paper analyzes how the conditional entropy changes when a batch of samples is added, and shows that the change is determined by the inconsistent neighborhoods introduced by the new batch, which in turn cause the reduct to change. On this basis, a batch incremental attribute reduction algorithm under the information view is proposed. The algorithm only needs to find the newly produced inconsistent neighborhoods and reduce them together with the new samples, which avoids repeated reduction, greatly reduces the amount of computation, and allows the updated reduct to be obtained quickly. Finally, the complexity of the algorithm is analyzed, and experiments verify the effectiveness and efficiency of the proposed algorithm.
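The role of inconsistent neighborhoods can be illustrated with a small sketch. It is not the paper's algorithm, and it rests on assumptions the abstract does not state: a Chebyshev-distance neighborhood of radius `delta`, and one commonly used form of neighborhood conditional entropy, the negated mean of log(|n(x) ∩ [x]_D| / |n(x)|), where n(x) is the neighborhood of x and [x]_D is its decision class. All function names and the toy data are hypothetical. Under these assumptions, a sample whose neighborhood is consistent (every neighbor shares its decision label) contributes log 1 = 0, so only inconsistent neighborhoods can change the entropy when a batch arrives.

```python
import math

def neighborhood(data, i, attrs, delta):
    """Indices of samples within Chebyshev distance delta of sample i on attrs."""
    xi = data[i]
    return [j for j, xj in enumerate(data)
            if max((abs(xi[a] - xj[a]) for a in attrs), default=0.0) <= delta]

def inconsistent_samples(data, labels, attrs, delta):
    """Samples whose neighborhood contains at least one different decision label."""
    result = []
    for i in range(len(data)):
        nb = neighborhood(data, i, attrs, delta)
        if any(labels[j] != labels[i] for j in nb):
            result.append(i)
    return result

def conditional_entropy(data, labels, attrs, delta):
    """Assumed form: -(1/n) * sum_i log(|n(x_i) ∩ [x_i]_D| / |n(x_i)|)."""
    total = 0.0
    for i in range(len(data)):
        nb = neighborhood(data, i, attrs, delta)
        same = sum(1 for j in nb if labels[j] == labels[i])
        total -= math.log(same / len(nb))  # zero for a consistent neighborhood
    return total / len(data)

# Toy data: two well-separated clusters, each with a single decision label.
data = [(0.10, 0.20), (0.15, 0.25), (0.80, 0.90), (0.85, 0.95)]
labels = [0, 0, 1, 1]
attrs, delta = [0, 1], 0.1

before = conditional_entropy(data, labels, attrs, delta)

# A batch of one new sample that lands in the first cluster with the other label.
data_new = data + [(0.12, 0.22)]
labels_new = labels + [1]

after = conditional_entropy(data_new, labels_new, attrs, delta)
changed = inconsistent_samples(data_new, labels_new, attrs, delta)

print(before)   # 0.0: every neighborhood is consistent
print(after)    # roughly 0.38
print(changed)  # [0, 1, 4]
```

In this toy run the second cluster (samples 2 and 3) keeps consistent neighborhoods and contributes nothing to the entropy before or after the update, which is the intuition behind restricting the recomputation to the new inconsistent neighborhoods and the new samples. The sketch recomputes everything from scratch for clarity, so it does not show the savings an incremental method would achieve.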