Rough set theory is an effective approach to attribute reduction, but it cannot handle real-valued data directly. To address this problem, this paper first introduces the concepts of neighborhood and covering, and on that basis constructs algorithms for reduction within a covering and reduction between coverings (attribute reduction). Then, by examining the relationships among the samples within a neighborhood, it defines repulsive elements. The presence of repulsive elements can cause the positive region of the decision to be computed incorrectly, yielding attribute dependencies that do not reflect the actual data table, so a method for decomposing repulsive elements is given. Finally, experiments on four real-valued gene expression datasets show that the proposed attribute reduction algorithm is effective and achieves higher classification accuracy than other existing algorithms.
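To make the notions of neighborhood, positive region, and attribute dependency concrete, the following is a minimal sketch using the standard neighborhood rough set formulation. It is not the paper's algorithm: the abstract does not give its exact definitions, so the Euclidean distance, the radius `delta`, the greedy forward selection, and all function names are assumptions made for illustration, and the handling of repulsive elements is not modeled.

```python
from math import dist

def neighborhood(samples, i, attrs, delta):
    """Indices of samples within distance delta of sample i, using only attrs."""
    xi = [samples[i][a] for a in attrs]
    return [j for j, s in enumerate(samples)
            if dist(xi, [s[a] for a in attrs]) <= delta]

def positive_region(samples, labels, attrs, delta):
    """Samples whose entire neighborhood carries the same decision label."""
    return [i for i in range(len(samples))
            if all(labels[j] == labels[i]
                   for j in neighborhood(samples, i, attrs, delta))]

def dependency(samples, labels, attrs, delta):
    """Fraction of samples that fall in the positive region."""
    return len(positive_region(samples, labels, attrs, delta)) / len(samples)

def greedy_reduct(samples, labels, delta):
    """Assumed forward selection: add the attribute that raises the
    dependency most, and stop as soon as no attribute improves it."""
    remaining = list(range(len(samples[0])))
    chosen, best = [], 0.0
    while remaining:
        score, attr = max(
            ((dependency(samples, labels, chosen + [a], delta), a)
             for a in remaining),
            key=lambda g: (g[0], -g[1]))  # ties go to the lower index
        if score <= best:
            break
        chosen.append(attr)
        remaining.remove(attr)
        best = score
    return chosen

# Toy data (hypothetical): attribute 0 separates the classes, attribute 1 does not.
samples = [(0.10, 0.50), (0.15, 0.52), (0.80, 0.51), (0.85, 0.49)]
labels = [0, 0, 1, 1]
print(dependency(samples, labels, [0], 0.2))
print(dependency(samples, labels, [1], 0.2))
print(greedy_reduct(samples, labels, 0.2))
```

In this sketch a sample drops out of the positive region as soon as its neighborhood contains a sample with a different decision, which is the kind of within-neighborhood conflict that the abstract's discussion of relationships among neighboring samples is concerned with.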