To address the discretization of conditional attributes whose values are interval data, a new discretization algorithm based on rough set theory is proposed. First, the concepts of upper and lower approximation in rough set theory are extended to describe the distance and similarity relations between interval-number objects, and a similarity threshold is defined to determine which objects are similar. To obtain a good discretization with the fewest partition intervals, and to choose the similarity threshold reasonably, the concept of rough entropy is introduced. By computing the upper and lower approximation rough entropy values of the attribute being discretized and determining the similarity matrix of the interval-number objects under that attribute, the final discretization of the attribute is obtained. Finally, an example of tobacco leaf quality grade evaluation is given, and the experimental results show that the algorithm is effective.
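The similarity matrix and threshold step can be illustrated with a minimal sketch. The abstract does not state the paper's similarity measure or its rough entropy formula, so the overlap-ratio measure below, the function names, and the sample intervals are all assumptions made for illustration only; rough entropy is not modeled here.

```python
from typing import List, Set, Tuple

Interval = Tuple[float, float]  # (lower bound, upper bound)


def interval_similarity(a: Interval, b: Interval) -> float:
    """ASSUMED measure (not the paper's): overlap length divided by the
    length of the smallest interval covering both a and b."""
    overlap = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    hull = max(a[1], b[1]) - min(a[0], b[0])
    if hull == 0.0:
        return 1.0  # both intervals are the same single point
    return overlap / hull


def similarity_matrix(values: List[Interval]) -> List[List[float]]:
    """Pairwise similarity of all interval-number objects under one attribute."""
    return [[interval_similarity(a, b) for b in values] for a in values]


def similarity_classes(values: List[Interval], threshold: float) -> List[Set[int]]:
    """For each object, the indices of objects whose similarity to it
    reaches the threshold (a hypothetical similarity class)."""
    matrix = similarity_matrix(values)
    return [{j for j, s in enumerate(row) if s >= threshold} for row in matrix]


# Hypothetical attribute values, chosen only to exercise the code.
sample = [(0.0, 2.0), (1.0, 3.0), (5.0, 6.0)]
classes = similarity_classes(sample, threshold=0.3)
```

With this assumed measure, raising the threshold splits the objects into more and smaller similarity classes, while lowering it merges them. That is the trade-off the abstract says rough entropy is used to settle.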