The k-NN classification algorithm is widely used in text mining, pattern recognition, and related fields. Its number of nearest neighbors, k, directly affects classification accuracy: when k is too small, k-NN is sensitive to noise, and when k is too large, accuracy also degrades. To address this, a fast method for selecting k is proposed. The method first constructs a candidate set of k values and then quickly selects k from that set. Experimental results on 100 public datasets show that the proposed algorithm selects an effective number of neighbors and is an effective, promising method.
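The abstract does not say how the candidate set is built or what makes the selection fast, so the following is only a minimal sketch of the general two-step idea under stated assumptions, not the paper's algorithm. It assumes the candidates are the odd values up to roughly the square root of the sample count, and that k is scored by leave-one-out accuracy. The speed-up shown here is a common one: each point's neighbors are sorted once, and the running vote count is reused for every candidate k. The names `candidate_ks` and `select_k` are hypothetical.

```python
import math
from collections import Counter


def candidate_ks(n_samples):
    """Assumed candidate set: odd k from 1 up to about sqrt(n_samples)."""
    k_max = max(1, int(math.sqrt(n_samples)))
    return list(range(1, k_max + 1, 2))


def select_k(X, y, candidates):
    """Choose k by leave-one-out accuracy, sorting each point's neighbors once.

    X is a list of numeric feature lists, y a list of labels.
    Returns (best_k, {k: leave-one-out accuracy}).
    """
    n = len(X)
    k_max = max(candidates)
    wanted = set(candidates)
    correct = {k: 0 for k in candidates}

    for i in range(n):
        # Rank every other point by squared Euclidean distance to point i,
        # keeping only as many neighbors as the largest candidate needs.
        neighbors = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(X[i], X[j])),
        )[:k_max]

        # Walk outward once, updating the votes incrementally, and record a
        # prediction whenever the neighbor count reaches a candidate k.
        votes = Counter()
        for rank, j in enumerate(neighbors, start=1):
            votes[y[j]] += 1
            if rank in wanted:
                predicted = votes.most_common(1)[0][0]
                if predicted == y[i]:
                    correct[rank] += 1

    # Highest accuracy wins; ties are broken in favor of the smaller k.
    best_k = max(candidates, key=lambda k: (correct[k], -k))
    return best_k, {k: correct[k] / n for k in candidates}
```

A call such as `select_k(X, y, candidate_ks(len(X)))` returns the chosen k together with the per-candidate accuracies. Because one sorted neighbor list serves all candidates, adding more candidate values costs little beyond the initial distance computation.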