To improve the performance of hypertext classification and reduce algorithmic complexity, this paper proposes a weighted hypersphere support vector machine (SVM) algorithm suited to hypertext classification. The algorithm combines document content information and hyperlink information into a single document feature vector. In the conventional hypersphere SVM, when the number of samples differs across classes, training classification errors tend to fall on the classes with fewer samples. The proposed algorithm uses weighting factors to compensate for the adverse effect of this class imbalance on generalization performance. Test results on benchmark datasets show that the algorithm reduces the complexity of the quadratic programming problem and improves the classification performance of the classifier.
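The two ingredients named in the abstract, a combined content-plus-hyperlink feature vector and per-class weighting factors, can be illustrated with a minimal sketch. The abstract does not give the actual weighting formula or feature encoding, so everything below is an assumption made for illustration: the function names `combine_features` and `class_weights` are hypothetical, the features are simply concatenated, and the weights use a common inverse-frequency heuristic rather than the paper's own definition.

```python
from collections import Counter


def combine_features(content_vec, link_vec):
    """Concatenate content features and hyperlink features into one vector.

    Assumption: plain concatenation; the paper may combine them differently.
    """
    return list(content_vec) + list(link_vec)


def class_weights(labels):
    """Return one weight per class, larger for classes with fewer samples.

    Assumption: inverse-frequency heuristic n / (k * n_c), where n is the
    number of samples, k the number of classes, and n_c the class size.
    This is not taken from the paper.
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * m) for c, m in counts.items()}


# Toy, made-up data: three "course" pages and one "faculty" page.
labels = ["course", "course", "course", "faculty"]
weights = class_weights(labels)

# One document: three content-term features followed by two link features.
x = combine_features([0.2, 0.0, 0.7], [1.0, 0.0])

print(weights)
print(x)
```

In a weighted hypersphere formulation, a weight of this kind would typically scale the penalty on each sample's slack variable, so that errors on the smaller class cost more during training. How the paper applies its weighting factors inside the quadratic program is not stated in the abstract.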