To avoid information overload and keep readers from getting lost in large volumes of useless information, information retrieval is becoming increasingly important. Automatic text categorization is one of the most important tools in information retrieval. This paper proposes a method for the automatic classification of Chinese texts, called the Association Rules Aided Genetic Computing Method (ARGCM). An association-rules-aided genetic algorithm for text classification is proposed and implemented. Unlike earlier approaches, the encoding of the fitness function draws on association rules, which are mined by the ARGACM algorithm proposed in this paper. A series of basic genetic procedures is implemented and tested, including the AGACMRouletteSelection, AGACMXover, and AGACMbinaryMutatio procedures. Experimental results show that the new ARG algorithm performs far better than traditional algorithms: after 50 generations of ARG evolution, the vector AB Vector reaches a score as high as 3513.6.
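The abstract names three basic genetic procedures (roulette selection, crossover, and binary mutation) without giving their details. As a rough illustration only, the following sketch shows generic textbook versions of these operators in Python. It is not the paper's implementation: the function names, the bit-list encoding, the one-point crossover, the mutation rate, and the caller-supplied `fitness` function are all assumptions made for this example. In particular, the association-rule-based fitness described in the abstract is not reproduced here.

```python
import random


def roulette_select(population, fitnesses, rng):
    """Pick one individual with probability proportional to its fitness."""
    total = sum(fitnesses)
    if total <= 0:
        # Degenerate case: no positive fitness, fall back to a uniform pick.
        return rng.choice(population)
    pick = rng.uniform(0, total)
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if pick < running:
            return individual
    return population[-1]  # guard against floating-point rounding


def one_point_crossover(a, b, rng):
    """Swap the tails of two equal-length bit lists at a random cut point."""
    point = rng.randint(1, len(a) - 1)  # requires len(a) >= 2
    return a[:point] + b[point:], b[:point] + a[point:]


def binary_mutation(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in bits]


def evolve(population, fitness, generations, rate=0.01, seed=0):
    """Run a plain generational GA; `fitness` is a hypothetical placeholder."""
    rng = random.Random(seed)
    for _ in range(generations):
        scores = [fitness(ind) for ind in population]
        next_gen = []
        while len(next_gen) < len(population):
            p1 = roulette_select(population, scores, rng)
            p2 = roulette_select(population, scores, rng)
            c1, c2 = one_point_crossover(p1, p2, rng)
            next_gen.append(binary_mutation(c1, rate, rng))
            next_gen.append(binary_mutation(c2, rate, rng))
        population = next_gen[:len(population)]
    return max(population, key=fitness)
```

A call such as `evolve(initial_population, fitness=sum, generations=50)` would evolve bit lists toward having more ones; in a classification setting the placeholder `sum` would be replaced by a task-specific fitness function.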