[Objective] To study an automatic patent classification method for TRIZ applications by constructing a personalized classification system. [Method] Based on topic models, a personalized TRIZ-oriented classification system was built at the macro, meso, and micro levels. Different classification features and algorithms were combined, and the combination with the highest classification accuracy was selected to build the initial classifier. The classifier was then optimized by smoothing the imbalanced data and reducing the feature dimensionality, completing the automatic classification of patents. [Results] A TRIZ-oriented personalized classification system was constructed semi-automatically, and patents were classified automatically on the basis of that system. At a medium data scale (on the order of thousands of records), the comprehensive evaluation index of the classification results reached 90.2%. [Limitations] The method is not suitable for patent classification when the data set is small (on the order of hundreds of records), and its effectiveness at larger scales (on the order of tens of thousands of records) has not yet been verified. [Conclusion] For medium-sized patent data sets, the method can quickly build a TRIZ-oriented classification system and classify patents automatically.
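The selection step in the method (try each pairing of feature type and algorithm, keep the most accurate one) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy corpus, the labels `mechanical` and `thermal`, the two feature extractors, and the two classifiers are hypothetical stand-ins, since the abstract does not say which features, algorithms, or evaluation split the authors used. The smoothing of imbalanced data and the feature reduction step are not covered here.

```python
import math
from collections import Counter, defaultdict
from itertools import product

# Hypothetical toy corpus standing in for labeled patent texts.
TRAIN = [
    ("gear shaft transmits torque to the wheel", "mechanical"),
    ("spring damper reduces vibration of the frame", "mechanical"),
    ("heat sink dissipates heat from the chip", "thermal"),
    ("coolant loop lowers the temperature of the engine", "thermal"),
]
TEST = [
    ("shaft vibration in the gear frame", "mechanical"),
    ("chip temperature and heat removal", "thermal"),
]


def word_features(text):
    """Bag of lowercase words."""
    return Counter(text.lower().split())


def char_bigram_features(text):
    """Bag of character bigrams with spaces removed."""
    s = text.lower().replace(" ", "")
    return Counter(s[i:i + 2] for i in range(len(s) - 1))


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing."""

    def fit(self, X, y):
        self.class_counts = Counter(y)
        self.term_counts = defaultdict(Counter)
        for feats, label in zip(X, y):
            self.term_counts[label].update(feats)
        self.vocab = {t for c in self.term_counts.values() for t in c}
        return self

    def predict(self, feats):
        def score(label):
            total = sum(self.term_counts[label].values()) + len(self.vocab)
            s = math.log(self.class_counts[label])
            for term, n in feats.items():
                s += n * math.log((self.term_counts[label][term] + 1) / total)
            return s
        return max(self.class_counts, key=score)


class NearestCentroid:
    """Assigns the class whose summed feature vector is closest by cosine."""

    def fit(self, X, y):
        self.centroids = defaultdict(Counter)
        for feats, label in zip(X, y):
            self.centroids[label].update(feats)
        return self

    def predict(self, feats):
        def cosine(a, b):
            dot = sum(v * b[t] for t, v in a.items())
            na = math.sqrt(sum(v * v for v in a.values()))
            nb = math.sqrt(sum(v * v for v in b.values()))
            return dot / (na * nb) if na and nb else 0.0
        return max(self.centroids, key=lambda c: cosine(feats, self.centroids[c]))


FEATURES = {"words": word_features, "char_bigrams": char_bigram_features}
ALGORITHMS = {"naive_bayes": NaiveBayes, "nearest_centroid": NearestCentroid}


def accuracy(model, extract, data):
    hits = sum(model.predict(extract(text)) == label for text, label in data)
    return hits / len(data)


def select_best(train, test):
    """Score every (feature, algorithm) pair and return the most accurate."""
    results = {}
    labels = [label for _, label in train]
    for (fname, extract), (aname, algo) in product(FEATURES.items(), ALGORITHMS.items()):
        X = [extract(text) for text, _ in train]
        model = algo().fit(X, labels)
        results[(fname, aname)] = accuracy(model, extract, test)
    best = max(results, key=results.get)
    return best, results


best, results = select_best(TRAIN, TEST)
print("selected combination:", best, "accuracy:", results[best])
```

With a realistic corpus, the held-out accuracy used here would normally be replaced by cross-validation, and ties between combinations would need an explicit rule; this sketch simply keeps the first best-scoring pair.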