Feature engineering is one of the key techniques in product spam-review detection. Most existing spam-review detection methods select features and define indicators from prior knowledge, which makes them highly subjective and therefore hard to generalize. Taking reviews of best-selling products on the e-commerce platform Tmall (天猫) as the research object, this paper proposes a feature-engineering approach for spam-review detection based on a preliminary analysis of the review data, and then applies a decision tree to detect spam reviews. Experiments show that, compared with feature engineering based on prior knowledge, the proposed method effectively improves spam-review classification performance.
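The classification step can be illustrated with a minimal sketch of a decision tree trained on review features. The abstract does not state which features the pre-analysis selects or which tree algorithm is used, so everything concrete here is an assumption: the three features (review length, exclamation-mark count, reviews posted by the same account that day), the toy data, and the CART-style Gini splitting are hypothetical stand-ins, not the paper's method.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (feature_index, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, depth=0, max_depth=3):
    """Recursively build a tree; leaves are labels, inner nodes are tuples."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority label
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (
        f,
        t,
        build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
        build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth),
    )

def predict(tree, row):
    """Walk from the root to a leaf and return its label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

# Hypothetical features per review (not taken from the paper):
# [length in characters, exclamation marks, reviews by the same account that day]
rows = [
    [12, 3, 9], [15, 4, 7], [10, 5, 8],       # toy spam examples
    [140, 0, 1], [95, 1, 1], [210, 0, 2],     # toy genuine examples
]
labels = ["spam", "spam", "spam", "genuine", "genuine", "genuine"]

tree = build_tree(rows, labels)
print(predict(tree, [11, 6, 10]))
print(predict(tree, [180, 0, 1]))
```

On this toy data a single split on review length separates the two classes; with real reviews, the value of a data-driven feature set would show up in how cleanly such splits separate spam from genuine reviews on held-out data.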