As Internet technology has developed, sellers increasingly want to analyze consumers' historical data, recommend products those consumers are likely to be interested in, and thereby improve sales. In coal sales systems, the goal is to analyze users' purchase records and recommend coal products they may want, increasing coal sales volume. User-based collaborative filtering is widely used in coal recommendation systems, whereas item-based KNN collaborative filtering makes recommendations by analyzing the similarity between products and grouping similar ones. The traditional item-based KNN collaborative filtering algorithm cannot complete recommendation efficiently or quickly on the massive volume of sales records held in existing coal systems. Targeting data at this scale, this paper proposes a distributed item-based KNN collaborative filtering algorithm built on MapReduce. Experimental results show that the proposed algorithm achieves a high speedup ratio and scales well.
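To make the idea concrete, the following is a minimal sketch of item-based KNN collaborative filtering arranged as two map/reduce passes. It is an illustration under stated assumptions, not the paper's implementation: the purchase records, the cosine similarity measure, the function names, and the single-process `run_job` helper (standing in for a real MapReduce framework) are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

# Hypothetical purchase log: (user, coal_product, quantity). Not data from the paper.
RECORDS = [
    ("u1", "A", 1), ("u1", "B", 1),
    ("u2", "A", 1), ("u2", "B", 1), ("u2", "C", 1),
    ("u3", "B", 1), ("u3", "C", 1),
    ("u4", "A", 1),
]

def run_job(records, mapper, reducer):
    """Single-process stand-in for one MapReduce job: map, shuffle by key, reduce."""
    shuffled = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            shuffled[key].append(value)
    return [out for key, values in shuffled.items() for out in reducer(key, values)]

# Job 1: group purchases into one basket per user.
def map_by_user(record):
    user, item, qty = record
    yield user, (item, qty)

def reduce_basket(user, pairs):
    # Assumes each (user, item) appears once; a repeated item would overwrite.
    yield user, dict(pairs)

# Job 2: sum co-purchase products per item pair; (i, i) keys carry squared norms.
def map_item_pairs(user_basket):
    _, basket = user_basket
    for item, qty in basket.items():
        yield (item, item), qty * qty
    for a, b in combinations(sorted(basket), 2):
        yield (a, b), basket[a] * basket[b]

def reduce_sum(pair, products):
    yield pair, sum(products)

def item_similarities(records):
    """Cosine similarity between items, computed from the two jobs above."""
    baskets = run_job(records, map_by_user, reduce_basket)
    sums = dict(run_job(baskets, map_item_pairs, reduce_sum))
    sims = defaultdict(dict)
    for (a, b), dot in sums.items():
        if a != b:
            s = dot / sqrt(sums[(a, a)] * sums[(b, b)])
            sims[a][b] = sims[b][a] = s
    return dict(baskets), sims

def recommend(basket, sims, k=2):
    """Score unseen items by similarity to the k nearest neighbours of each owned item."""
    scores = defaultdict(float)
    for item in basket:
        nearest = sorted(sims.get(item, {}).items(), key=lambda kv: (-kv[1], kv[0]))[:k]
        for neighbour, s in nearest:
            if neighbour not in basket:
                scores[neighbour] += s
    return sorted(scores, key=lambda i: (-scores[i], i))
```

Calling `item_similarities(RECORDS)` returns the per-user baskets and the item-to-item similarity table; `recommend(baskets["u4"], sims, k=1)` then ranks the products that user has not yet bought. Because each mapper call sees only one record or one basket and each reducer call sees only one key, the same functions could in principle be distributed across workers, which is the property a MapReduce formulation relies on.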