Internet data mining plays a crucial role in the PRISM program. This paper first analyzes the basic technical principles of data mining, including commonly used algorithms for association analysis and machine learning. It then introduces the main technologies for Internet information retrieval and mining. Next, it proposes an architecture for an Internet big data mining system based on an open-source cloud computing platform. Finally, it points out directions for the future development of Internet big data mining.