This paper analyzes how Bayesian spam filters work, along with the related techniques of word segmentation and feature extraction, and studies Java Mail together with the relevant email standards and protocols. On this basis it designs a Bayesian spam filtering system and implements three functional modules: a server-side training set manager, a client-side mail classifier, and a simple mail sending and receiving system. To improve the filter, mail processing includes a secondary stage that combines manual re-inspection with noise reduction by feature-string matching.
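As an illustration of the Bayesian classification step summarized above, the following is a minimal sketch and not the system described in the paper. It assumes a multinomial naive Bayes model with add-one smoothing and a placeholder tokenizer that splits on non-alphanumeric characters; Chinese mail would need a real word segmenter in its place. The class and method names (`NaiveBayesSpamSketch`, `train`, `spamProbability`) are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a naive Bayes spam scorer; not the paper's implementation.
class NaiveBayesSpamSketch {
    private final Map<String, Integer> spamCounts = new HashMap<>();
    private final Map<String, Integer> hamCounts = new HashMap<>();
    private final Set<String> vocabulary = new HashSet<>();
    private int spamTokens = 0;
    private int hamTokens = 0;
    private int spamMessages = 0;
    private int hamMessages = 0;

    // Placeholder tokenizer (assumption): lowercase, split on runs of
    // characters that are neither letters nor digits.
    static String[] tokenize(String text) {
        return text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+");
    }

    // Adds one labeled message to the training counts.
    public void train(String text, boolean isSpam) {
        if (isSpam) spamMessages++; else hamMessages++;
        for (String token : tokenize(text)) {
            if (token.isEmpty()) continue;
            vocabulary.add(token);
            if (isSpam) {
                spamCounts.merge(token, 1, Integer::sum);
                spamTokens++;
            } else {
                hamCounts.merge(token, 1, Integer::sum);
                hamTokens++;
            }
        }
    }

    // Returns the estimated posterior probability that the text is spam.
    public double spamProbability(String text) {
        int total = spamMessages + hamMessages;
        // Smoothed class priors, accumulated in log space to avoid underflow.
        double logSpam = Math.log((spamMessages + 1.0) / (total + 2.0));
        double logHam = Math.log((hamMessages + 1.0) / (total + 2.0));
        int v = vocabulary.size();
        for (String token : tokenize(text)) {
            if (token.isEmpty()) continue;
            // Add-one smoothing; the extra +1 reserves mass for unseen tokens.
            logSpam += Math.log((spamCounts.getOrDefault(token, 0) + 1.0) / (spamTokens + v + 1.0));
            logHam += Math.log((hamCounts.getOrDefault(token, 0) + 1.0) / (hamTokens + v + 1.0));
        }
        // Convert the log-odds back into a probability.
        return 1.0 / (1.0 + Math.exp(logHam - logSpam));
    }

    public static void main(String[] args) {
        NaiveBayesSpamSketch filter = new NaiveBayesSpamSketch();
        filter.train("win free prize now", true);
        filter.train("free money offer", true);
        filter.train("meeting agenda for monday", false);
        filter.train("project report attached", false);
        System.out.println(filter.spamProbability("free prize"));
        System.out.println(filter.spamProbability("meeting report"));
    }
}
```

In a system like the one summarized here, a score near 0.5 could be routed to manual re-inspection instead of being classified automatically; any such threshold is a design choice this sketch does not fix.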