Improving naive Bayes classifier by dividing its decision regions

Source: Journal of Zhejiang University-Science C(Computers & Electro
Classification can be regarded as dividing the data space into decision regions separated by decision boundaries. In this paper we analyze decision tree algorithms and the NBTree algorithm from this perspective. Thus, a decision tree can be regarded as a classifier tree, in which each classifier on a non-root node is trained in the decision regions of the classifier on the parent node. Meanwhile, the NBTree algorithm, which generates a classifier tree with the C4.5 algorithm and the naive Bayes classifier as the root and leaf classifiers respectively, can also be regarded as training naive Bayes classifiers in the decision regions of the C4.5 algorithm. We propose a second division (SD) algorithm and three soft second division (SD-soft) algorithms to train classifiers in the decision regions of the naive Bayes classifier. These four novel algorithms all generate two-level classifier trees with the naive Bayes classifier as the root classifier. The SD and three SD-soft algorithms can make good use of both the information contained in instances near decision boundaries and those that may be ignored by the naive Bayes classifier. Finally, we conduct experiments on 30 data sets from the UC Irvine (UCI) repository. Experimental results show that the SD algorithm can obtain better generalization abilities than the NBTree and the averaged one-dependence estimators (AODE) algorithms when using the C4.5 algorithm and support vector machine (SVM) as leaf classifiers. Further experiments indicate that our three SD-soft algorithms can achieve better generalization abilities than the SD algorithm when argument values are selected appropriately.
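The two-level classifier tree described in the abstract can be illustrated with a minimal sketch. The abstract does not give the details of the SD algorithm, so this is not the authors' implementation. It assumes that a decision region is the set of training instances to which the root naive Bayes classifier assigns a given class, and that one leaf classifier is trained per region. The class names (NaiveBayes, NearestNeighbor, TwoLevelTree) are hypothetical, and the leaf is a 1-nearest-neighbor stand-in rather than C4.5 or an SVM, chosen only to keep the example self-contained.

```python
from collections import Counter, defaultdict
import math


class NaiveBayes:
    """Categorical naive Bayes with Laplace smoothing (root classifier)."""

    def fit(self, X, y):
        self.classes = sorted(set(y))
        self.class_count = Counter(y)
        self.n = len(y)
        self.counts = Counter()          # (feature index, value, label) -> count
        self.values = defaultdict(set)   # feature index -> values seen in training
        for row, label in zip(X, y):
            for j, v in enumerate(row):
                self.counts[(j, v, label)] += 1
                self.values[j].add(v)
        return self

    def _log_posterior(self, row, c):
        # Unnormalized log posterior: log prior plus smoothed log likelihoods.
        score = math.log(self.class_count[c] / self.n)
        for j, v in enumerate(row):
            num = self.counts[(j, v, c)] + 1
            den = self.class_count[c] + len(self.values[j])
            score += math.log(num / den)
        return score

    def predict(self, X):
        return [max(self.classes, key=lambda c: self._log_posterior(row, c))
                for row in X]


class NearestNeighbor:
    """Stand-in leaf classifier: 1-nearest neighbor under Hamming distance.
    (Assumption: the paper uses C4.5 or an SVM as leaf classifiers.)"""

    def fit(self, X, y):
        self.X, self.y = list(X), list(y)
        return self

    def predict(self, X):
        def dist(a, b):
            return sum(u != v for u, v in zip(a, b))
        return [self.y[min(range(len(self.X)), key=lambda i: dist(row, self.X[i]))]
                for row in X]


class TwoLevelTree:
    """Root classifier splits the data into regions; one leaf is trained per region."""

    def __init__(self, root, make_leaf):
        self.root = root
        self.make_leaf = make_leaf

    def fit(self, X, y):
        self.root.fit(X, y)
        regions = defaultdict(lambda: ([], []))
        # A region is identified by the class the root predicts for an instance.
        for row, label, region in zip(X, y, self.root.predict(X)):
            regions[region][0].append(row)
            regions[region][1].append(label)
        self.leaves = {r: self.make_leaf().fit(Xr, yr)
                       for r, (Xr, yr) in regions.items()}
        return self

    def predict(self, X):
        out = []
        for row, region in zip(X, self.root.predict(X)):
            leaf = self.leaves.get(region)
            # Fall back to the root's answer for a region with no training data.
            out.append(leaf.predict([row])[0] if leaf else region)
        return out


# Toy XOR data: the attributes are informative only jointly, so naive Bayes
# alone cannot separate the classes.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 1, 1, 0]
nb_pred = NaiveBayes().fit(X, y).predict(X)
sd_pred = TwoLevelTree(NaiveBayes(), NearestNeighbor).fit(X, y).predict(X)
print("naive Bayes:", nb_pred)
print("two-level tree:", sd_pred)
```

On this toy data the root places every instance in a single region, and the leaf trained inside that region recovers the labels that the root alone misses. This only illustrates the mechanism on training data and says nothing about generalization on real data sets.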