To address incomplete feature coverage and immature detection algorithms in Webshell detection, this paper proposes a Webshell detection method based on an improved random forest algorithm. First, three types of Webshell are analyzed in depth, and multidimensional features are constructed that cover both static attributes and dynamic behaviors fairly comprehensively. The feature selection step of the random forest is then improved: feature importance is measured by the Fisher ratio, the features on which each subclass depends are partitioned accordingly, and features are selected from these partitions in proportion and in order. This overcomes the drawbacks of purely random feature selection, increases the classification strength of the individual decision trees, and reduces the correlation between trees. Experiments comparing the improved algorithm with the standard random forest show that the improved algorithm achieves good results with fewer decision trees. A further comparison with an SVM algorithm shows that the proposed method improves both the efficiency and the accuracy of Webshell detection.
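The abstract does not give the exact selection rule, so the following Python sketch is only one possible reading of Fisher-ratio-guided feature selection. It assumes a two-class problem (Webshell vs. normal page), uses the common two-class form of the Fisher ratio, (μ₁ − μ₀)² / (σ₁² + σ₀²), and splits the ranked features into a "strong" and a "weak" group. The function names and the parameters `strong_share` and `strong_quota` are hypothetical and are not taken from the paper.

```python
import random
from statistics import mean, pvariance


def fisher_ratio(values, labels):
    """Two-class Fisher ratio of one feature:
    (mean_pos - mean_neg)^2 / (var_pos + var_neg)."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    gap = (mean(pos) - mean(neg)) ** 2
    spread = pvariance(pos) + pvariance(neg)
    if spread == 0:
        # Zero within-class variance: perfectly separating or constant feature.
        return float("inf") if gap > 0 else 0.0
    return gap / spread


def rank_features(X, y):
    """Return feature indices sorted by descending Fisher ratio."""
    n_features = len(X[0])
    scores = [fisher_ratio([row[j] for row in X], y) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


def select_features(ranked, k, strong_share=0.5, strong_quota=0.7, rng=None):
    """Pick up to k feature indices for one tree (assumed scheme).

    The top `strong_share` of the ranking forms the strong group, the rest
    the weak group. Roughly `strong_quota` of the k picks come from the
    strong group; the result is returned in ranking order.
    """
    rng = rng or random.Random()
    cut = max(1, round(len(ranked) * strong_share))
    strong, weak = ranked[:cut], ranked[cut:]
    n_strong = min(len(strong), round(k * strong_quota))
    n_weak = min(len(weak), k - n_strong)
    chosen = rng.sample(strong, n_strong) + rng.sample(weak, n_weak)
    return sorted(chosen, key=ranked.index)


# Toy data: column 0 separates the classes well, column 1 not at all.
X = [[0.1, 5.0, 1.0], [0.2, 3.0, 2.0], [0.9, 4.0, 3.0], [1.0, 4.0, 4.0]]
y = [0, 0, 1, 1]
ranked = rank_features(X, y)
subset = select_features(ranked, k=2, rng=random.Random(0))
```

Each tree in the forest would then be trained on its own `subset`. Compared with drawing features uniformly at random, biasing the draw toward high-Fisher-ratio features is meant to strengthen individual trees, while the random draw within each group keeps the trees from becoming identical.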