Spectrum-based fault localization (SBFL) is one of the most effective fault localization techniques. It uses formulas, called risk evaluation formulas, to pinpoint faults during debugging. The formulas and program spectra used may limit the accuracy of a specific SBFL method. As a result, testers must check many non-faulty statements before discovering the real faulty ones, which reduces the efficiency of fault localization. Empirical and theoretical studies show that combining these formulas can potentially optimize SBFL's performance. To address this problem, this thesis proposes two different methods for fault localization, both of which enhance the accuracy of spectrum-based fault localization. The main contributions of this thesis are summarized as follows:

(1) This thesis first defines four metrics, computed from the program spectrum, that can serve as essential components of a ranking formula and mitigate the problems of spectrum-based fault localization. These metrics are combined into a new heuristic, Metrics Combination (MECO), which does not require any prior information on program structure or semantics to locate faults effectively. The evaluation experiments are conducted on the Defects4J and SIR datasets, and MECO is compared with the 18 maximal formulas. The experimental results show that MECO outperforms the compared formulas in terms of precision, accuracy, and wasted effort. An empirical evaluation also indicates that two of the defined metrics, Assumption Proportion and Fault Assumption, improve localization effectiveness when combined with existing formulas, especially the precision of ER5a, ER5b, and ER5c (77.77%), GP02 (41%), and GP19 (27.22%), respectively.

(2) This thesis further explores and empirically evaluates the combination of two different risk evaluation formulas, using two different combination methods, for precise fault localization. The performance of the combined risk evaluation formulas is compared against that of the standalone formulas using 92 faults from the SIR repository and 357 faults from the Defects4J repository. This thesis identifies which risk evaluation formulas to combine to maximize the efficiency and accuracy of fault localization. The experimental results show that the non-linear combination method is more effective than the linear combination method, and that negatively correlated formulas are better candidates for combination than positively correlated formulas, especially those with a negligible correlation. The study concludes that combining two different methods of the same fault localization technique via machine learning models can plausibly optimize the accuracy of fault localization. For better accuracy, however, developers should combine methods whose correlation is negligible and negative in sign.
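
To make the notion of a risk evaluation formula concrete, the following minimal Python sketch ranks statements from a program spectrum. It is an illustration under stated assumptions, not the method of this thesis: Ochiai and Tarantula are two well-known formulas chosen only as examples, the equal-weight linear combination is a hypothetical stand-in for the combination methods evaluated here, and the function names and the four-test coverage data are invented for the example. MECO and its four metrics are not defined in this abstract and are not reproduced.

```python
from math import sqrt

def spectra(coverage, outcomes):
    """Count (ef, ep, nf, np) per statement.

    coverage: list of sets, the statements executed by each test
    outcomes: list of bools, True when the corresponding test failed
    """
    total_failed = sum(outcomes)
    total_passed = len(outcomes) - total_failed
    counts = {}
    for stmt in set().union(*coverage):
        ef = sum(1 for cov, failed in zip(coverage, outcomes) if failed and stmt in cov)
        ep = sum(1 for cov, failed in zip(coverage, outcomes) if not failed and stmt in cov)
        counts[stmt] = (ef, ep, total_failed - ef, total_passed - ep)
    return counts

def ochiai(ef, ep, nf, np_):
    denom = sqrt((ef + nf) * (ef + ep))
    return ef / denom if denom else 0.0

def tarantula(ef, ep, nf, np_):
    fail_ratio = ef / (ef + nf) if ef + nf else 0.0
    pass_ratio = ep / (ep + np_) if ep + np_ else 0.0
    total = fail_ratio + pass_ratio
    return fail_ratio / total if total else 0.0

def linear_combination(counts, f, g, weight=0.5):
    """Hypothetical combination: weighted sum of two formulas' scores."""
    return {s: weight * f(*c) + (1 - weight) * g(*c) for s, c in counts.items()}

def rank(scores):
    """Most suspicious statement first; ties broken by statement name."""
    return sorted(scores, key=lambda s: (-scores[s], s))

# Invented example: four tests over statements s1..s4; the last two tests fail.
coverage = [{"s1", "s2"}, {"s1", "s3"}, {"s1", "s3", "s4"}, {"s1", "s4"}]
outcomes = [False, False, True, True]

counts = spectra(coverage, outcomes)
combined = linear_combination(counts, ochiai, tarantula)
ranking = rank(combined)
print(ranking)  # ['s4', 's1', 's3', 's2']
```

In this toy spectrum, s4 is executed only by failing tests, so both formulas give it the highest score and it is inspected first. The number of non-faulty statements ranked above the real fault is what the wasted-effort measure counts.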