DP-Share: Privacy-Preserving Software Defect Prediction Model Sharing Through Differential Privacy

Source: Journal of Computer Science and Technology (English edition)
In current software defect prediction (SDP) research, most empirical studies use only datasets provided by the PROMISE repository, which threatens the external validity of their results. Instead of sharing SDP datasets, sharing SDP models is a potential way to alleviate this problem, and it can encourage researchers and industrial practitioners to share more models. However, directly sharing models may result in privacy disclosure, for example through a model inversion attack. Differential privacy (DP) mechanisms can prevent this attack when the privacy budget is carefully selected. To the best of our knowledge, we are the first to apply DP to privacy-preserving SDP model sharing, and we propose a novel method, DP-Share. DP-Share first preprocesses the dataset: it over-samples the minority instances (i.e., defective modules) and discretizes continuous features to optimize privacy budget allocation. Then, it uses a novel sampling strategy to create a set of training sets. Finally, it constructs a decision tree on each of these training sets, and these decision trees form a random forest (i.e., the model). This last phase uses the Laplace and exponential mechanisms to satisfy the requirements of DP. In our empirical studies, we choose nine experimental subjects from real software projects, use AUC (area under the ROC curve) as the performance measure, and use holdout as the model validation technique. After privacy and utility analysis, we find that DP-Share achieves better performance than the baseline method DF-Enhance in most cases when using the same privacy budget. Moreover, we provide guidelines for using the proposed method effectively.
Our work attempts to fill the research gap in terms of differential privacy for SDP, which can encourage researchers and practitioners to share more SDP models and then effectively advance the state of the art of SDP.
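To make the two DP building blocks named in the abstract concrete, the following is a minimal Python sketch of the standard Laplace and exponential mechanisms. It is not the DP-Share implementation. The function names, the default sensitivity of 1, and the suggested use (noisy class counts at a leaf, private choice among candidate splits) are illustrative assumptions that do not come from the paper.

```python
import math
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale).

    The difference of two i.i.d. exponential variables with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count, epsilon, rng, sensitivity=1.0):
    """Laplace mechanism: release a count with noise of scale sensitivity/epsilon.

    Assumption: adding or removing one record changes the count by at most
    `sensitivity` (1 for a plain count).
    """
    return true_count + laplace_noise(sensitivity / epsilon, rng)


def exponential_mechanism(candidates, scores, epsilon, rng, sensitivity=1.0):
    """Exponential mechanism: pick a candidate with probability proportional
    to exp(epsilon * score / (2 * sensitivity)).

    Scores are shifted by their maximum before exponentiation for numerical
    stability; the shift does not change the selection probabilities.
    """
    max_score = max(scores)
    weights = [
        math.exp(epsilon * (s - max_score) / (2.0 * sensitivity))
        for s in scores
    ]
    return rng.choices(candidates, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical leaf holding 10 defective modules, released with epsilon = 1.
    print(noisy_count(10, epsilon=1.0, rng=rng))
    # Hypothetical candidate split features with made-up quality scores.
    print(exponential_mechanism(["loc", "cc", "fan_in"], [0.0, 1.0, 5.0],
                                epsilon=2.0, rng=rng))
```

A smaller `epsilon` (a tighter privacy budget) increases the Laplace noise scale and flattens the selection probabilities of the exponential mechanism, which is the privacy-utility trade-off the abstract refers to when it says the budget must be carefully selected.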