Performance comparison between Logistic regression, decision trees, and multilayer perceptron in pre

Source: Chinese Medical Journal (English edition)
Background Various methods can be applied to build predictive models for clinical data with a binary outcome variable. This research explores the process of constructing three common predictive models, logistic regression (LR), decision tree (DT) and multilayer perceptron (MLP), and focuses on specific details of applying them: what preconditions should be satisfied, how to set the model parameters, how to screen variables and build accurate models quickly and efficiently, and how to assess generalization ability (that is, prediction performance) reliably by the Monte Carlo method when the sample size is small.

Methods A total of 274 patients (137 with type 2 diabetes mellitus and diabetic peripheral neuropathy, and 137 with type 2 diabetes mellitus without diabetic peripheral neuropathy) from the Metabolic Disease Hospital in Tianjin participated in the study. There were 30 variables, such as sex, age and glycosylated hemoglobin. Because of the small sample size, the classification and regression tree (CART) and the chi-squared automatic interaction detector (CHAID) tree were combined, using 100 repetitions of 5- to 7-fold stratified cross-validation, to build the DT. The MLP was constructed using the Schwarz Bayesian Criterion to choose the number of hidden layers and hidden-layer units, along with the Levenberg-Marquardt (L-M) optimization algorithm, weight decay and a preliminary training method. Subsequently, LR was fitted by the best subset method with the Akaike Information Criterion (AIC) to make the best use of the information and avoid overfitting. Finally, 10 to 100 repetitions of 3- to 10-fold stratified cross-validation were used to compare the generalization ability of DT, MLP and LR in terms of the area under the receiver operating characteristic (ROC) curve (AUC).

Results The AUCs of DT, MLP and LR were 0.8863, 0.8536 and 0.8802, respectively. Because a larger AUC indicates higher diagnostic ability for a given prediction model, MLP performed best, followed by LR and DT, under 10-100 repetitions of 2- to 10-fold stratified cross-validation in our study. The neural network model is a preferred option for these data. However, the best subset of multiple LR would be a better choice in view of efficiency and accuracy.

Conclusion When dealing with data from a small sample with multiple independent variables and a dichotomous outcome variable, more strategies and statistical techniques (such as the AIC, the L-M optimization algorithm and the best subset method) should be considered to build a forecast model, and available methods (such as cross-validation and the AUC) can be used for evaluation.
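
The evaluation scheme described in the Methods (repeated stratified k-fold cross-validation scored by AUC) can be sketched roughly as follows. This is a minimal illustration in standard-library Python, not the authors' implementation: the abstract does not say which software was used, the data here are synthetic (only the 137/137 class sizes are taken from the abstract), and `fit_midpoint` is a hypothetical one-feature stand-in for the DT, MLP and LR models. The function names `auc`, `stratified_folds` and `repeated_cv_auc` are likewise assumptions made for this example.

```python
import random
from statistics import mean


def auc(labels, scores):
    """AUC as the Mann-Whitney statistic: P(positive score > negative score).

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k, rng):
    """Split sample indices into k folds with roughly equal class proportions."""
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)  # deal each class round-robin across folds
    return folds


def repeated_cv_auc(X, y, fit, k=5, repeats=10, seed=0):
    """Mean held-out AUC over `repeats` random stratified k-fold partitions."""
    rng = random.Random(seed)
    aucs = []
    for _ in range(repeats):
        for test_idx in stratified_folds(y, k, rng):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(y)) if i not in held_out]
            predict = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
            scores = [predict(X[i]) for i in test_idx]
            aucs.append(auc([y[i] for i in test_idx], scores))
    return mean(aucs)


def fit_midpoint(X_train, y_train):
    """Hypothetical stand-in model: score by the first feature, oriented so
    that the class with the higher training mean receives higher scores."""
    m1 = mean(x[0] for x, t in zip(X_train, y_train) if t == 1)
    m0 = mean(x[0] for x, t in zip(X_train, y_train) if t == 0)
    direction = 1.0 if m1 >= m0 else -1.0
    return lambda x: direction * x[0]


# Synthetic data: 137 cases and 137 controls, one noisy feature (assumption).
data_rng = random.Random(42)
X = [[data_rng.gauss(1.0, 1.0)] for _ in range(137)]
X += [[data_rng.gauss(0.0, 1.0)] for _ in range(137)]
y = [1] * 137 + [0] * 137

cv_auc = repeated_cv_auc(X, y, fit_midpoint, k=5, repeats=10)
print(f"mean cross-validated AUC: {cv_auc:.4f}")
```

To compare several models as the study does, one would call `repeated_cv_auc` once per model with the same `seed`, so that every model is evaluated on identical partitions. Averaging over many random partitions is what reduces the variance of the AUC estimate when the sample is small.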