Objective: To match field-survey data against records in the national network direct-reporting system for infectious diseases, a problem that arose during the nationwide survey of notifiable infectious disease reporting quality in medical institutions, by applying probabilistic data matching to information from the two sources.
Methods: A modified Fellegi-Sunter probabilistic data matching method was used. Each matching field was assigned a coefficient, and a similarity score was calculated for every pair of records. A pair was considered successfully matched when its similarity score exceeded a threshold (cut-off value). The automatic matching results were then checked manually, and the manual judgments served as the gold standard for evaluating the automatic results.
Results: The 2153 original records collected during the survey were matched against 97 271 infectious disease report cards in the network direct-reporting system using stratified multi-dimensional probabilistic matching. With a total score of 25 as the threshold, the automatic matching results were compared with the manual judgments. Automatic matching had a sensitivity of 98.96% (95% CI: 98.39% to 99.36%), a specificity of 94.92% (95% CI: 91.29% to 97.35%), and an overall agreement rate of 98.51% (95% CI: 97.91% to 98.98%). The Kappa value was 0.9250 and the area under the ROC curve was 0.9979.
Conclusion: The stratified multi-dimensional probabilistic matching method solved the problem of matching the original field-survey data to the data in the network reporting system. The matching results agreed closely with the actual situation, the method markedly improved work efficiency, and it provides a simple analysis tool for similar work in the future.
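The scoring step described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not list the matching fields or their coefficients, so the field names and weights below are hypothetical placeholders, and stratification is omitted. Only the cut-off of 25 and the rule that a score must exceed it come from the abstract.

```python
# Hypothetical field weights: points awarded when a field agrees.
# The abstract does not report the actual coefficients.
FIELD_WEIGHTS = {
    "name": 10.0,
    "sex": 2.0,
    "birth_date": 8.0,
    "disease": 6.0,
    "onset_date": 5.0,
}

THRESHOLD = 25.0  # total-score cut-off reported in the abstract


def similarity_score(survey, card, weights=FIELD_WEIGHTS):
    """Sum the weights of all fields on which the two records agree."""
    score = 0.0
    for field, weight in weights.items():
        a, b = survey.get(field), card.get(field)
        # A missing value never counts as agreement.
        if a is not None and a == b:
            score += weight
    return score


def best_match(survey, cards, threshold=THRESHOLD, weights=FIELD_WEIGHTS):
    """Return (card, score) for the highest-scoring card if its score
    exceeds the threshold; otherwise return None."""
    best, best_score = None, float("-inf")
    for card in cards:
        s = similarity_score(survey, card, weights)
        if s > best_score:
            best, best_score = card, s
    if best is not None and best_score > threshold:
        return best, best_score
    return None
```

In this sketch exact equality stands in for field agreement. A real linkage would likely need tolerant comparisons (for example, for dates or name variants) and would restrict comparisons to candidate cards within the same stratum.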
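The evaluation measures reported in the Results (sensitivity, specificity, overall agreement, and Kappa) can all be derived from a 2x2 table that compares automatic matching with the manual gold standard. The sketch below shows the standard formulas, with Kappa computed as Cohen's kappa. The function name and argument names are illustrative, and the abstract does not report the underlying cell counts, so none are assumed here.

```python
def evaluate_matching(tp, fp, fn, tn):
    """Compute agreement statistics from a 2x2 table.

    tp: pairs matched automatically and confirmed manually
    fp: pairs matched automatically but rejected manually
    fn: pairs not matched automatically but accepted manually
    tn: pairs rejected by both
    """
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    agreement = (tp + tn) / n

    # Agreement expected by chance, from the marginal proportions.
    p_both_yes = ((tp + fp) / n) * ((tp + fn) / n)
    p_both_no = ((fn + tn) / n) * ((fp + tn) / n)
    p_expected = p_both_yes + p_both_no

    # Cohen's kappa: observed agreement corrected for chance agreement.
    kappa = (agreement - p_expected) / (1 - p_expected)

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "agreement": agreement,
        "kappa": kappa,
    }
```

The confidence intervals and the area under the ROC curve quoted in the abstract need further computation (for example, binomial intervals and scores evaluated across a range of thresholds) and are not covered by this sketch.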