[Objective] To analyze occupational health surveillance data together with anxiety status in a working population using random forest, and to explore the application of data mining methods. [Methods] Occupational health surveillance data were collected from an enterprise, and a questionnaire survey was conducted using the 7-item Generalized Anxiety Disorder scale (GAD-7). Random forest was then used to classify the occupational health surveillance data, with anxiety status as the outcome variable. [Results] Random forest classified anxiety status well: the misclassification rate was 14.62% in the high-anxiety-score group and 5.95% in the low-anxiety-score group, and the estimated out-of-bag (OOB) error rate was 10.27%. [Conclusion] Combining occupational health surveillance data with random forest can support the early detection, screening, and intervention of people with anxiety, and offers a new approach to making use of occupational health surveillance data.
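
The three error figures in the results are related: each group's misclassification rate is computed within that group, while the out-of-bag error rate is computed over all subjects. The following is a minimal Python sketch of that bookkeeping, not the study's own code. The function names (`gad7_total`, `anxiety_group`, `error_rates`), the cutoff of 10, and the toy labels are all assumptions made for illustration; the abstract does not state which cutoff separated the high-score and low-score groups, and the toy data are not the study's data.

```python
from collections import Counter


def gad7_total(item_scores):
    """Sum the seven GAD-7 items (each scored 0-3), giving a total of 0-21."""
    if len(item_scores) != 7 or any(s not in (0, 1, 2, 3) for s in item_scores):
        raise ValueError("GAD-7 expects seven item scores, each 0, 1, 2 or 3")
    return sum(item_scores)


def anxiety_group(total, cutoff=10):
    # ASSUMPTION: a cutoff of 10 is a commonly used screening threshold;
    # the abstract does not say which cutoff the study used.
    return "high" if total >= cutoff else "low"


def error_rates(true_labels, oob_predictions):
    """Return (per-group misclassification rates, overall error rate).

    oob_predictions holds, for each subject, the class predicted using only
    the trees that did not see that subject during training.
    """
    group_sizes = Counter(true_labels)
    wrong = Counter(t for t, p in zip(true_labels, oob_predictions) if t != p)
    per_group = {g: wrong[g] / n for g, n in group_sizes.items()}
    overall = sum(wrong.values()) / len(true_labels)
    return per_group, overall


# Made-up toy labels, purely to exercise the functions (not study data).
true_labels = ["high"] * 4 + ["low"] * 6
oob_predictions = ["high", "high", "high", "low",
                   "low", "low", "low", "low", "low", "high"]

per_group, overall = error_rates(true_labels, oob_predictions)
print(per_group, overall)
```

Because the overall rate is a size-weighted average of the group rates, it always lies between them, as 10.27% lies between 5.95% and 14.62% in the reported results.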