Toxicity data for 258 phenols and phenol derivatives against the aquatic ciliate Tetrahymena pyriformis were collected, and seven molecular descriptors were chosen as structural parameters for building quantitative structure-toxicity relationship (QSTR) models. First, a robust diagnostic method was used to remove outlying samples, and a sphere-exclusion algorithm was then used to divide the remaining samples in a rational way. Next, multiple linear regression (MLR), partial least squares (PLS), and error back-propagation (BP) neural networks were each used to model the quantitative structure-activity relationship, and consensus modeling was applied to the external validation set, which improved the predictive ability of the models. The results show that all of the models have good predictive ability and stability. Compared with the MLR and PLS models, the BP neural network model performed slightly better; that is, the nonlinear model outperformed the linear ones. However, the BP neural network model does not yield an explicit mathematical expression, whereas the MLR and PLS models are simpler and easier to interpret.
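As an illustration of two steps named above, sample division by sphere exclusion and consensus prediction, the following is a minimal Python sketch. It is not the authors' implementation: the function names `sphere_exclusion_split` and `consensus_predict`, the `radius` parameter, the rule of taking the lowest remaining index as the next sphere center, and the use of an unweighted mean for the consensus are all assumptions made here for illustration. The abstract does not state which variant of either method was used, and the toy descriptor values are invented.

```python
import math


def sphere_exclusion_split(X, radius):
    """Split row indices of X into (train, test) with one simple
    sphere-exclusion pass. Assumes descriptors are already scaled."""
    remaining = list(range(len(X)))
    train, test = [], []
    while remaining:
        # Assumption: the next sphere center is the lowest remaining index.
        seed = remaining.pop(0)
        train.append(seed)
        outside = []
        for i in remaining:
            if math.dist(X[seed], X[i]) <= radius:
                test.append(i)      # inside the sphere: assigned to the test set
            else:
                outside.append(i)   # still available as a future center
        remaining = outside
    return train, test


def consensus_predict(predictions):
    """Combine per-compound predictions from several models.
    Assumption: the consensus is the unweighted mean across models."""
    n_models = len(predictions)
    return [sum(values) / n_models for values in zip(*predictions)]


# Toy data: five compounds described by two hypothetical scaled descriptors.
X = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [1.05, 1.0], [3.0, 3.0]]
train_idx, test_idx = sphere_exclusion_split(X, radius=0.5)
print(train_idx, test_idx)

# Hypothetical predictions from three models (e.g. MLR, PLS, BP) for two compounds.
print(consensus_predict([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]))
```

In this sketch each sphere center goes to the training set and its close neighbors go to the test set, so every test compound lies within `radius` of some training compound. Other published variants choose centers differently or alternate assignments, and the radius is normally tuned to the size of the descriptor space.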