A 23-dimensional topological index, derived from structural features and the connectivity of non-hydrogen atoms, is used to represent alkanes, and the index is fed into multiple linear regression to build a model for predicting critical temperature. To obtain a stable model, two variable-selection methods (the M5 method and a greedy algorithm) are applied, and the models are built on the resulting descriptor subsets. For the best model, the test set gives a squared correlation coefficient of R² = 0.9924 and a mean absolute error of MAE = 2.2532 K. When the experimental boiling point is added to the alkane descriptors, the test-set MAE drops significantly. Given these satisfactory results, an extrapolation study was carried out. With alkanes of carbon number n ≤ 9 assigned to the training set and the C10H22 isomers (n = 10) assigned to the test set, the test-set MAE is 3.6867 K. When one-eighth of the C10H22 isomers are added to the training set and the model is used to predict the remaining C10H22 isomers, the test-set MAE decreases markedly, indicating that the extrapolation results are satisfactory.
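The workflow summarized above (greedy variable selection, a linear regression on the chosen descriptors, then MAE and squared correlation on a held-out test set) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic rather than the paper's alkanes, the helper names (`fit`, `greedy_select`, and so on) are hypothetical, and forward selection scored by validation MAE is only one plausible reading of "greedy algorithm", not necessarily the authors' procedure.

```python
import random

def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def fit(X, y, cols):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + [x[c] for c in cols] for x in X]
    k = len(cols) + 1
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    b = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(A, b)

def predict(X, cols, w):
    """Apply the fitted coefficients w to the selected descriptor columns."""
    return [w[0] + sum(wi * x[c] for wi, c in zip(w[1:], cols)) for x in X]

def mae(y, p):
    """Mean absolute error between observed and predicted values."""
    return sum(abs(a - b) for a, b in zip(y, p)) / len(y)

def r_squared(y, p):
    """Square of the Pearson correlation between observed and predicted values."""
    my, mp = sum(y) / len(y), sum(p) / len(p)
    cov = sum((a - my) * (b - mp) for a, b in zip(y, p))
    vy = sum((a - my) ** 2 for a in y)
    vp = sum((b - mp) ** 2 for b in p)
    return cov * cov / (vy * vp)

def greedy_select(Xtr, ytr, Xval, yval):
    """Add one descriptor at a time while the validation MAE keeps improving."""
    selected, best = [], float("inf")
    remaining = list(range(len(Xtr[0])))
    while remaining:
        scores = []
        for c in remaining:
            cols = selected + [c]
            w = fit(Xtr, ytr, cols)
            scores.append((mae(yval, predict(Xval, cols, w)), c))
        score, c = min(scores)
        if score >= best:
            break  # no candidate improves the validation error
        selected.append(c)
        remaining.remove(c)
        best = score
    return selected

# Synthetic stand-in data (NOT the paper's alkanes): 6 descriptors, 2 informative.
random.seed(0)
X = [[random.uniform(0, 1) for _ in range(6)] for _ in range(100)]
y = [300 + 40 * x[0] - 15 * x[3] + random.gauss(0, 0.2) for x in X]
Xtr, ytr = X[:60], y[:60]
Xval, yval = X[60:80], y[60:80]
Xte, yte = X[80:], y[80:]

selected = greedy_select(Xtr, ytr, Xval, yval)
w = fit(Xtr, ytr, selected)
pred = predict(Xte, selected, w)
test_mae, test_r2 = mae(yte, pred), r_squared(yte, pred)
print("selected descriptors:", selected)
print("test MAE:", round(test_mae, 4), "test R^2:", round(test_r2, 4))
```

For an extrapolation experiment like the one described, the same functions would apply unchanged; only the split would differ, with rows grouped by carbon number instead of by position in the list.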