To address the problem of building regression models between physical property parameters and near-infrared (NIR) spectral data, a Boosting-PLS algorithm for multivariate calibration is presented, based on the idea of building a series of regressors. Each (weak/base) regressor is built on a subset of the original calibration set. Each subset is obtained by sampling the original calibration set with replacement according to sample probabilities, and these probabilities are determined by the prediction errors of the previous regressor. Samples with large errors receive higher probabilities, so that subsequent regressors concentrate more on them during training. The final ensemble regression model is the weighted median of the weak regressors. An NIR application example and a comparison with partial least squares (PLS) confirm the good performance of the Boosting-PLS algorithm: the resulting calibration model is more accurate and more robust, and it is insensitive to overfitting.
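The procedure described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the exact update formulas, so the sketch assumes the AdaBoost.R2 rules (linear loss, `beta = avg_loss / (1 - avg_loss)`, regressor weight `log(1 / beta)`) and a plain NIPALS PLS1 as the base regressor. The function names `pls1_fit`, `boosting_pls_fit`, and `boosting_pls_predict` and all default parameter values are hypothetical, not taken from the paper.

```python
import numpy as np


def pls1_fit(X, y, n_components):
    """Fit a single-response PLS model (NIPALS); return (coef, intercept)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        norm = np.linalg.norm(w)
        if norm < 1e-12:          # nothing left to explain
            break
        w = w / norm
        t = Xk @ w                # scores
        tt = t @ t
        p = Xk.T @ t / tt         # X loadings
        c = (yk @ t) / tt         # y loading
        Xk = Xk - np.outer(t, p)  # deflate X and y
        yk = yk - c * t
        W.append(w)
        P.append(p)
        q.append(c)
    if not W:                     # degenerate subset: predict the mean
        return np.zeros(X.shape[1]), y_mean
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return coef, y_mean - x_mean @ coef


def boosting_pls_fit(X, y, n_components=3, n_rounds=20, seed=0):
    """Build a series of PLS regressors on probability-resampled subsets.

    Assumption: probabilities and regressor weights follow AdaBoost.R2.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    prob = np.full(n, 1.0 / n)    # start from uniform sampling probabilities
    models, weights = [], []
    for _ in range(n_rounds):
        # Draw a subset of the calibration set with replacement.
        idx = rng.choice(n, size=n, replace=True, p=prob)
        coef, intercept = pls1_fit(X[idx], y[idx], n_components)
        err = np.abs(X @ coef + intercept - y)
        if err.max() == 0:        # perfect fit: keep it and stop
            models.append((coef, intercept))
            weights.append(1.0)
            break
        loss = err / err.max()    # linear loss in [0, 1]
        avg_loss = prob @ loss
        if avg_loss >= 0.5:       # regressor too weak: stop boosting
            if not models:
                models.append((coef, intercept))
                weights.append(1.0)
            break
        beta = avg_loss / (1.0 - avg_loss)
        models.append((coef, intercept))
        weights.append(np.log(1.0 / beta))
        # Low-loss samples are scaled down, so large-error samples gain probability.
        prob = prob * beta ** (1.0 - loss)
        prob = prob / prob.sum()
    return models, np.array(weights)


def boosting_pls_predict(models, weights, X):
    """Combine the weak regressors by their weighted median, per sample."""
    preds = np.array([X @ coef + b for coef, b in models])   # (m, n)
    order = np.argsort(preds, axis=0)
    sorted_preds = np.take_along_axis(preds, order, axis=0)
    cum_w = np.cumsum(weights[order], axis=0)
    pick = (cum_w >= 0.5 * cum_w[-1]).argmax(axis=0)
    return sorted_preds[pick, np.arange(X.shape[0])]
```

In this sketch, `X` would hold the spectra (one row per sample) and `y` the property values. The number of PLS components and the number of boosting rounds would normally be chosen by validation on data not used for calibration.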