Partial least squares (PLS) handles collinearity among variables well and is widely used in spectral analysis, especially for quantitative analysis of near-infrared, mid-infrared, and Raman spectra. To improve the extraction of useful information and the suppression of noise in PLS, a PLS algorithm based on variable clustering and reweighting is proposed. The wavenumber variables of the spectrum are clustered, a separate model is built for each cluster, and the sub-models are then integrated into a full-spectrum model. A weight is computed for and assigned to each cluster, so that variables are reweighted according to their contribution to the model, which improves prediction accuracy. Validation on two near-infrared data sets, predicting octane number in gasoline and nicotine content in tobacco, shows that the proposed algorithm outperforms classical PLS: the root mean square error of prediction (RMSEP) is reduced by 32% and 22% on the two data sets, respectively. The algorithm therefore has potential advantages for the quantitative analysis of spectral data.
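The cluster, model, weight, and integrate steps described above can be illustrated with a minimal NumPy sketch. The abstract does not state the clustering method, the weighting formula, or the number of latent components, so everything specific here is an assumption made for illustration: k-means on standardized variable profiles, weights proportional to the inverse squared cross-validation RMSE of each sub-model (normalized to sum to one), two components per sub-model, and synthetic data in place of the gasoline and tobacco spectra. All function names are hypothetical and this is not the authors' implementation.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit single-response PLS with NIPALS; returns (coef, x_mean, y_mean)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        norm = np.linalg.norm(w)
        if norm < 1e-12:              # nothing left to explain
            break
        w /= norm
        t = Xk @ w                    # scores
        tt = t @ t
        p = Xk.T @ t / tt             # X loadings
        c = yk @ t / tt               # y loading
        Xk = Xk - np.outer(t, p)      # deflate X and y
        yk = yk - c * t
        W.append(w); P.append(p); q.append(c)
    if not W:
        return np.zeros(X.shape[1]), x_mean, y_mean
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)   # B = W (P^T W)^{-1} q
    return coef, x_mean, y_mean

def kmeans_variables(X, k, n_iter=50, seed=0):
    """Assumed clustering step: k-means over standardized variable columns."""
    rng = np.random.default_rng(seed)
    V = ((X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)).T  # one row per variable
    centers = V[rng.choice(V.shape[0], size=k, replace=False)]
    for _ in range(n_iter):
        d = ((V[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = V[labels == j].mean(axis=0)
    return labels

def cv_rmse(X, y, n_components, n_folds=5):
    """Interleaved k-fold cross-validation RMSE of a PLS sub-model."""
    idx = np.arange(len(y))
    sq_err = 0.0
    for f in range(n_folds):
        test = idx[f::n_folds]
        train = np.setdiff1d(idx, test)
        coef, xm, ym = pls1_fit(X[train], y[train], n_components)
        sq_err += np.sum((ym + (X[test] - xm) @ coef - y[test]) ** 2)
    return np.sqrt(sq_err / len(y))

def fit_cluster_weighted_pls(X, y, n_clusters=3, n_components=2):
    """Cluster variables, fit one PLS model per cluster, merge with weights."""
    labels = kmeans_variables(X, n_clusters)
    groups = [np.flatnonzero(labels == j) for j in range(n_clusters)]
    groups = [g for g in groups if g.size > 0]        # drop empty clusters
    errors = np.array([cv_rmse(X[:, g], y, min(n_components, g.size))
                       for g in groups])
    weights = 1.0 / (errors ** 2 + 1e-12)             # assumed weighting rule
    weights /= weights.sum()
    coef_full = np.zeros(X.shape[1])
    for g, w in zip(groups, weights):
        coef, _, _ = pls1_fit(X[:, g], y, min(n_components, g.size))
        coef_full[g] = w * coef                       # reweighted variables
    intercept = y.mean() - X.mean(axis=0) @ coef_full
    return coef_full, intercept, weights

# Synthetic stand-in for spectra: 20 informative channels, 10 pure-noise channels.
rng = np.random.default_rng(0)
n, p = 60, 30
latent = rng.normal(size=(n, 2))
X = latent @ rng.normal(size=(2, p)) + 0.05 * rng.normal(size=(n, p))
X[:, 20:] = rng.normal(size=(n, 10))
y = latent[:, 0] - 0.5 * latent[:, 1] + 0.05 * rng.normal(size=n)

coef, intercept, weights = fit_cluster_weighted_pls(X[:40], y[:40])
pred = X[40:] @ coef + intercept
rmsep = np.sqrt(np.mean((pred - y[40:]) ** 2))        # error on held-out samples
print("cluster weights:", weights, "RMSEP:", rmsep)
```

Because the weights sum to one, the weighted average of the sub-model predictions collapses into a single full-spectrum coefficient vector, which is one plausible reading of "integrated into a full-spectrum model". Other weighting rules or clustering criteria would fit the abstract's description equally well.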