Latent semantic analysis (LSA) is widely used in information retrieval but has seen little use in near-infrared spectroscopy. Using near-infrared diffuse reflectance spectroscopy combined with LSA and principal component analysis (PCA), this study compared how different preprocessing methods and different numbers of singular values and principal components affected the resulting models. The final models misclassified 4 and 3 calibration-set samples, respectively. When the established calibration models were applied to the validation set, the overall recognition rates reached 96.00% and 96.50%, respectively. For tonic Chinese medicines that have similar efficacy and are difficult to cluster, LSA is a new and effective method.
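To make the two decompositions concrete, here is a minimal sketch under stated assumptions, not the authors' implementation. It treats LSA as a truncated SVD of the raw spectral matrix and PCA as an SVD of the mean-centered matrix, with `k` standing in for the number of singular values or principal components retained. The spectra are synthetic, and the helper names (`lsa_scores`, `pca_scores`, `make_synthetic_spectra`) are hypothetical; the abstract does not specify the preprocessing or the classification rule, so neither is reproduced here.

```python
import numpy as np


def lsa_scores(X, k):
    """Scores on the top-k singular vectors of the raw (uncentered) matrix.

    Assumption: LSA is taken to be a plain truncated SVD, as in
    information retrieval; the paper may define it differently.
    """
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :k] * s[:k]


def pca_scores(X, k):
    """Scores on the top-k principal components (SVD of the centered matrix)."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :k] * s[:k]


def make_synthetic_spectra(n_per_class=20, n_wavelengths=100, seed=0):
    """Two classes of noisy Gaussian peaks; illustrative data only."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 1.0, n_wavelengths)
    blocks, labels = [], []
    for label, center in enumerate([0.3, 0.7]):
        peak = np.exp(-((grid - center) ** 2) / 0.01)
        noise = 0.05 * rng.standard_normal((n_per_class, n_wavelengths))
        blocks.append(peak + noise)
        labels += [label] * n_per_class
    return np.vstack(blocks), np.array(labels)


X, y = make_synthetic_spectra()
lsa = lsa_scores(X, k=3)  # one row of k scores per spectrum
pca = pca_scores(X, k=3)
```

The only difference between the two functions is the mean-centering step, which is why varying `k` and the preprocessing can lead the two models to slightly different results.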