Land classification is the first problem to be solved in research on the rational use and planning of land resources. Previous land classification studies have generally taken the “plot” as the land unit and “plot samples” as the objects of classification. The data describe the characteristics of a given plot through many indicator factors, most of them qualitative, such as landform position, current land use, slope, aspect, soil type, and soil erosion type. The category each plot takes on each indicator factor is “quantified” with the numbers 1, 2, 3, …. Distance measures from ordinary multivariate statistical analysis, that is, interval-scale statistics such as Euclidean distance, maximum (minimum) distance, and similarity coefficients, are then used to measure how “close” or “far apart” the plot samples are, and the plots are classified by dynamic clustering or hierarchical clustering. It is worth noting that plot sample data are qualitative, high-dimensional, and discrete. Although the data can be “quantified” as 1, 2, 3, …, these numbers are merely labels that mark different levels; they carry no meaning of quantity, and there is no sense in which one is “larger” or “smaller” than another. Applying the distance definitions of ordinary multivariate statistical analysis to data “quantified” in this way is therefore clearly inappropriate, because the distances so computed do not reflect the actual “distance” between plot samples. It is therefore necessary to reconsider what “distance” between plot samples means, so that the plots can be classified correctly and objectively. Based on the qualitative, high-dimensional, and discrete …
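The objection to Euclidean distance on coded qualitative data can be illustrated with a small sketch. The plots, factor names, and integer codes below are hypothetical, and the simple matching (mismatch-proportion) distance is used only as a familiar label-invariant contrast; it is not the distance proposed in this paper, which is not stated in the text above. The sketch shows that relabeling the categories of one factor changes the Euclidean distances between plots while leaving the matching distance unchanged.

```python
import math

def euclidean(a, b):
    """Euclidean distance computed directly on the integer codes."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def simple_matching(a, b):
    """Proportion of factors on which two plots fall in different categories."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

# Hypothetical plots coded on three qualitative factors:
# (landform position, land use, soil type). The codes are arbitrary labels.
plots = {"A": (1, 1, 1), "B": (2, 1, 1), "C": (3, 1, 1)}

# Relabel the first factor only: categories 2 and 3 swap codes.
# Nothing about the plots themselves changes.
swap = {1: 1, 2: 3, 3: 2}
relabeled = {name: (swap[p[0]],) + p[1:] for name, p in plots.items()}

for label, data in (("original codes", plots), ("relabeled codes", relabeled)):
    print(label)
    for other in ("B", "C"):
        print(
            f"  A-{other}: euclidean={euclidean(data['A'], data[other]):.3f}, "
            f"matching={simple_matching(data['A'], data[other]):.3f}"
        )
```

Under the original codes plot A appears closer to B than to C in Euclidean terms, and after relabeling the order reverses, even though each pair differs on exactly one factor in both cases. The matching distance gives the same value for both pairs under either coding, which is the kind of invariance a distance on nominal data needs.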