Exploring the universal features of language has long been a central concern of linguistic research, and dependency distance minimization has now been confirmed as a universal regularity of human language. To uncover the motivation behind this regularity, we study the distribution of dependency distances in 30 languages. Comparing the fits of several models, we find that the stretched exponential distribution and the power law distribution with exponential cutoff are suited to fitting the dependency distance distributions of "short sentences" and "long sentences", respectively. The results also show that the dependency distance distribution of human language lies between the exponential and power law distributions and can be described by a model that mixes the two. On this basis, we use comparative fitting of different models to explore methods and approaches for studying dependency distance distributions. The results suggest that dependency distances in human language may follow a universal distribution pattern, reflecting the important governing role that the principle of least effort and human cognitive mechanisms play in the use and evolution of language structure.
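As a rough illustration of the quantities involved, the sketch below computes dependency distances for a toy sentence and evaluates the two model families named in the abstract. It assumes the usual definition of dependency distance (the absolute difference between the linear positions of a word and its head) and the standard textbook forms of the stretched exponential and the power law with exponential cutoff. The function names, the toy sentence, and all parameter values are hypothetical choices for this example and are not taken from the paper.

```python
import math


def dependency_distances(heads):
    """Return |head position - word position| for every non-root word.

    heads[i] is the 1-based position of the head of word i + 1;
    0 marks the root, which has no dependency distance.
    """
    return [abs(h - (i + 1)) for i, h in enumerate(heads) if h != 0]


def stretched_exponential(x, tau, beta):
    """Unnormalized weight exp(-(x / tau) ** beta)."""
    return math.exp(-((x / tau) ** beta))


def truncated_power_law(x, alpha, lam):
    """Unnormalized weight x ** (-alpha) * exp(-lam * x)."""
    return x ** (-alpha) * math.exp(-lam * x)


def normalize(weight, max_distance, **params):
    """Turn a weight function into probabilities over distances 1..max_distance."""
    weights = [weight(x, **params) for x in range(1, max_distance + 1)]
    total = sum(weights)
    return [w / total for w in weights]


# Toy sentence "The boy ate an apple" (illustrative parse, not from the paper):
# The -> boy, boy -> ate, ate = root, an -> apple, apple -> ate
heads = [2, 3, 0, 5, 3]
distances = dependency_distances(heads)
mean_distance = sum(distances) / len(distances)

# Illustrative parameter values only; real values would come from fitting a corpus.
p_stretched = normalize(stretched_exponential, 20, tau=2.0, beta=0.5)
p_truncated = normalize(truncated_power_law, 20, alpha=1.5, lam=0.1)
```

Both model families assign most of the probability mass to short distances, which is the pattern that dependency distance minimization predicts. Choosing between them for a given corpus would require fitting the parameters and comparing goodness of fit, which this sketch does not attempt.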