Part-of-speech tagging is the foundation for the syntactic analysis and other research that linguists carry out, and whether the part-of-speech categories are drawn appropriately directly affects the subsequent stages of corpus construction. From the standpoint of hands-on syntactic analysis, this paper compares the part-of-speech tag sets of several word-segmentation systems commonly used in China. It focuses on the problems that some of these tags cause for syntactic annotation, such as the tags for idiomatic expressions (习用语) and abbreviations (简称), and for prefixal components (前接成分) and suffixal components (后接成分). To address these problems, the paper takes a practical approach and, drawing on suggestions from multiple sources, proposes corresponding tagging strategies.