With the development of computer science, natural language processing is increasingly widely used in computer information retrieval systems, and research on it has become an important topic in information processing systems. Word segmentation is the first stage of Chinese natural language processing. At present, the accuracy of automatic word segmentation systems does not yet meet practical needs. This paper targets ambiguity, the fundamental factor affecting segmentation accuracy, and proposes using neural network pattern recognition to resolve ambiguities and thereby improve segmentation accuracy.

The paper classifies ambiguous fields, analyzes their forms and the existing disambiguation mechanisms, examines the relationship between ambiguous segmentation and pattern recognition, and studies the characteristics that make neural network pattern recognition well suited to the ambiguous segmentation problem. Following the general steps of pattern recognition, it extracts features from the ambiguous fields and then selects a neural network ...