An Iterative Method for Extracting Chinese Unknown Words

Source: Chinese Journal of Electronics (English edition)
An iterative method for extracting unknown words from a Chinese text corpus is proposed in this paper. Unlike traditional non-iterative segmentation-detection approaches, which use only known words for segmentation, the proposed method iteratively extracts new words and adds them into the lexicon. Then the augmented dictionary, which includes known words and potential unknown words, is used in the next iteration to re-segment the input corpus. Experiments show that both the precision and recall rates of segmentation are improved.
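The segment, extract, augment, re-segment loop described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say which segmenter or detection rule the paper uses, so everything concrete here is an assumption made for illustration: forward maximum matching as the segmenter, a plain adjacent-token frequency threshold as the detector, and the hypothetical names `segment`, `extract_candidates`, `iterative_extract`, `min_count`, `max_len`, and `max_iters`.

```python
from collections import Counter


def segment(text, lexicon, max_len=4):
    """Forward maximum matching: take the longest lexicon entry at each
    position, falling back to a single character. (Assumed segmenter.)"""
    tokens, i = [], 0
    while i < len(text):
        for n in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + n]
            if n == 1 or piece in lexicon:
                tokens.append(piece)
                i += n
                break
    return tokens


def extract_candidates(segmented, lexicon, min_count=2, max_len=4):
    """Propose new words by joining adjacent tokens that co-occur at least
    min_count times. (Assumed detection heuristic, not the paper's rule.)"""
    pairs = Counter()
    for tokens in segmented:
        for a, b in zip(tokens, tokens[1:]):
            if len(a) + len(b) <= max_len:
                pairs[a + b] += 1
    return {w for w, c in pairs.items() if c >= min_count and w not in lexicon}


def iterative_extract(corpus, known_words, max_iters=5, min_count=2, max_len=4):
    """Repeat: segment with the current lexicon, extract candidate words,
    add them to the lexicon; stop when no new candidates appear."""
    lexicon = set(known_words)
    for _ in range(max_iters):
        segmented = [segment(s, lexicon, max_len) for s in corpus]
        new_words = extract_candidates(segmented, lexicon, min_count, max_len)
        if not new_words:
            break
        lexicon |= new_words
    # Final pass with the augmented lexicon.
    return lexicon, [segment(s, lexicon, max_len) for s in corpus]


if __name__ == "__main__":
    corpus = ["我喜欢区块链", "区块链很好", "他研究区块链"]
    known = {"我", "喜欢", "很好", "他", "研究"}
    lexicon, segmented = iterative_extract(corpus, known)
    print(sorted(lexicon - known))
    print(segmented)
```

On this toy corpus the three-character word 区块链 is only found because of the iteration: the first pass can merge single characters into 区块, and the second pass, re-segmenting with the augmented lexicon, can then merge 区块 with 链. The crude frequency threshold also admits spurious fragments such as 块链, so a realistic system would need a stronger filter on candidates; that filtering step is omitted from this sketch.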