An iterative method for extracting unknown words from a Chinese text corpus is proposed in this paper. Unlike traditional non-iterative segmentation-detection approaches, which use only known words for segmentation, the proposed method iteratively extracts new words and adds them into the lexicon. Then the augmented dictionary, which includes known words and potential unknown words, is used in the next iteration to re-segment the input corpus. Experiments show that both the precision and recall rates of segmentation are improved.
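The segment, detect, and augment loop described above can be illustrated with a minimal sketch. The abstract does not specify the segmenter or the detection criterion, so everything here is an assumption made for illustration: forward maximum matching stands in for the segmenter, and a simple frequency threshold on adjacent single-character tokens stands in for the unknown-word detector. The names `segment`, `detect_candidates`, `iterative_extract`, `min_count`, and `max_iter` are hypothetical and do not come from the paper.

```python
from collections import Counter


def segment(text, lexicon):
    """Forward maximum matching (an assumed stand-in segmenter).

    At each position, take the longest lexicon word that matches;
    fall back to a single character when nothing matches.
    """
    max_len = max((len(w) for w in lexicon), default=1)
    tokens = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                i += size
                break
    return tokens


def detect_candidates(tokens, min_count=2):
    """Hypothetical detector: propose a new word when two adjacent
    single-character tokens co-occur at least `min_count` times."""
    pairs = Counter(
        a + b
        for a, b in zip(tokens, tokens[1:])
        if len(a) == 1 and len(b) == 1
    )
    return {word for word, count in pairs.items() if count >= min_count}


def iterative_extract(corpus, lexicon, max_iter=5):
    """Re-segment the corpus with an augmented lexicon until no new
    candidate words appear or `max_iter` rounds have run."""
    lexicon = set(lexicon)  # copy so the caller's lexicon is untouched
    tokens = segment(corpus, lexicon)
    for _ in range(max_iter):
        new_words = detect_candidates(tokens) - lexicon
        if not new_words:
            break
        lexicon |= new_words                # augment the dictionary
        tokens = segment(corpus, lexicon)   # re-segment the same corpus
    return tokens, lexicon


if __name__ == "__main__":
    tokens, lexicon = iterative_extract("谷歌发布新产品谷歌很大", {"发布", "产品"})
    print(tokens)
```

In this toy run, the first pass splits the unknown word 谷歌 into two single characters. Because that pair occurs twice, the stand-in detector adds it to the lexicon, and the second pass segments it as one word. A real detector would need stronger evidence than raw pair frequency, which is why the threshold rule is labeled as an assumption rather than the paper's criterion.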