An investigation of machine learning based automatic classification of Chinese books

Source: Journal of Library Science in China
Faced with the ever-increasing number of books published, library cataloging staff find it more and more difficult to classify books manually. How to realize automatic book classification with computers is a critical question in digital library construction. This paper attempts to introduce machine learning algorithms, such as the BP neural network and the Support Vector Machine, into the practice of book classification. A machine learning based hierarchical book classification system model is established according to the Chinese Library Classification. Design approaches are proposed for describing a bibliographic record with a feature weighting method, and the focus is on the construction of the shallow classification systems. A large-scale experiment is conducted to test the feasibility and appropriateness of this model. This research largely solves the problem of automatically classifying books without subject indexing.
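The abstract gives no implementation details, so the following is only a minimal sketch of the two ideas it names: describing a bibliographic record by weighted features, and classifying hierarchically along the Chinese Library Classification. Everything concrete here is an assumption made for illustration: the field names and weights in `FIELD_WEIGHTS`, the whitespace tokenizer (Chinese text would need word segmentation first), the toy records and CLC-style labels, and above all the nearest-centroid classifier, which stands in for the BP neural network and Support Vector Machine used in the paper.

```python
import math
from collections import Counter

# Hypothetical field weights; the abstract does not state the actual values.
FIELD_WEIGHTS = {"title": 3.0, "keywords": 2.0, "summary": 1.0}


def weighted_vector(record):
    """Turn a record (field name -> text) into a field-weighted term vector."""
    vec = Counter()
    for field, text in record.items():
        weight = FIELD_WEIGHTS.get(field, 1.0)
        for term in text.lower().split():  # assumption: whitespace tokens
            vec[term] += weight
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values()))
    norm *= math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class CentroidClassifier:
    """Stand-in for the paper's SVM / BP network: nearest class centroid."""

    def fit(self, records, labels):
        self.centroids = {}
        for record, label in zip(records, labels):
            self.centroids.setdefault(label, Counter()).update(
                weighted_vector(record)
            )
        return self

    def predict(self, record):
        vec = weighted_vector(record)
        return max(self.centroids, key=lambda c: cosine(vec, self.centroids[c]))


class HierarchicalClassifier:
    """Level 1 picks the main class (first letter); level 2 picks the subclass."""

    def fit(self, records, labels):
        self.top = CentroidClassifier().fit(records, [l[0] for l in labels])
        self.sub = {}
        for main in {l[0] for l in labels}:
            idx = [i for i, l in enumerate(labels) if l[0] == main]
            self.sub[main] = CentroidClassifier().fit(
                [records[i] for i in idx], [labels[i] for i in idx]
            )
        return self

    def predict(self, record):
        main = self.top.predict(record)
        return self.sub[main].predict(record)


# Toy training data with illustrative CLC-style labels.
train_records = [
    {"title": "neural network programming", "keywords": "computer algorithm"},
    {"title": "machine tool design", "keywords": "milling cutter"},
    {"title": "library cataloging practice", "keywords": "library classification"},
    {"title": "basketball training methods", "keywords": "sports coaching"},
]
train_labels = ["TP", "TH", "G2", "G8"]

model = HierarchicalClassifier().fit(train_records, train_labels)
print(model.predict({"title": "digital library cataloging"}))
```

Splitting the decision by level keeps each classifier small, since a subclass model only has to separate the classes under one main class. The cost is that an error at the top level cannot be corrected further down.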