The occurrence and progression of cancer are closely tied to genetic alterations, which present clinically as symptoms or abnormal test indicators. Mining and analyzing the relationships between clinical manifestations and genetic alterations can provide clinical decision support for the early diagnosis and precision treatment of cancer. Starting from literature data and using the conclusive findings reported there to mine the relationships between genes and clinical manifestations is therefore of considerable value. This paper proposes a method for mining relationships between biomedical entities based on Medical Subject Headings (MeSH). The method uses the literature information provided by PubMed and borrows the idea of the vector space model: each entity under study is represented as a vector of MeSH terms, citation relationships among the documents are introduced to correct the results, and relationship mining is thereby reduced to mathematical operations between vectors, which enables quantitative analysis. Applied to the study of clinical manifestations and genes in colorectal cancer, the method yielded 203 genes associated with colorectal cancer and 462 corresponding clinical manifestation-gene relationships. The results were analyzed and validated with gene function and pathway analysis tools such as g:Profiler and KEGG. They indicate that the MeSH-based literature mining method avoids both the limits that traditional "co-occurrence" methods place on discovering latent relationships and the heavy computation required by complex semantic analysis, and so offers a new approach to mining latent relationships between biological entities.
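The idea of representing entities as MeSH term vectors and comparing them numerically can be illustrated with a minimal sketch. The abstract does not state the term weighting, the similarity measure, or the form of the citation correction, so everything here is an assumption: `entity_vector` and `cosine` are hypothetical helper names, cosine similarity is one common choice within the vector space model, the optional per-document `weights` merely stand in for a citation-based correction, and the term lists are toy data invented for illustration, not results from the paper.

```python
import math
from collections import Counter


def entity_vector(docs, weights=None):
    """Build a sparse MeSH term vector for one entity.

    docs    -- list of documents, each given as a list of MeSH terms
    weights -- optional per-document weights (hypothetical stand-in for
               a citation-based correction; the paper's formula is not
               given in the abstract)
    """
    vec = Counter()
    for i, terms in enumerate(docs):
        w = 1.0 if weights is None else weights[i]
        # Count each term once per document, scaled by the document weight.
        for term in set(terms):
            vec[term] += w
    return vec


def cosine(u, v):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy data (assumed): MeSH terms of documents retrieved for a gene
# and for a clinical manifestation.
gene_docs = [
    ["Colorectal Neoplasms", "Mutation", "Anemia"],
    ["Colorectal Neoplasms", "Mutation"],
]
manifestation_docs = [
    ["Anemia", "Colorectal Neoplasms"],
    ["Anemia", "Occult Blood"],
]

score = cosine(entity_vector(gene_docs), entity_vector(manifestation_docs))
print(f"gene-manifestation similarity: {score:.3f}")
```

Because the two entities are compared through their whole term profiles, a pair can receive a nonzero score even when no single document mentions both, which is the property the abstract contrasts with "co-occurrence" methods.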