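Construction of an artificial neural network model for predicting the efficacy of first-line FOLFOX chemotherapy in metastatic colorectal cancer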

Source: Chinese Journal of Oncology (中华肿瘤杂志)
Objective: To construct an artificial neural network (ANN) model for predicting the efficacy of first-line FOLFOX chemotherapy (oxaliplatin + calcium folinate + fluorouracil) in metastatic colorectal cancer (mCRC).

Methods: A dataset of mCRC patients treated with first-line FOLFOX (GSE104645) was downloaded from the Gene Expression Omnibus (GEO) database of the US National Center for Biotechnology Information and used as the training set. It comprised 31 patients in the chemo-sensitive group (complete response or partial response) and 23 in the chemo-resistant group (stable disease or progressive disease). The training set was split 7:3 into internal training samples and internal test samples. Microarray data (accession number GSE69657) from 30 mCRC patients treated with first-line FOLFOX in the Department of Colorectal Surgery, Fujian Medical University Union Hospital, served as the external validation set, with 13 patients in the sensitive group and 17 in the resistant group. Batch effects between the two expression matrices were corrected with the ComBat package in R 3.5.1. Differential gene expression between the sensitive and resistant groups in GSE104645 was analyzed on the GEO2R platform. Resistance and sensitivity genes for the FOLFOX regimen were screened with a threshold on the P value and fold change (FC), which this text preserves only as "P0.33". A multilayer perceptron (MLP) was used to build the ANN efficacy model on the GSE104645 dataset. The GSE69657 expression matrix and clinical efficacy data were then used to validate the model externally. Receiver operating characteristic (ROC) curves were used to evaluate the predictive power of the model.

Results: A total of 2,076 differentially expressed genes were identified in GSE104645, of which 822 were up-regulated and 1,254 were down-regulated in the chemo-resistant group. The down-regulated genes were treated as sensitivity genes. Gene Ontology (GO) analysis showed that the differentially expressed genes were mainly enriched in the regulation of substance metabolism. The final ANN model included 39 genes and had two hidden layers. Within the training set, its accuracy was 75.7% on the training samples and 76.5% on the test samples, and the area under the ROC curve was 0.875. On the external validation set (GSE69657), the area under the ROC curve was 0.778.

Conclusions: An ANN model for predicting the efficacy of first-line FOLFOX chemotherapy in mCRC was successfully built from microarray data and validated on an independent external dataset. The model showed good stability and strong predictive performance. The results also suggest that gene functions related to oxaliplatin resistance are mainly enriched in the regulation of substance metabolism.
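The evaluation steps described in the Methods (a 7:3 internal split followed by ROC analysis) can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `stratified_split` and `roc_auc` are hypothetical, the split is assumed to be stratified by response group (the abstract does not say), and the MLP itself is left out. Only the class sizes (31 sensitive, 23 resistant) come from the abstract, so the resulting sample counts need not match those of the study.

```python
import random


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Split sample indices into train/test, shuffling each class separately.

    Assumption: the 7:3 split is stratified by class; the abstract does not
    state how the internal split was drawn.
    """
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(len(idx) * train_fraction)
        train_idx += idx[:cut]
        test_idx += idx[cut:]
    return sorted(train_idx), sorted(test_idx)


def roc_auc(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation.

    Equals the probability that a randomly chosen positive sample scores
    higher than a randomly chosen negative one; ties count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Class sizes from the abstract: 31 sensitive (label 1), 23 resistant (label 0).
labels = [1] * 31 + [0] * 23
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))

# Toy scores (illustrative only, not study data) to exercise roc_auc.
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

In practice the scores passed to `roc_auc` would be the predicted probabilities of the fitted model on the internal test samples or on the external validation set.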