Mining and Integrating Reliable Decision Rules for Imbalanced Cancer Gene Expression Data Sets

Source: Tsinghua Science and Technology

[Publication date]: 2012, Issue 06

[Keywords]: classifier seriously recognize benchmark imbalance avoiding minority Michigan ge

Excerpt from the paper

There have been many skewed cancer gene expression datasets in the post-genomic era. Extracting differentially expressed genes or constructing decision rules from these skewed datasets with traditional algorithms seriously underestimates the performance of the minority class, leading to inaccurate diagnosis in clinical trials. This paper presents a skewed gene selection algorithm that introduces a weighted metric into the gene selection procedure. The extracted genes are paired as decision rules to distinguish the two classes, and these decision rules are then integrated into an ensemble learning framework by majority voting to recognize test examples, thus avoiding tedious data normalization and classifier construction. Mining and integrating a few reliable decision rules gave classification performance higher than, or at least comparable to, that of many traditional class imbalance learning algorithms on four benchmark imbalanced cancer gene expression datasets.
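The pipeline described above (class-weighted gene selection, gene pairs as decision rules, majority voting) can be illustrated with a minimal sketch. The excerpt does not define the weighted metric or the exact form of the rules, so everything specific here is an assumption: each rule compares the expression of two genes within a single sample (a relative-expression rule, which needs no normalization), and each pair is scored by the difference between the per-class frequencies of that ordering, so that the minority class carries the same weight as the majority class. The names `mine_pair_rules` and `predict` are hypothetical, and both classes are assumed to be present in the training data.

```python
from itertools import combinations


def mine_pair_rules(X, y, k=3):
    """Return the k best pair rules as (i, j, label_if_less, score).

    X: list of samples, each a list of expression values; y: 0/1 labels.
    A rule predicts `label_if_less` when x[i] < x[j], else the other class.
    Assumed scoring, not taken from the paper.
    """
    n_genes = len(X[0])
    by_class = {c: [x for x, lab in zip(X, y) if lab == c] for c in (0, 1)}
    rules = []
    for i, j in combinations(range(n_genes), 2):
        # Frequency of the ordering x[i] < x[j] inside each class. Dividing
        # by the class's own size weights both classes equally, regardless
        # of how skewed the class sizes are.
        p = {c: sum(x[i] < x[j] for x in rows) / len(rows)
             for c, rows in by_class.items()}
        score = abs(p[1] - p[0])
        label_if_less = 1 if p[1] > p[0] else 0
        rules.append((i, j, label_if_less, score))
    # Stable sort: pairs with equal scores keep their enumeration order.
    rules.sort(key=lambda r: r[3], reverse=True)
    return rules[:k]


def predict(rules, x):
    """Majority vote over the rules; a tie goes to class 1 (assumed minority)."""
    votes = sum(lab if x[i] < x[j] else 1 - lab for i, j, lab, _ in rules)
    return 1 if 2 * votes >= len(rules) else 0
```

Calling `mine_pair_rules(X, y, k=3)` on a labeled expression matrix and passing the result to `predict(rules, x)` classifies a new sample `x`. Using an odd `k` avoids ties in the vote.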
