SHEsis PCA: A GPU-Based Software to Correct for Population Stratification that Efficiently Accelerat

Source: Journal of Genetics and Genomics
Population stratification is a problem in genetic association studies because it tends to highlight loci that underlie the population structure rather than disease-related loci. Principal component analysis (PCA) has proven to be an effective way to correct for population stratification. However, the conventional PCA algorithm is time-consuming when dealing with large datasets. We developed a graphics processing unit (GPU)-based PCA software named SHEsis PCA (http://analysis.bio-x.cn/SHEsis Main.htm) that is highly parallel, with a maximum speedup of more than 100-fold compared with its CPU version. A clustering algorithm based on X-means was also implemented to detect population subgroups and to obtain matched cases and controls, in order to reduce genomic inflation and increase power. A study of both simulated and real datasets showed that SHEsis PCA ran at an extremely high speed while the accuracy was hardly reduced. Therefore, SHEsis PCA can help correct for population stratification much more efficiently than conventional CPU-based algorithms.
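To make the idea of PCA-based stratification correction concrete, the following is a minimal CPU-only sketch in plain Python. It is not the SHEsis PCA implementation and uses no GPU. It assumes genotypes coded as 0/1/2 allele counts, a per-SNP normalization by sqrt(p(1-p)) of the kind commonly used in PCA on genotype data, and power iteration to obtain the leading component. All function names and the toy data are hypothetical and chosen only for illustration.

```python
import math
import random


def standardize_genotypes(genotypes):
    """Center each SNP column and scale it by sqrt(p * (1 - p)).

    `genotypes` is a list of rows (individuals), each a list of 0/1/2
    allele counts. Monomorphic SNPs are dropped (assumed convention).
    """
    n = len(genotypes)
    m = len(genotypes[0])
    kept_columns = []
    for j in range(m):
        column = [genotypes[i][j] for i in range(n)]
        mean = sum(column) / n
        p = mean / 2.0  # estimated allele frequency
        scale = math.sqrt(p * (1.0 - p))
        if scale == 0.0:
            continue  # no variation at this SNP
        kept_columns.append([(g - mean) / scale for g in column])
    # Transpose back to one row per individual.
    return [[col[i] for col in kept_columns] for i in range(n)]


def sample_covariance(X):
    """Individual-by-individual covariance matrix X X^T / m."""
    n = len(X)
    m = len(X[0])
    return [[sum(X[i][k] * X[j][k] for k in range(m)) / m
             for j in range(n)] for i in range(n)]


def top_component(K, iterations=200, seed=0):
    """Leading eigenvector of the symmetric matrix K via power iteration."""
    rng = random.Random(seed)
    n = len(K)
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    for _ in range(iterations):
        w = [sum(K[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("power iteration collapsed to the zero vector")
        v = [x / norm for x in w]
    return v


# Toy data (hypothetical): the first three individuals come from one
# subpopulation and the last three from another.
toy_genotypes = [
    [0, 0, 1, 0, 2, 2],
    [0, 1, 0, 0, 2, 1],
    [1, 0, 0, 0, 1, 2],
    [2, 2, 1, 2, 0, 0],
    [2, 1, 2, 2, 0, 1],
    [1, 2, 2, 2, 1, 0],
]

X = standardize_genotypes(toy_genotypes)
K = sample_covariance(X)
pc1 = top_component(K)
# The sign of each entry of pc1 indicates the inferred subgroup.
print([round(x, 3) for x in pc1])
```

In this toy example the leading component takes one sign for the first three individuals and the opposite sign for the last three, which is the kind of axis that can be used as a covariate or for matching cases and controls. The matrix-vector products inside the loop are the part that parallelizes naturally, which is why this class of algorithm is a good fit for GPU acceleration.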