To control false positives in genome-wide interaction assays: a conditional entropy-based approach

Source: Fifth National Conference on Bioinformatics and Systems Biology
  Background: Genome-wide analysis of gene-gene interactions has been recognized as a powerful and reasonable way to identify the missing heritability underlying complex diseases that is undetectable in current single-point association assays. Recently, several model-free data mining methods for detecting non-linear dependence between genetic loci have been developed, but most of them risk an inflated false positive (type I error) rate when testing the interaction hypothesis, a problem that is particularly salient when the main effects at one or both loci are large.
  Methods: In this report, we propose two conditional entropy-based metrics to control the inflated false positive rate induced by large main effects at the loci. We elucidate the mathematical relationship between the conditional mutual information and the interaction metric in the context of relative risk. Further, we evaluate the performance of these metrics for identifying interaction effects under different disease models through extensive simulations and real data applications.
  Results: Our results demonstrate that unconditional mutual information, although more powerful in detecting interactions, is very sensitive to main effects, particularly under the additive interaction model. In comparison, the conditional mutual information metrics, based on either genotype or haplotype, maintain a controlled false positive rate without substantial loss of power for detecting gene-gene interactions across a wide range of disease models.
  Conclusions: We recommend using the conventional unconditional data mining metrics to identify the "joint effect" of two genetic loci, followed by conditional mutual information to detect the genuine "interaction effect", in genome-wide interaction assays.
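  The contrast between unconditional and conditional mutual information can be illustrated with a toy calculation. The sketch below is not the authors' metric, which the abstract does not define. It uses generic plug-in estimators of I(X;Y) and I(X;Y|Z) on discrete samples. The function names, the binary locus coding, and the two four-sample disease models (`d_main`, `d_xor`) are all assumptions made for illustration.

```python
from collections import Counter
from math import log2


def entropy(samples):
    """Plug-in Shannon entropy (bits) of a sequence of hashable values."""
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in Counter(samples).values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def conditional_mutual_information(x, y, z):
    """I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)."""
    return (entropy(list(zip(x, z))) + entropy(list(zip(y, z)))
            - entropy(list(zip(x, y, z))) - entropy(z))


# Hypothetical toy data: two binary loci, every genotype combination once.
locus_a = [0, 0, 1, 1]
locus_b = [0, 1, 0, 1]
joint = list(zip(locus_a, locus_b))

# Model 1 (assumed): disease status depends on locus A only (pure main effect).
d_main = list(locus_a)
# Model 2 (assumed): disease status is the XOR of both loci (pure interaction).
d_xor = [a ^ b for a, b in zip(locus_a, locus_b)]

# Unconditional metric: information the joint genotype carries about disease.
joint_main = mutual_information(joint, d_main)
joint_xor = mutual_information(joint, d_xor)

# Conditional metric: information locus B adds about disease, given locus A.
cmi_main = conditional_mutual_information(locus_b, d_main, locus_a)
cmi_xor = conditional_mutual_information(locus_b, d_xor, locus_a)

print(joint_main, cmi_main)
print(joint_xor, cmi_xor)
```

  In this toy setting the unconditional metric is 1 bit under both models, so it cannot separate a large main effect from an interaction. The conditional metric is 0 bits under the main-effect model and 1 bit under the XOR model. Real data would additionally need a significance test, for example by permutation, which this sketch omits.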
Other literature
  Rationale: Cardiac conduction disease is a multifactorial complex disease. Genetic factors play critical roles in the formation of cardiac conductive tissue during
  A core task of drug discovery studies is to identify the dependency between the genetic/molecular makeup of the human body and disease phenotype. Here we pro
Conference
  Recent studies of geothermally heated aquatic ecosystems have found widely divergent viruses with unusual morphotypes. Archaeal viruses isolated from these h
  Background: The accumulation of knowledge on biological networks and high-throughput experimental data raises the need for robust, efficient, schematic and e
  Background: Non-structural protein 1, a highly conserved influenza virus protein, has been demonstrated previously to be a potential target for antiviral de
  Background: Knowledge of the detailed organization of nucleosomes across genomes and the mechanisms of nucleosome positioning is critical for understanding
  Background: Distal-less 3 (DLX3) is a homeodomain protein that belongs to the vertebrate DLX family. DLX3 acts as a transcripti
  Background: U6 snRNA, as a component of the spliceosome, is involved in the splicing of pre-mRNAs. There is no report about U6 in chicken. Methods: In this study
  Background: CTCF is a versatile zinc finger DNA-binding protein that functions as a highly conserved epigenetic transcriptional regulator. CTCF is known to a
  Background: Currently, there is no satisfactory method for detecting differentially expressed genes from RNA-seq data when only a single biologi