Background: Genome-wide analysis of gene-gene interactions has been recognized as a powerful and reasonable way to identify the missing heritability of complex diseases that is undetectable by current single-point association assays. Recently, several model-free data mining methods for detecting non-linear dependence between genetic loci have been developed, but most of them suffer from an inflated false positive (type I) error rate when testing the interaction hypothesis, which is particularly salient when the main effects at one or both loci are large.

Methods: In this report, we propose two conditional entropy-based metrics to control this inflated false positive error rate induced by large main effects at the loci. We elucidate the mathematical relationship between conditional mutual information and the interaction metric in the context of relative risk. Further, we evaluate the performance of these metrics for identifying interaction effects under different disease models through extensive simulations and real data applications.

Results: Our results demonstrate that unconditional mutual information, although more powerful in detecting interactions, is very sensitive to main effects, particularly under an additive interaction model. In comparison, the conditional mutual information metrics, based on either genotype or haplotype, maintain a controlled false positive error rate without substantial loss of power for detecting gene-gene interactions across a wide range of disease models.

Conclusions: We recommend using the conventional unconditional data mining metrics to identify the "joint effect" of two genetic loci, followed by the conditional mutual information metrics to detect a genuine "interaction effect", in genome-wide interaction assays.
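To make the two quantities concrete: the conditional mutual information referred to above is the standard information-theoretic quantity I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z), where conditioning on the genotype at one locus removes that locus's main effect from the dependence measure. The sketch below is a minimal plug-in estimator of both the unconditional and conditional metrics from genotype counts; it is an illustration under stated assumptions, not the authors' implementation, and the arrays g1, g2, and d are hypothetical toy data (genotypes coded 0/1/2 by minor-allele count, disease status coded 0/1).

```python
# Minimal sketch (not the authors' implementation) of plug-in estimation of
# unconditional and conditional mutual information for two biallelic loci
# and a binary disease status.
import numpy as np
from collections import Counter

def entropy(counts):
    """Shannon entropy (in nats) of an empirical distribution given raw counts."""
    p = np.asarray(list(counts), dtype=float)
    p = p[p > 0] / p.sum()
    return -(p * np.log(p)).sum()

def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), estimated by plug-in from samples."""
    hx = entropy(Counter(x).values())
    hy = entropy(Counter(y).values())
    hxy = entropy(Counter(zip(x, y)).values())
    return hx + hy - hxy

def conditional_mutual_information(x, y, z):
    """I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)."""
    hxz = entropy(Counter(zip(x, z)).values())
    hyz = entropy(Counter(zip(y, z)).values())
    hxyz = entropy(Counter(zip(x, y, z)).values())
    hz = entropy(Counter(z).values())
    return hxz + hyz - hxyz - hz

# Hypothetical toy data: genotypes at two loci and case/control labels.
rng = np.random.default_rng(0)
g1 = rng.integers(0, 3, size=2000)                         # genotype at locus 1
g2 = rng.integers(0, 3, size=2000)                         # genotype at locus 2
d = ((g1 + g2 + rng.normal(0, 1, 2000)) > 2).astype(int)   # disease status

# "Joint effect" screen: unconditional MI between the two-locus genotype and disease.
joint = mutual_information(list(zip(g1, g2)), d)
# "Interaction effect" test: MI between locus 2 and disease, conditioned on the
# genotype at locus 1, which discounts the main effect at locus 1.
interaction = conditional_mutual_information(g2, d, g1)
print(f"I((G1,G2); D) = {joint:.4f}   I(G2; D | G1) = {interaction:.4f}")
```

Under this toy data-generating process, where the two loci contribute additively, the unconditional metric is large while the conditional metric is close to zero, mirroring the abstract's point that a strong unconditional signal can reflect main effects rather than genuine interaction.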