Background: With the advance of single nucleotide polymorphism (SNP) genotyping arrays, there is increasing interest in using this powerful technology to profile chromosomal aberrations in tumors. However, a critical issue currently hampering further application of SNP arrays in cancer research is that cancer is often heterogeneous, with multiple sub-clones that exhibit distinct biological characteristics, including chromosomal aberrations. In addition, cancer heterogeneity is of great interest to oncologists, as studying it may shed light on the origin of cancer and on carcinogenesis.

Methods: To address the issue of cancer heterogeneity, we developed a novel statistical model named CHASE (Cancer Heterogeneity and Chromosomal Aberrations from SNP-array Experiments), based on the framework of the global-parameter hidden Markov model (HMM). By quantitatively delineating the genomic similarity and discrepancy between two cancer sub-clones, using global parameters that represent the percentages of the different sub-clones, we generated empirical emission probability density functions and incorporated them into the HMM. For model fitting and parameter estimation, the expectation-conditional maximization (ECM) algorithm was adopted in CHASE, and formulas were derived for parameter updating.

Results: Tests on both simulated and real SNP-array data showed that CHASE can automatically detect cancer heterogeneity from the results of SNP-array experiments. For each cancer sub-clone, various kinds of chromosomal aberrations, such as amplification, deletion, and loss of heterozygosity (LOH), can be precisely identified by CHASE. For example, by applying CHASE to the breast cancer sample "BLC_B1_T45", which had been manually identified as heterogeneous, we successfully identified two cancer sub-clones, with two chromosomal regions exhibiting distinct chromosomal aberrations.

Conclusions: To the best of our knowledge, CHASE is currently the most powerful computational method that can discover cancer heterogeneity and chromosomal aberrations for each sub-clone from SNP-array data.
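
To make the role of a global sub-clone percentage more concrete, the following is a minimal sketch of how the expected SNP-array signal at one locus could shift when two sub-clones are mixed. It is an illustration only and is not the emission model used in CHASE: the function name `mixed_signal`, the representation of each sub-clone as a (total copies, B-allele copies) pair, and the idealized formulas for log R ratio (LRR) and B-allele frequency (BAF) are all assumptions introduced here.

```python
import math


def mixed_signal(p, clone1, clone2):
    """Idealized expected (LRR, BAF) at one SNP for a two-sub-clone mixture.

    p       -- assumed fraction of cells belonging to sub-clone 1 (0..1)
    clone1  -- (total copy number, B-allele copy number) in sub-clone 1
    clone2  -- same pair for sub-clone 2

    Hypothetical helper for illustration; it ignores noise, normal-cell
    contamination, and array-specific signal saturation.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    total1, b1 = clone1
    total2, b2 = clone2
    # Average copy numbers over the cell population.
    total = p * total1 + (1.0 - p) * total2
    b_copies = p * b1 + (1.0 - p) * b2
    if total == 0:
        # Homozygous deletion in every cell: no signal to measure.
        return float("-inf"), float("nan")
    lrr = math.log2(total / 2.0)   # relative to a normal diploid baseline
    baf = b_copies / total         # fraction of the signal from the B allele
    return lrr, baf


# Sub-clone 1 is a normal heterozygote (AB); sub-clone 2 has lost the B allele.
lrr, baf = mixed_signal(0.5, (2, 1), (1, 0))
print(f"LRR={lrr:.3f}, BAF={baf:.3f}")
```

Under these assumptions, a heterozygous locus whose BAF sits between 0 and 0.5 together with a moderately negative LRR is consistent with a deletion present in only part of the sample, which is the kind of intermediate signal a sub-clone proportion parameter is meant to explain.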