A flexible hierarchical model for detecting DNA modifications from 3rd generation sequencing data

Source: 5th National Conference on Bioinformatics and Systems Biology (第五届全国生物信息学与系统生物学学术大会)
  Background: DNA modifications such as DNA methylation and DNA damage can play critical regulatory roles in biological systems. High-throughput DNA modification profiling can provide a more comprehensive view of the composition of DNA sequences, offering insights into biological phenomena beyond what can be discerned from As, Gs, Cs, and Ts alone. Pacific Biosciences' single-molecule, real-time (SMRT) sequencing technology generates DNA sequences as well as DNA polymerase kinetic information, which can be used for the direct detection of DNA modifications. However, no statistical model exists for the polymerase kinetic information. Herein, we propose a flexible hierarchical model that can greatly improve the accuracy of SMRT-sequencing-based DNA modification detection and reduce sequencing cost.

  Methods: We demonstrate that local sequence context has a strong impact on DNA polymerase kinetics in the neighborhood of the incorporation site during the DNA synthesis reaction. This makes it possible to estimate the expected kinetic rate of the enzyme at the incorporation site from kinetic rate information collected in existing SMRT sequencing data (historical data) that covers the same local sequence contexts. We develop a flexible hierarchical model that detects DNA modifications accurately by incorporating historical data.

  Results: Our results demonstrate that, when a negative control sample exists, the hierarchical model outperforms the naïve case-control method, in which the kinetics of whole-genome amplified (WGA) DNA (control) are compared with those of the corresponding native DNA (case) to detect kinetic variation events. When there is no negative control sample, the hierarchical model can still achieve reasonably good accuracy for detecting modifications that have a strong signal-to-noise ratio.

  Conclusions: We highlight the importance of local sequence context in 3rd-generation-sequencing-based DNA modification detection. By incorporating historical data, the proposed model can increase detection accuracy and reduce sequencing cost.
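
  The idea of borrowing strength from historical data can be illustrated with a small sketch. The abstract does not specify the authors' model, so the code below is not their method: it is a generic normal-normal hierarchical model, written under the assumption that the kinetic signal is a log-transformed per-read value (for example, a log inter-pulse duration) with a known read-level variance. All function names and parameters are hypothetical. Historical sites sharing the same local sequence context supply a prior for the unmodified kinetic rate. An optional control sample updates that prior, and the native reads are then scored against the resulting expectation.

```python
import math
from statistics import mean, pvariance


def fit_context_prior(site_means):
    """Prior for one sequence context (illustrative assumption).

    site_means: per-site mean log kinetic values from historical,
    unmodified data that share the same local sequence context.
    Returns (prior mean, between-site variance).
    """
    return mean(site_means), pvariance(site_means)


def expected_rate(prior_mean, prior_var, read_var, control=None):
    """Posterior mean and variance of the unmodified rate at one site.

    Without control reads the context prior is returned unchanged.
    With control reads, the standard normal-normal update is applied.
    Assumes prior_var > 0 and read_var > 0.
    """
    if not control:
        return prior_mean, prior_var
    n = len(control)
    precision = 1.0 / prior_var + n / read_var
    post_mean = (prior_mean / prior_var + sum(control) / read_var) / precision
    return post_mean, 1.0 / precision


def z_score(native, prior_mean, prior_var, read_var, control=None):
    """Standardized deviation of native reads from the expected rate."""
    mu, var = expected_rate(prior_mean, prior_var, read_var, control)
    # Uncertainty in the expectation plus sampling noise of the native mean.
    se = math.sqrt(var + read_var / len(native))
    return (mean(native) - mu) / se
```

  In this sketch, a control sample that agrees with the historical prior shrinks the uncertainty of the expected rate, so the same native signal yields a larger score. Without a control, only deviations that are large relative to the between-site variance stand out, which is in line with the abstract's remark that control-free detection works for modifications with a strong signal-to-noise ratio.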