A New Method of Short Read Mapping

Source: 5th National Conference on Bioinformatics and Systems Biology | Citations: 0 | Uploaded by: lrdg

【Authors】: Shi Jian Chen An Qi Wang Lei Li

【Institution】: Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China

【Source】: 5th National Conference on Bioinformatics and Systems Biology

【Publication date】: 2012, Issue 8

【Keywords】: Next Generation Sequencing; re-sequencing; short read alignment

Excerpt from the paper

　　Background: Next Generation Sequencing (NGS) methodology has dramatically increased the size of sequencing datasets and enabled novel biological applications. NGS has been applied to many areas, such as RNA-Seq, ChIP-Seq and MeDIP-Seq. In whole-genome re-sequencing projects on mammalian genomes, NGS usually generates billions of short read sequences (short reads). The computational cost of aligning such sequences can be very large.

　　Methods: We propose a new method of aligning Illumina short reads to the reference genome. The novelty of our method lies in the way it indexes the reference. We transform all K-mer subsequences of the reference sequence into natural numbers and use a fixed function to randomize all of them. We then sort these randomized numbers to form an index table. During the mapping process, the K-mer subsequences of the short reads are transformed into numbers and then randomized in exactly the same way as the K-mer subsequences of the reference. We make use of the statistical character of the index table to speed up the process of locating the randomized numbers in the index table. If a K-mer subsequence is found in the index table (we call it a seed), we then use the seed-and-extend method to produce the full alignment.

　　Results: We compare our method with Bowtie2 and SOAP2 on 3 short read data sets. The alignment rate of our method is comparable to that of Bowtie2 and about 10% higher than that of SOAP2. Our method is 2 to 5 times faster than Bowtie2 and comparable in speed to SOAP2.

　　Conclusions: Unlike complex data structures such as hash tables or the FM-index, our index table is simply the index of the sorted, randomized natural numbers of the K-mer subsequences of the reference. Moreover, we can make use of the statistical character of the index table to speed up the process of finding exact matches. Our method is fast and flexible owing to the character of the randomized and sorted index table. The results show that the speed and sensitivity of our method are comparable to or better than those of Bowtie2 and SOAP2.
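
　　The indexing and lookup steps described in the Methods can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not name the randomizing function, the value of K, or how the "statistical character" of the table is exploited. The sketch assumes a splitmix64-style mixing function (a bijection on 64-bit integers, so distinct K-mers with K ≤ 32 keep distinct keys), assumes the randomized keys are roughly uniform so that a key's position in the sorted table can be guessed directly from its value, and replaces full alignment with a simple ungapped mismatch count. All function names and parameters are hypothetical, and the reference is assumed to contain only A, C, G and T.

```python
from bisect import bisect_left

MASK = (1 << 64) - 1
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}

def kmer_to_int(kmer):
    """Encode a K-mer over ACGT as a base-4 natural number."""
    value = 0
    for base in kmer:
        value = value * 4 + CODE[base]
    return value

def randomize(x):
    """Fixed 64-bit mixing function (splitmix64 finalizer; an assumption)."""
    x = (x + 0x9E3779B97F4A7C15) & MASK
    x = ((x ^ (x >> 30)) * 0xBF58476D1CE4E5B9) & MASK
    x = ((x ^ (x >> 27)) * 0x94D049BB133111EB) & MASK
    return x ^ (x >> 31)

def build_index(reference, k):
    """Return (sorted randomized keys, reference positions in the same order)."""
    entries = sorted(
        (randomize(kmer_to_int(reference[i:i + k])), i)
        for i in range(len(reference) - k + 1)
    )
    return [key for key, _ in entries], [pos for _, pos in entries]

def lookup(keys, positions, key):
    """Find all reference positions whose randomized K-mer equals key.

    Because the keys are assumed uniform on [0, 2**64), key * n / 2**64 is a
    good first guess of the slot; a small window around the guess is widened
    until it brackets the key, then searched by bisection.
    """
    n = len(keys)
    if n == 0:
        return []
    guess = min(n - 1, (key * n) >> 64)
    lo, step = guess, 1
    while lo > 0 and keys[lo] >= key:      # widen to the left
        lo = max(0, lo - step)
        step *= 2
    hi, step = guess + 1, 1
    while hi < n and keys[hi] < key:       # widen to the right
        hi = min(n, hi + step)
        step *= 2
    i = bisect_left(keys, key, lo, hi)
    hits = []
    while i < n and keys[i] == key:
        hits.append(positions[i])
        i += 1
    return hits

def map_read(read, reference, keys, positions, k, max_mismatches=2):
    """Ungapped seed-and-extend: return sorted (start, mismatches) pairs."""
    results = {}
    for offset in range(0, len(read) - k + 1, k):   # non-overlapping seeds
        key = randomize(kmer_to_int(read[offset:offset + k]))
        for pos in lookup(keys, positions, key):
            start = pos - offset
            if start < 0 or start + len(read) > len(reference) or start in results:
                continue
            window = reference[start:start + len(read)]
            mismatches = sum(a != b for a, b in zip(read, window))
            if mismatches <= max_mismatches:
                results[start] = mismatches
    return sorted(results.items())
```

　　In this sketch, `build_index(reference, k)` is called once per reference, and `map_read` is then called for each read with the same `k`. Because the table is just a sorted array, the position guess replaces most of a binary search when the keys are close to uniform, which is one plausible reading of the speed-up the abstract describes.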
