Dear Editor,

The single-molecule real-time (SMRT) sequencing platform introduced by Pacific Biosciences (PacBio) is regarded as a third-generation sequencing technology (Eid et al., 2009; Roberts et al., 2013). PacBio delivers long reads ranging from several to tens of kilobases (kb), which are ideal for filling gaps left unsequenced because of unusual sequence contexts, such as high-GC content or repeat-rich regions (Bashir et al., 2012; Berlin et al., 2015; Chaisson et al., 2015). PacBio long reads are also well suited to detecting large DNA fragments harboring structural variations (SVs), such as inversions, translocations, duplications, and large insertions/deletions (indels) (Ritz et al., 2010; English et al., 2014). However, one drawback of PacBio is the high base-calling error rate at single-pass coverage of the genome (Au et al., 2012; Koren et al., 2012). This drawback can be mitigated by increasing sequencing coverage to achieve high consensus accuracy, but the coverage required may be prohibitive for the de novo assembly of large- or medium-sized genomes using PacBio alone, given both budgetary and computational costs. Alternatively, PacBio may be used to improve near-finished reference genome assemblies, especially by filling gaps in which unsequenced bases are represented by the letter N (English et al., 2012). Here, we combined PacBio reads (~15x) with Illumina reads (~40x) to improve the genome assemblies of African wild rice (Oryza barthii) and African cultivated rice (O. glaberrima), and to infer large SVs between O. barthii and O. glaberrima.
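To make the notion of an assembly gap concrete, the following is a minimal Python sketch that locates runs of the letter N in a scaffold sequence. It is an illustration only, not the procedure used in this study: the function name `find_gaps`, the `min_len` parameter, and the toy scaffold string are all hypothetical.

```python
import re
from typing import Iterator, Tuple


def find_gaps(sequence: str, min_len: int = 1) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) for each run of N/n in `sequence`.

    Coordinates are 0-based and half-open, so `end - start` is the gap length.
    `min_len` is a hypothetical filter for ignoring very short runs.
    """
    for match in re.finditer(r"[Nn]+", sequence):
        if match.end() - match.start() >= min_len:
            yield match.start(), match.end()


# Toy scaffold (made up for illustration, not real rice sequence).
scaffold = "ACGTNNNNNACGTTTnnACGN"
gaps = list(find_gaps(scaffold))
total_gap_bases = sum(end - start for start, end in gaps)
print(gaps, total_gap_bases)
```

Each reported interval marks a region that long reads spanning both flanks could, in principle, be used to fill.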