Although the whole-genome sequence of foxtail millet (Setaria italica) has been published, its gene annotation remains far from complete. In this study, RNA-Seq was therefore used to discover new genes in foxtail millet and to refine the structures of genes that were already annotated. Total RNA was extracted from leaves of the cultivar 'Jingu 21', a sequencing library was constructed, and paired-end sequencing was performed on the Illumina HiSeq2500 platform, yielding 37,072,949 high-quality clean reads. These reads were aligned to the reference genome of the cultivar 'Yugu 1', and 614 new genes were identified. The new genes were then functionally annotated against the COG, GO, KEGG, Swiss-Prot and NR databases, and annotation information was obtained for 438 of them. In addition, the structures of 7,175 annotated genes were optimized: the 5′ ends of 4,330 genes and the 3′ ends of 5,362 genes were extended. This study is intended to serve as a useful reference for subsequent functional genomics research in foxtail millet and for improving genome annotation in other organisms.