EzGenome and OrthoANI: database and bioinformatics tools for systematics of Bacteria and Archaea

Source: The 7th National Symposium on Microbial Resources and International Symposium on Microbial Systematics and Taxonomy
Next-generation sequencing and the accompanying bioinformatics have great potential for microbial systematics and taxonomy. Indeed, the number of genomes released into public-domain databases is increasing rapidly, although their use is hampered by the lack of adequate bioinformatics tools. The EzGenome database is being developed to help microbiologists in various sub-disciplines with curated databases and efficient bioinformatics tools. In this talk, I will explain the bioinformatics background of the EzGenome database and a new algorithm for calculating average nucleotide identity for taxonomic use, the OrthoANI algorithm, together with related bioinformatics tools.

Microbial taxonomy serves as a fundamental framework for all microbiological disciplines, and the species concept of Bacteria and Archaea in particular is of prime importance. Species demarcation in Bacteria and Archaea has been based mainly on overall genome relatedness. The current practice for obtaining these values between two strains is shifting from experimentally determined similarity, usually obtained by DNA-DNA hybridization (DDH), to genome sequence-based similarity. Average nucleotide identity (ANI) is a simple algorithm that mimics DDH: the genome sequence of a query strain is divided into 1,020 bp fragments, which are compared by the BLASTN program against the whole, unfragmented genome sequence of a subject strain. Because the query and subject genomes are treated differently, ANI between two genome sequences can be calculated in two reciprocal directions. The general practice is to take the average of the two reciprocal ANI calculations, though the two values may differ significantly from each other. We examined a large set of reciprocal ANI values of closely related species and found that 55% exhibited a discrepancy of over 0.1% between the reciprocal ANI values. Moreover, 1,101 pairs showed a discrepancy higher than 1%, the highest being a 4.15% difference. Given that ANI values of 95-96% are considered the species boundary, this level of discrepancy is significant enough to affect taxonomic interpretation.

To resolve this problem, we have developed a new ANI algorithm, named "OrthoANI", that includes the concept of orthology. From a large-scale calculation, we found that reciprocal OrthoANI values are almost always identical: the average discrepancy is 0.00042% with a maximum of 0.05%, which overcomes the limitation of the original ANI algorithm. The correlation between the original ANI and OrthoANI is very high, so the same range (95-96%) of OrthoANI values can be used as the species demarcation cutoff in place of the original ANI. It is therefore fair to say that the OrthoANI algorithm resolves the reciprocal inconsistency of the original ANI and provides a more robust way of calculating the similarity between two genome sequences for taxonomic use. Two different types of software for calculating OrthoANI are available at http://www.ezbiocloud.net/sw/oat.
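The fragmentation step of the original ANI and the reciprocal averaging described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, dropping a trailing piece shorter than 1,020 bp is an assumed convention, the 95% cutoff is simply the lower end of the 95-96% range quoted in the abstract, and the BLASTN comparison that produces the two directional ANI values is not shown.

```python
from typing import Dict, List

FRAGMENT_LEN = 1020  # fragment length stated in the abstract


def fragment_genome(seq: str, size: int = FRAGMENT_LEN) -> List[str]:
    """Cut the query genome into consecutive, non-overlapping pieces.

    Assumption: a trailing piece shorter than `size` is discarded.
    """
    return [seq[i:i + size] for i in range(0, len(seq) - size + 1, size)]


def reciprocal_summary(ani_ab: float, ani_ba: float,
                       cutoff: float = 95.0) -> Dict[str, object]:
    """Summarise two directional ANI values (percent), e.g. A->B and B->A.

    `cutoff` is an assumed species boundary taken from the 95-96% range.
    """
    return {
        "mean": (ani_ab + ani_ba) / 2,           # value usually reported
        "discrepancy": abs(ani_ab - ani_ba),     # reciprocal inconsistency
        # True when the two directions fall on opposite sides of the cutoff
        "calls_disagree": (ani_ab >= cutoff) != (ani_ba >= cutoff),
    }


if __name__ == "__main__":
    # Made-up inputs for illustration only; not data from the study.
    print(len(fragment_genome("ACGT" * 1000)))
    print(reciprocal_summary(96.0, 94.0))
```

The `calls_disagree` flag shows why a discrepancy of one percent or more matters near the species boundary: the two directions can land on opposite sides of the cutoff even when their average sits exactly on it.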