Gclust: A Parallel Clustering Tool for Microbial Genomic Data

Source: Genomics, Proteomics & Bioinformatics (English edition)


Authors: Ruilin Li, Xiaoyu He, Chuangchuang Dai, Haidong Zhu, Xianyu Lang, Wei Chen, Xiaodong Li, Dan Zhao, Yu Zhang, Xinyin Han, Tie Niu, Yi Zhao, Rongqiang Cao, Rong He, Zhonghua Lu, Xuebin Chi, Weizhong Li, Beifang Niu

Affiliations: Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China; University

Journal: Genomics, Proteomics & Bioinformatics (English edition)

Publication date: 2019, Issue 5

Keywords: Microbial genome clustering; Parallelization; Sparse suffix array; Maximal exact ma


Excerpt from the paper

The accelerating growth of public microbial genomic data imposes a substantial burden on the research community that uses such resources. Building databases of non-redundant reference sequences from massive microbial genomic data based on clustering analysis is essential. However, existing clustering algorithms perform poorly on long genomic sequences. In this article, we present Gclust, a parallel program for clustering complete or draft genomic sequences, in which clustering is accelerated by a novel parallelization strategy and a fast sequence comparison algorithm using sparse suffix arrays (SSAs). Moreover, genome identity measures between two sequences are calculated based on their maximal exact matches (MEMs). We demonstrate the high speed and clustering quality of Gclust by examining four genome sequence datasets. Gclust is freely available for non-commercial use at https://github.com/niu-lab/gclust. We also introduce a web server for clustering user-uploaded genomes at http://niulab.scgrid.cn/gclust.
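To make the idea of an MEM-based identity measure concrete, the following is a minimal Python sketch and not Gclust's actual implementation. It makes several assumptions that the excerpt does not confirm: MEMs are found with a simple k-mer seed-and-extend index standing in for the sparse suffix array, identity is taken as the fraction of query bases covered by MEMs, and clustering is a serial greedy pass from longest to shortest sequence. The names `find_mems`, `mem_identity`, `greedy_cluster`, `min_len`, and `threshold` are hypothetical, and reverse-complement matches and parallelization are left out.

```python
from collections import defaultdict


def find_mems(ref, qry, min_len):
    """Return maximal exact matches as (ref_start, qry_start, length) tuples.

    Assumption: a plain k-mer index replaces the sparse suffix array.
    """
    index = defaultdict(list)
    for i in range(len(ref) - min_len + 1):
        index[ref[i:i + min_len]].append(i)

    mems = []
    for j in range(len(qry) - min_len + 1):
        for i in index.get(qry[j:j + min_len], ()):
            # Skip seeds that can be extended to the left: the same match
            # is reported once, from its leftmost seed.
            if i > 0 and j > 0 and ref[i - 1] == qry[j - 1]:
                continue
            n = min_len
            # Extend to the right while the bases keep matching.
            while i + n < len(ref) and j + n < len(qry) and ref[i + n] == qry[j + n]:
                n += 1
            mems.append((i, j, n))
    return mems


def mem_identity(ref, qry, min_len):
    """Fraction of query bases covered by at least one MEM (assumed measure)."""
    if not qry:
        return 0.0
    covered = [False] * len(qry)
    for _, j, n in find_mems(ref, qry, min_len):
        for p in range(j, j + n):
            covered[p] = True
    return sum(covered) / len(qry)


def greedy_cluster(seqs, threshold, min_len):
    """Assign each sequence to the first representative it matches.

    Sequences are visited from longest to shortest; a sequence that matches
    no existing representative starts a new cluster.
    """
    order = sorted(seqs, key=lambda name: len(seqs[name]), reverse=True)
    clusters = {}  # representative name -> list of member names
    for name in order:
        for rep in clusters:
            if mem_identity(seqs[rep], seqs[name], min_len) >= threshold:
                clusters[rep].append(name)
                break
        else:
            clusters[name] = [name]
    return clusters


if __name__ == "__main__":
    genomes = {
        "a": "ACGTACGTTGCAAGCT",
        "b": "ACGTACGTTGCA",
        "c": "TTTTTTTTTTTT",
    }
    print(greedy_cluster(genomes, threshold=0.9, min_len=4))
```

In this toy input, sequence `b` is a prefix of `a` and so joins its cluster, while `c` shares no 4-mer with `a` and becomes its own representative. For real genomes the k-mer index would be far too memory-hungry, which is the kind of cost a sparse suffix array is meant to reduce.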
