ExaHDF5: Delivering Efficient Parallel I/O on Exascale Computing Systems

Source: Journal of Computer Science and Technology (English edition)
Scientific applications at exascale generate and analyze massive amounts of data. A critical requirement of these applications is the capability to access and manage this data efficiently on exascale systems. Parallel I/O, the key technology that enables moving data between compute nodes and storage, faces monumental challenges from the new applications and the memory and storage architectures considered in the designs of exascale systems. As the storage hierarchy expands to include node-local persistent memory, burst buffers, and similar layers alongside disk-based storage, data movement among these layers must be efficient. Parallel I/O libraries of the future should be capable of handling file sizes of many terabytes and beyond. In this paper, we describe new capabilities we have developed in Hierarchical Data Format version 5 (HDF5), the most popular parallel I/O library for scientific applications. HDF5 is one of the most used libraries at the leadership computing facilities for performing parallel I/O on existing HPC systems. The state-of-the-art features we describe include the Virtual Object Layer (VOL), Data Elevator, asynchronous I/O, full-featured single-writer and multiple-reader (Full SWMR), and parallel querying. We introduce these features, their implementations, and the performance and feature benefits to applications and other libraries.