Towards Auction-Based HPC Computing in the Cloud

Source: Computer Technology and Application
  Abstract: Cloud computing is expanding rapidly in the world of IT infrastructure, due in part to the cost-saving effect of economies of scale. Fair market conditions can in theory provide a healthy environment that reflects the most reasonable costs of computation. While fixed cloud pricing provides an attractive low entry barrier for compute-intensive applications, both the consumer and the supplier of computing resources can see higher returns on their investments by participating in auction-based exchanges. The cloud provider has strong incentives to offer auctioned resources; from the consumer's perspective, however, using these resources is a sparsely discussed challenge. This paper reports a methodology and framework designed to address the challenges of running HPC (High Performance Computing) applications on auction-based cloud clusters. The authors focus on HPC applications and describe a method for determining bid-aware checkpointing intervals. They extend a theoretical model for determining checkpoint intervals using statistical analysis of pricing histories. The latest developments in the SpotHPC framework, which aims to facilitate the managed execution of real MPI applications in auction-based cloud environments, are also introduced. The authors use their model to simulate a set of algorithms with different computing and communication densities. The results show the complex interactions between optimal bidding strategies and parallel application performance.
  Key words: Auction-based cloud computing, fault tolerance, cloud HPC (high performance computing).
  1. Introduction
  The economy of scale offers cloud computing virtually unlimited, cost-effective processing potential. While it is in general difficult to assess the real cost of a computing task, the auction-based provisioning scheme offers a reasonable pricing structure. Theoretically, prices under fair market conditions should reflect the most reasonable costs of computation. This fairness is ensured by mutual agreement between sellers and buyers.
  From the consumer’s perspective, among all computing applications, High Performance Computing (HPC) applications are the biggest potential beneficiaries. From the seller’s perspective, HPC applications represent the most reliable income stream, since they are the most resource-intensive users. Theoretically, resource usage efficiency is also maximized under auction-based provisioning schemes.
  Traditional HPC applications are typically optimized for hardware features to obtain processing efficiency. Since transient component errors can halt the entire application, it has become increasingly important to create autonomic applications that can automate checkpointing and restarting with little loss of useful work. Although existing HPC applications are not suited to volatile computing environments, with an automated Checkpoint-Restart (CPR) HPC toolkit it is plausible that practical HPC applications could gain additional cost advantages from auction-based resources by dynamically minimizing CPR overheads.
  Unlike existing MPI (Message Passing Interface) fault tolerance tools, the authors emphasize dynamically adjusting optimal CPR intervals in order to offset the large number of out-bid failures typical of volatile auction-based computing platforms. The authors introduce a formal model and an HPC application toolkit, SpotHPC, to facilitate the practical execution of real MPI applications on volatile auction-based cloud platforms.
  In section 2, the background and context of the current research are described. In section 3, the authors establish models for estimating the running times of HPC applications using auction-based cloud resources. The proposed models take into account the time complexities of the HPC application, the overheads of checkpoint-restart, and the publicly available resource bidding history. They seek to unravel the inter-dependencies between the applications’ computing/communication complexities, the number of required processors, bidding prices and the eventual processing costs. The authors then introduce the SpotHPC toolkit and show how it can automate MPI application processing using volatile resources under the guidance of the formal models. In section 4, the proposed models are applied to recent bidding histories of Amazon EC2 HPC resources, and preliminary results for two HPC application types with different computing and communication complexities are reported. Section 5 concludes and outlines potential future research directions.
  2. Background
  2.1 Auction-Based Computing: Spot Instances
  Amazon was one of the first cloud computing vendors to provide two types of cloud instances: on-demand instances and spot instances. An on-demand instance has a fixed price; once ordered, it provides service according to Amazon’s Service Level Agreement (SLA). A spot instance is a type of resource whose availability is controlled by the current bidding price and the auction market.
  The auction-based computing platform has several unique characteristics. First, a stable computing environment can potentially be established by bidding at the on-demand price. Second, lower costs can be gained if the application can tolerate partial failures; thus, the most fault-resilient implementation gains the best possible cost effectiveness. Third, given an application, its required processing time and/or budget requirements, as well as the bidding history of the required resources, it is possible to develop an optimized bidding strategy to meet the desired target(s).
  There are three special features of Amazon’s spot instance pricing policy:
  A successful bid does not guarantee the exclusive resource access for the entire requested duration. The Amazon engine can terminate access at any time if a higher bid is received;
  Amazon does not charge a partial hour (job terminated before reaching the hour boundary) if the termination is caused by out-bidding. Otherwise, the partial hour is charged in full if the user terminates the job;
  Amazon charges the user only the market price, which may be less than the user’s successful bid.
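  To make these billing rules concrete, the following sketch computes the bill for a single spot instance run. The function and its inputs are illustrative assumptions for this example, not Amazon's billing logic or part of the original paper.

```python
def spot_instance_bill(hourly_market_prices, partial_hour_price, out_bid_terminated):
    """Illustrative cost model for one spot instance run (an assumption,
    not Amazon's billing code).

    hourly_market_prices : market price for each fully completed instance-hour
    partial_hour_price   : market price for the final, incomplete hour
                           (None if the run ended exactly on an hour boundary)
    out_bid_terminated   : True if Amazon, not the user, ended the run
    """
    # Rule: every completed hour is billed at the market price,
    # which may be below the user's successful bid.
    total = sum(hourly_market_prices)

    # Rule: a partial hour is free only when termination was caused by
    # out-bidding; a user-initiated stop pays the partial hour in full.
    if partial_hour_price is not None and not out_bid_terminated:
        total += partial_hour_price
    return total


# Example: 3 completed hours at $0.54, then a mid-hour termination.
print(spot_instance_bill([0.54, 0.54, 0.54], 0.54, out_bid_terminated=False))  # 2.16
print(spot_instance_bill([0.54, 0.54, 0.54], 0.54, out_bid_terminated=True))   # 1.62
```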
  The authors have chosen two types of Amazon EC2 HPC resources for this study. The cc1.4xlarge and the cg1.4xlarge are cluster HPC instances that provide cluster-level performance (23 GB of memory, 8 cores, 10 Gigabit Ethernet). The main difference is the presence of GPUs (graphics processing units; 2 x NVIDIA Tesla “Fermi” M2050) in the cg1.4xlarge, which provides more power for compute-intensive applications (Table 1).
  Fig. 1 records a sample market price history for the cc1.4xlarge instance type from May 5 to May 11, 2011. This instance type shows typical user behavior for legacy HPC applications. The cg1.4xlarge instance type illustrates resources for HPC applications that can benefit from GPU processing. Since many legacy HPC applications are not yet suitable for GPU processing, the cg1.4xlarge pricing history shows fewer fluctuations.
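  Pricing histories such as the one in Fig. 1 can be retrieved programmatically. The following minimal sketch uses the boto3 SDK, which postdates the original work and serves here only as a stand-in for the 2011-era EC2 API; the cc1.4xlarge type has since been retired.

```python
# Minimal sketch: retrieve recent spot price history for one instance type.
# Assumes AWS credentials are configured in the environment.
from datetime import datetime, timedelta

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")
response = ec2.describe_spot_price_history(
    InstanceTypes=["cc1.4xlarge"],  # retired type; substitute a current one
    ProductDescriptions=["Linux/UNIX"],
    StartTime=datetime.utcnow() - timedelta(days=7),
    EndTime=datetime.utcnow(),
)
for record in response["SpotPriceHistory"]:
    print(record["Timestamp"], record["SpotPrice"])
```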
  2.2 HPC in the Cloud
  Although HPC applications are the biggest potential beneficiaries of cloud computing, for all but a few simple applications there remain many practical concerns:
  Most mathematical libraries rely on optimized numerical codes that exploit common hardware features for extreme efficiency. Some of these hardware features, the hardware cache being one example, are not mapped into virtual machines. Consequently, HPC applications suffer additional performance drawbacks on top of the normal virtualization overhead;
  Many HPC applications have high inter-processor communication demand. Current virtualized networks have difficulty meeting these high demands;
  All existing HPC applications handle only two communication states: success and failure. While success is a reliable state, failure is not: existing applications treat a timeout as identical to a failure. Consequently, any transient component failure can halt the entire application. Using volatile spot instances for these applications is therefore a serious challenge.
  Initially, low-end cloud services provided little guarantee on deliverable performance for HPC applications. Recently, high-end cloud resources have been developed specifically for HPC applications. These improvements have demonstrated promising features [1-3] and show the diminishing overhead of virtual machine monitors such as Xen [4]. Driven by steadily declining MTBFs (mean times between failures), fault tolerance for MPI applications has also progressed. These developments inspired the design and development of SpotHPC.
  2.3 Checkpoint-Restart (CPR) MPI Applications
  Much research has been done in the past to provide seamless fault tolerance for MPI applications. FT-MPI [5] uses interactive process fault tolerance. Starfish [6] supports a number of CPR protocols, and LA-MPI [7] provides message-level failure tolerance. Egida [8] experimented with a message-logging grammar specification as a means of fault tolerance. Cocheck [9] extends the Condor [10] scheduler to provide a set of fault tolerance methods that can be used by MPI applications. The authors chose OpenMPI’s coordinated CPR for its cluster-wide checkpoint advantage, since finer-grained strategies will not work in this highly volatile environment [11-13].
  The challenge faced by applications that wish to exploit the volatile spot instances is of a different kind than on regular clusters. The behavior of spot instances can be analyzed as a fail-stop mechanism: the Amazon engine uses the market information and the user’s bid to terminate an instance with no prior notice [14]. This is called an out-of-bid failure. Out-of-bid failures require applications to adapt to being frequently interrupted at run time.
  While there are many different fault tolerance libraries for MPI, the authors chose Open MPI’s coordinated CPR mechanism [15]. The high volatility of the spot instance platform makes many fine-grained checkpoint strategies impractical, including local checkpoints [11], multilevel checkpoints [12] and the use of pairs or groups of nodes to provide redundancy [13]. OpenMPI’s coordinated CPR is an all-or-nothing single-task CPR interface that, although it carries higher overheads, is guaranteed to work correctly regardless of the number of processors. Other single-task CPR efforts involving spot instances include map-reduce applications [16-17]. A map-reduce application does not require inter-task communication, and parallel processing can be controlled externally to the individual tasks. Therefore, spot instances can be used as “accelerators” via a simple job monitor that tracks and restarts dead jobs automatically [18]. Other noticeable efforts studying spot instances include Refs. [19-20]. These are based on simulating the behavior of a single instance under different bids, and they outline the inherent tradeoff between completion time and budget. Ref. [19] proposes a decision model and describes a simulator that can determine, under a set of conditions, the total expected time of a single application. Another study, Ref. [20], discusses a set of checkpoint strategies that can maximize the use of spot instances while minimizing costs.
  Resource allocation strategies are also identified in Refs. [21-22]. This work uses monetary and runtime estimation techniques to simulate the runtime of grid-based applications on such volatile infrastructure. These experiments also provide a heuristic study of the effects and benefits of generic fault tolerance techniques such as checkpointing, task duplication and migration.
  Additionally, the research carried out in Ref. [23] explores the spot instance markets and captures the long-term behavior of spot instance pricing, including the distributions of prices and inter-price times, as well as the difficulties of fitting analytical distributions to a young market that has shown only sparse variation so far. The research in Ref. [24] points out the existence of different epochs in pricing behavior and their implications for using spot price data. That report contributes to understanding how useful the publicly available data is, and how decision models need to be aware of changes in pricing policies and of the suppliers’ announcements concerning new prices and new pricing regions.
  While these research efforts have clarified some of the challenges of spot instances, only a few, such as Ref. [25], have touched specifically on HPC applications and on how even the nature of the application, high performance versus high throughput, affects the fault tolerance strategies used to mitigate interruptions due to market fluctuations. While describing strategies for single applications is crucial to understanding spot resources and auction-based computing in general, this knowledge is not fully usable in the context of HPC computing, and many issues need examination when applications are meant to scale to higher orders of magnitude.
  To the best of the authors’ knowledge, there has been no direct evaluation of practical MPI applications on spot instances. The volatile auction-based computing platform challenges the established HPC programming practices.
  2.4 Evaluating MPI Applications Using Auction-Based Platforms
  For HPC applications using large numbers of processors, the CPR overhead is the biggest cost factor. Without CPR optimization, MPI applications are unlikely to gain practical acceptance on volatile auction-based platforms.
  The authors report a theoretical model based on application resource time complexities [26] and optimal CPR models [27-28]. In addition, they describe a toolkit named SpotHPC that can support autonomic MPI applications on spot instance clusters. The toolkit can monitor spot instances and bidding prices, automate checkpointing at bid-history-adjusted optimal intervals, and automatically restart applications after out-bid faults.
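  The following sketch outlines the kind of monitor loop such a toolkit might run. It is a simplified illustration, not SpotHPC's actual implementation; it assumes the Open MPI 1.x-era ompi-checkpoint and ompi-restart commands [15, 31] and a fixed interval where SpotHPC would substitute the bid-aware interval of section 3.

```python
import subprocess
import time

CHECKPOINT_INTERVAL = 600  # seconds; SpotHPC would recompute this from bid history

def run_with_cpr(mpirun_cmd, restart_snapshot=None):
    """Run an MPI job under coordinated CPR, checkpointing periodically and
    resuming from the last snapshot after an out-of-bid failure.
    Illustrative only; not SpotHPC's actual code."""
    if restart_snapshot:
        # Resume the whole job from the most recent global snapshot.
        job = subprocess.Popen(["ompi-restart", restart_snapshot])
    else:
        job = subprocess.Popen(mpirun_cmd)

    snapshot = restart_snapshot
    while job.poll() is None:  # job still running
        time.sleep(CHECKPOINT_INTERVAL)
        if job.poll() is not None:
            break
        # Take a coordinated, cluster-wide checkpoint of the running job.
        result = subprocess.run(["ompi-checkpoint", str(job.pid)],
                                capture_output=True, text=True)
        if result.returncode == 0:
            snapshot = result.stdout.strip()  # global snapshot reference
    return job.returncode, snapshot

# Example invocation; -am ft-enable-cr turns on Open MPI's coordinated CPR.
# rc, snap = run_with_cpr(["mpirun", "-np", "8", "-am", "ft-enable-cr", "./app"])
```

  In practice the monitor would recompute the checkpoint interval from the bidding history on every cycle, as the model in section 3 prescribes.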
  3. Theoretical Model
  The auction prices vary dynamically, depending on supply and demand in the Amazon marketplace.
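  As a concrete illustration of how a bid-aware checkpoint interval can be derived from such price variations, the sketch below estimates an out-of-bid MTBF from a price history and applies Young's first-order [28] and Daly's higher-order [27] optimal-interval formulas. The MTBF estimator is a simplifying assumption for illustration, not the authors' statistical model.

```python
import math

def mtbf_from_history(prices, sample_interval, bid):
    """Estimate the mean time between out-of-bid failures for a given bid
    as the average spacing between price samples that exceed the bid.
    A simplifying assumption, not the paper's statistical model."""
    failures = sum(1 for p in prices if p > bid)
    if failures == 0:
        return float("inf")  # bid never exceeded in the observed window
    return len(prices) * sample_interval / failures

def young_interval(delta, mtbf):
    # Young's first-order approximation [28]: tau = sqrt(2 * delta * M),
    # where delta is the checkpoint overhead and M the MTBF (seconds).
    return math.sqrt(2 * delta * mtbf)

def daly_interval(delta, mtbf):
    # Daly's higher-order estimate [27], commonly stated for delta < 2M.
    if delta >= 2 * mtbf:
        return mtbf
    x = delta / (2 * mtbf)
    return math.sqrt(2 * delta * mtbf) * (1 + math.sqrt(x) / 3 + x / 9) - delta

# Example: 10-minute checkpoint overhead, one out-of-bid event per day.
delta, mtbf = 600.0, 86400.0
print(young_interval(delta, mtbf))  # ~10182 s
print(daly_interval(delta, mtbf))   # ~9786 s, slightly shorter
```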
  References
  [1] L. Youseff, R. Wolski, B. Gorda, C. Krintz, Evaluating the performance impact of Xen on MPI and process execution for HPC systems, in: Proceedings of the 2nd International Workshop on Virtualization Technology in Distributed Computing, 2006, p. 1.
  [2] C. Vecchiola, S. Pandey, R. Buyya, High-performance cloud computing: A view of scientific applications, in: Proceedings of 10th International Pervasive Systems, Algorithms, and Networks (ISPAN), 2009, pp. 4-16.
  [3] A. Iosup, S. Ostermann, N. Yigitbasi, R. Prodan, T. Fahringer, D. Epema, Performance analysis of cloud computing services for many-tasks scientific computing, IEEE Transactions on Parallel and Distributed Systems 22 (6) (2011) 931-945.
  [4] P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, R. Neugebauer, I. Pratt, A. Warfield, Xen and the art of virtualization, in: Proceedings of the 19th ACM Symposium on Operating Systems Principles, 2003, pp. 164-177.
  [5] G.E. Fagg, J. Dongarra, FT-MPI: Fault tolerant MPI, supporting dynamic applications in a dynamic world, in: Proceedings of the 7th European PVM/MPI Users’ Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface, 2000, pp. 346-353.
  [6] A. Agbaria, R. Friedman, Starfish: fault-tolerant dynamic MPI programs on clusters of workstations, in: Proceedings of the 8th International Symposium on High Performance Distributed Computing, 1999, pp. 167-176.
  [7] R. Graham, S. Choi, D. Daniel, N. Desai, R. Minnich, C. Rasmussen, L. Risinger, M. Sukalski, A network-failure-tolerant message-passing system for terascale clusters, International Journal of Parallel Programming 31 (4) (2003) 285-303.
  [8] S. Rao, L. Alvisi, H. Vin, Egida: An extensible toolkit for low-overhead fault-tolerance, in: Proceedings of the 29th Annual International Symposium on Fault-Tolerant Computing, 1999, pp. 48-55.
  [9] G. Stellner, Cocheck: Checkpointing and process migration for MPI, in: Proceedings of the 10th International Parallel Processing Symposium (IPPS ’96), IEEE Computer Society, 1996, pp. 526-531.
  [10] M. Litzkow, T. Tannenbaum, J. Basney, M. Livny, Checkpoint and migration of Unix processes in the condor distributed processing system, Technical Report, 1997.
  [11] J. Hursey, J. M. Squyres, T.I. Mattox, A. Lumsdaine, The design and implementation of checkpoint/restart process fault tolerance for open MPI, in: Proceedings of IEEE International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007, pp. 1-8.
  [12] A. Moody, G. Bronevetsky, K. Mohror, B.R. Supinski, Design, modeling, and evaluation of a scalable multi-level checkpointing system, in: Proceedings of International High Performance Computing, Networking, Storage and Analysis (SC) Conference, 2010, pp. 1-11.
  [13] G. Zheng, L. Shi, L. Kale, FTC-Charm++: An in-memory checkpoint-based fault tolerant runtime for Charm++ and MPI, in: Proceedings of 2004 IEEE International Conference on Cluster Computing, 2004, pp. 93-103.
  [14] Amazon HPC Cluster Instances, 2011, available online at: http://aws.amazon.com/ec2/hpcapplications/.
  [15] J. Hursey, Coordinated checkpoint/restart process fault tolerance for MPI applications on HPC systems, Ph.D. Dissertation, Indiana University, Bloomington, IN, USA, July 2010.
  [16] J. Dean, S. Ghemawat, MapReduce: Simplified data processing on large clusters, Communications of the ACM 51 (1) (2008) 107-113.
  [17] D. Borthakur, The Hadoop Distributed File System: Architecture and Design, 2007, available online at: http://developer.yahoo.com/hadoop/tutorial/.
  [18] N. Chohan, C. Castillo, M. Spreitzer, M. Steinder, A. Tantawi, C. Krintz, See spot run: Using spot instances for MapReduce workflows, in: Proceedings of the 2nd USENIX Conference on Hot Topics in Cloud Computing, USENIX Association, 2010, p. 7.
  [19] A. Andrzejak, D. Kondo, S. Yi, Decision model for cloud computing under SLA constraints, in: Proceedings of IEEE International Modeling, Analysis & Simulation of Computer and Telecommunication Systems (MASCOTS), 2010, pp. 257-266.
  [20] S. Yi, D. Kondo, A. Andrzejak, Reducing costs of spot instances via checkpointing in the Amazon elastic compute cloud, in: 2010 IEEE 3rd International Conference on Cloud Computing, 2010, pp. 236-243.
  [21] W. Voorsluys, R. Buyya, Reliable provisioning of spot instances for compute-intensive applications, arXiv:1110.5969v1, 2011.
  [22] W. Voorsluys, S. Garg, R. Buyya, Provisioning spot market cloud resources to create cost-effective virtual clusters, in: Proceedings of the 11th International Conference on Algorithms and Architectures for Parallel Processing, 2011, pp. 395-408.
  [23] B. Javadi, R. Buyya, Comprehensive statistical analysis and modeling of spot instances in public cloud environments, Technical Report CLOUDS-TR-2011-1, The University of Melbourne, 2011.
  [24] O. Ben-Yehuda, M. Ben-Yehuda, A. Schuster, D. Tsafrir, Deconstructing Amazon EC2 spot instance pricing, Technion-Israel Institute of Technology, Technical Report CS-2011-09, 2011.
  [25] M. Taifi, J. Shi, A. Khreishah, SpotMPI: A framework for auction-based HPC computing using Amazon spot instances, in: Proceedings of ICA3PP, 2011, pp. 109-120.
  [26] J. Shi, Program scalability analysis, in: International Conference on Distributed and Parallel Processing, Georgetown University, Washington D.C., October 1997.
  [27] J. Daly, A higher order estimate of the optimum checkpoint interval for restart dumps, Future Generation Computer Systems 22 (3) (2006) 303-312.
  [28] J. Young, A first order approximation to the optimum checkpoint interval, Communications of the ACM 17 (9) (1974) 530-531.
  [29] Q. Zhang, E. Gurses, R. Boutaba, J. Xiao, Dynamic resource allocation for spot markets in clouds, in: Proceedings of the 11th USENIX Conference on Hot Topics in Management of Internet, Cloud, and Enterprise Networks and Services, 2011.
  [30] W. Gropp, E. Lusk, Fault tolerance in MPI programs, Special Issue of International Journal of High Performance Computing Applications 18 (2002) 363-372.
  [31] P. Hargrove, J. Duell, Berkeley lab checkpoint/restart (BLCR) for Linux clusters, Journal of Physics: Conference Series 46 (2006) 494-499.
  [32] M. Taifi, HPCFY-Virtual HPC Cluster Orchestration Library, available online at: https://github.com/moutai/HPCFY.
  [33] M. Massie, B. Chun, D. Culler, The ganglia distributed monitoring system: design, implementation, and experience, Parallel Computing 30 (7) (2004) 817-840.
  [34] K. Blathras, D. Szyld, Y. Shi, Timing models and local stopping criteria for asynchronous iterative algorithms, Journal of Parallel and Distributed Computing 58 (3) (1999) 446-465.
  [35] G. Amdahl, Validity of the single processor approach to achieving large scale computing capabilities, in: Proceedings of the April 18-20, 1967, Spring Joint Computer Conference (AFIPS ’67), ACM, pp. 483-485.
  [36] J. Shi, M. Taifi, A. Khreishah, J. Wu, Sustainable GPU computing at scale, in: 14th IEEE International Conference in Computational Science and Engineering, 2011, pp. 263-272.