Reading notes on part of the paper
Datacenter workloads demand high computational capability, flexibility, power efficiency, and low cost, and improving all of these factors at once is challenging. To advance datacenter capabilities beyond what commodity server designs can provide, we designed and built a composable, reconfigurable fabric to accelerate portions of large-scale software services. Each instance of the fabric consists of a 6×8 two-dimensional torus of high-end Stratix V FPGAs embedded in a half-rack of 48 machines. One FPGA is placed in each server, is accessible through PCIe, and is wired directly to other FPGAs with pairs of 10 Gb SAS cables. In this paper, we describe a medium-scale deployment of this fabric on a bed of 1,632 servers and measure its efficacy in accelerating the Bing search engine. We describe the requirements and architecture of the system, detail the critical engineering challenges and the solutions needed to keep the system robust in the presence of failures, and measure the performance, power, and resilience of the system when ranking candidate documents. Under high load, the large-scale reconfigurable fabric improves the ranking throughput of each server by 95% at a fixed latency distribution or, while maintaining equivalent throughput, reduces tail latency by 29%.
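To make the 6×8 torus concrete, the following is a minimal Python sketch of how neighbors, links, and hop counts work in a two-dimensional torus of 48 nodes. Only the 6×8 dimensions, the 48 machines per half-rack, and the 1,632-server total come from the abstract. The (row, column) coordinate scheme and the function names `torus_neighbors`, `torus_links`, and `hop_distance` are assumptions made for illustration and are not taken from the paper, which may number and wire the FPGAs differently.

```python
# Dimensions from the abstract: a 6x8 torus, one FPGA per server, 48 per half-rack.
ROWS, COLS = 6, 8
SERVERS_PER_INSTANCE = ROWS * COLS
TOTAL_SERVERS = 1632  # size of the deployment bed stated in the abstract


def torus_neighbors(row, col, rows=ROWS, cols=COLS):
    """Return the four wrap-around neighbors of (row, col): north, south, west, east.

    The coordinate scheme is hypothetical; it only illustrates torus wrap-around.
    """
    return [
        ((row - 1) % rows, col),
        ((row + 1) % rows, col),
        (row, (col - 1) % cols),
        (row, (col + 1) % cols),
    ]


def torus_links(rows=ROWS, cols=COLS):
    """Enumerate every undirected FPGA-to-FPGA link exactly once."""
    links = set()
    for r in range(rows):
        for c in range(cols):
            for neighbor in torus_neighbors(r, c, rows, cols):
                links.add(frozenset([(r, c), neighbor]))
    return links


def hop_distance(a, b, rows=ROWS, cols=COLS):
    """Minimum number of hops between two nodes, taking the shorter way around each ring."""
    dr = abs(a[0] - b[0])
    dc = abs(a[1] - b[1])
    return min(dr, rows - dr) + min(dc, cols - dc)


if __name__ == "__main__":
    print(torus_neighbors(0, 0))
    print(len(torus_links()))
    print(hop_distance((0, 0), (3, 4)))
    print(TOTAL_SERVERS // SERVERS_PER_INSTANCE)
```

Under these assumptions, every node has exactly four neighbors, so one instance has 48 × 4 / 2 = 96 links, and no two nodes are more than 3 + 4 = 7 hops apart. The 1,632 servers divide evenly into 34 instances of 48 machines each.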