The last-level cache (LLC) in a multi-core system is a key determinant of overall performance. To provide a fine-grained, low-latency, low-cost management mechanism for the shared last-level cache, the system performance goal is translated into a replacement probability over the cache resources each core currently occupies; this probability determines how much of each core's cache may be reclaimed. When a core needs additional cache resources, one core is selected from the candidates that can supply replaceable resources, preferring a candidate that is physically close and has a high replacement probability, and replacement is then carried out at the granularity of a single cache block. In this way, cache resources are dynamically partitioned among the cores. Compared with the traditional coarse-grained scheme that repartitions at the granularity of associativity (entire cache ways), block-level replacement offers finer granularity and greater flexibility. In addition, combining physical-location information with the replacement probability keeps each core's cache resources clustered near that core in the physical layout, which reduces access latency. The proposed method requires only a very small amount of additional hardware. Experimental results show that, depending on the experimental scenario and the baseline being compared against, the method improves performance by 6.8% to 22.7% over existing approaches.
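The victim-core selection step described above (prefer a candidate core that is both physically close and has a high replacement probability) can be sketched as follows. This is an illustrative assumption, not the paper's actual selection formula: the scoring rule that divides replacement probability by hop distance, and the names `Candidate` and `select_victim_core`, are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    core_id: int
    hop_distance: int    # physical distance (hops) to the requesting core
    replace_prob: float  # replacement probability assigned to this core

def select_victim_core(candidates):
    """Pick the candidate core with the best trade-off between a high
    replacement probability and a short physical distance. Returns None
    if no core can supply replaceable cache blocks."""
    if not candidates:
        return None
    # Favor high replacement probability, penalize long hop distances.
    return max(candidates, key=lambda c: c.replace_prob / (1 + c.hop_distance))

# Core 1 has the highest probability but is far away; core 3 is nearby
# with a relatively high probability, so it wins under this scoring rule.
cands = [Candidate(1, 3, 0.9), Candidate(2, 1, 0.6), Candidate(3, 1, 0.8)]
print(select_victim_core(cands).core_id)  # prints 3
```

Once a victim core is chosen, the mechanism would evict one cache block from that core and hand the freed block to the requesting core, which is what gives the scheme its block-level (rather than way-level) granularity.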