The cache hierarchy is a key determinant of overall system performance and power dissipation, because an off-chip memory access costs many more cycles and much more energy than an on-chip access. In addition, multi-core processors are expected to place ever higher bandwidth demands on the memory system. Both factors make it important to avoid off-chip memory accesses by improving the efficiency of the on-chip cache. Future multi-core processors will have many large cache banks connected by a network and shared by many cores. Several important problems must therefore be solved: cache resources must be allocated across many cores, data must be placed in cache banks near the accessing core, and the most important data must be identified for retention. Finally, difficulties in scaling existing technologies require adapting to and exploiting new technology constraints.

This book attempts a synthesis of recent cache research that has focused on innovations for multi-core processors. It is an excellent starting point for early-stage graduate students, researchers, and practitioners who wish to understand the landscape of recent cache research. The book is suitable as a reference for advanced computer architecture classes as well as for experienced researchers and VLSI engineers.

Table of Contents: Basic Elements of Large Cache Design / Organizing Data in CMP Last Level Caches / Policies Impacting Cache Hit Rates / Interconnection Networks within Large Caches / Technology / Concluding Remarks