Multi-Core Cache Hierarchies.pdf


Preface

Acknowledgments

Basic Elements of Large Cache Design

Shared Vs. Private Caches

Shared LLC

Private LLC

Workload Analysis

Centralized Vs. Distributed Shared Caches

Non-Uniform Cache Access

Inclusion

Organizing Data in CMP Last Level Caches

Data Management for a Large Shared NUCA Cache

Placement/Migration/Search Policies for D-NUCA

Replication Policies in Shared Caches

OS-based Page Placement

Data Management for a Collection of Private Caches

Discussion

Policies Impacting Cache Hit Rates

Cache Partitioning for Throughput and Quality-of-Service

Introduction

Throughput

QoS Policies

Selecting a Highly Useful Population for a Large Shared Cache

Replacement/Insertion Policies

Novel Organizations for Associativity

Block-Level Optimizations

Summary

Interconnection Networks within Large Caches

Basic Large Cache Design

Cache Array Design

Cache Interconnects

Packet-Switched Routed Networks

The Impact of Interconnect Design on NUCA and UCA Caches

NUCA Caches

UCA Caches

Innovative Network Architectures for Large Caches

Technology

Static-RAM Limitations

Parameter Variation

Modeling Methodology

Mitigating the Effects of Process Variation

Tolerating Hard and Soft Errors

Leveraging 3D Stacking to Resolve SRAM Problems

Emerging Technologies

3T1D RAM

Embedded DRAM

Non-Volatile Memories

Concluding Remarks

Bibliography

Authors' Biographies

Series ISSN: 1935-3235

SYNTHESIS LECTURES ON COMPUTER ARCHITECTURE

Series Editor: Mark D. Hill, University of Wisconsin

Multi-Core Cache Hierarchies

Rajeev Balasubramonian, University of Utah
Norman Jouppi, HP Labs
Naveen Muralimanohar, HP Labs

A key determinant of overall system performance and power dissipation is the cache hierarchy since access to off-chip memory consumes many more cycles and energy than on-chip accesses. In addition, multi-core processors are expected to place ever higher bandwidth demands on the memory system. All these issues make it important to avoid off-chip memory access by improving the efficiency of the on-chip cache. Future multi-core processors will have many large cache banks connected by a network and shared by many cores. Hence, many important problems must be solved: cache resources must be allocated across many cores, data must be placed in cache banks that are near the accessing core, and the most important data must be identified for retention. Finally, difficulties in scaling existing technologies require adapting to and exploiting new technology constraints.

The book attempts a synthesis of recent cache research that has focused on innovations for multi-core processors. It is an excellent starting point for early-stage graduate students, researchers, and practitioners who wish to understand the landscape of recent cache research. The book is suitable as a reference for advanced computer architecture classes as well as for experienced researchers and VLSI engineers.

About SYNTHESIS

This volume is a printed version of a work that appears in the Synthesis Digital Library of Engineering and Computer Science. Synthesis Lectures provide concise, original presentations of important research and development topics, published quickly, in digital and print formats. For more information visit www.morganclaypool.com

Morgan & Claypool Publishers

ISBN: 978-1-59829-753-9

Multi-Core Cache Hierarchies

Synthesis Lectures on Computer Architecture

Editor: Mark D. Hill, University of Wisconsin

Synthesis Lectures on Computer Architecture publishes 50- to 100-page publications on topics pertaining to the science and art of designing, analyzing, selecting and interconnecting hardware components to create computers that meet functional, performance and cost goals. The scope will largely follow the purview of premier computer architecture conferences, such as ISCA, HPCA, MICRO, and ASPLOS.

Multi-Core Cache Hierarchies. Rajeev Balasubramonian, Norman P. Jouppi, and Naveen Muralimanohar. 2011

A Primer on Memory Consistency and Cache Coherence. Daniel J. Sorin, Mark D. Hill, and David A. Wood. 2011

Dynamic Binary Modification: Tools, Techniques, and Applications. Kim Hazelwood. 2011

Quantum Computing for Computer Architects, Second Edition. Tzvetan S. Metodi, Arvin I. Faruque, and Frederic T. Chong. 2011

High Performance Datacenter Networks: Architectures, Algorithms, and Opportunities. Dennis Abts and John Kim. 2011

Processor Microarchitecture: An Implementation Perspective. Antonio González, Fernando Latorre, and Grigorios Magklis. 2010

Transactional Memory, 2nd edition. Tim Harris, James Larus, and Ravi Rajwar. 2010

Computer Architecture Performance Evaluation Methods. Lieven Eeckhout. 2010

Introduction to Reconfigurable Supercomputing. Marco Lanzagorta, Stephen Bique, and Robert Rosenberg. 2009

On-Chip Networks. Natalie Enright Jerger and Li-Shiuan Peh. 2009

The Memory System: You Can’t Avoid It, You Can’t Ignore It, You Can’t Fake It. Bruce Jacob. 2009

Fault Tolerant Computer Architecture. Daniel J. Sorin. 2009

The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines. Luiz André Barroso and Urs Hölzle. 2009

Computer Architecture Techniques for Power-Efficiency. Stefanos Kaxiras and Margaret Martonosi. 2008

Chip Multiprocessor Architecture: Techniques to Improve Throughput and Latency. Kunle Olukotun, Lance Hammond, and James Laudon. 2007

Transactional Memory. James R. Larus and Ravi Rajwar. 2006

Quantum Computing for Computer Architects. Tzvetan S. Metodi and Frederic T. Chong. 2006

Copyright © 2011 by Morgan & Claypool

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopy, recording, or any other), except for brief quotations in printed reviews, without the prior permission of the publisher.

Multi-Core Cache Hierarchies
Rajeev Balasubramonian, Norman P. Jouppi, and Naveen Muralimanohar
www.morganclaypool.com

ISBN: 9781598297539 (paperback)
ISBN: 9781598297546 (ebook)

DOI 10.2200/S00365ED1V01Y201105CAC017

A Publication in the Morgan & Claypool Publishers series SYNTHESIS LECTURES ON COMPUTER ARCHITECTURE, Lecture #17

Series Editor: Mark D. Hill, University of Wisconsin

Series ISSN, Synthesis Lectures on Computer Architecture: Print 1935-3235, Electronic 1935-3243

Multi-Core Cache Hierarchies

Rajeev Balasubramonian, University of Utah
Norman P. Jouppi, HP Labs
Naveen Muralimanohar, HP Labs

SYNTHESIS LECTURES ON COMPUTER ARCHITECTURE #17

Morgan & Claypool Publishers

ABSTRACT

A key determinant of overall system performance and power dissipation is the cache hierarchy since access to off-chip memory consumes many more cycles and energy than on-chip accesses. In addition, multi-core processors are expected to place ever higher bandwidth demands on the memory system. All these issues make it important to avoid off-chip memory access by improving the efficiency of the on-chip cache. Future multi-core processors will have many large cache banks connected by a network and shared by many cores. Hence, many important problems must be solved: cache resources must be allocated across many cores, data must be placed in cache banks that are near the accessing core, and the most important data must be identified for retention. Finally, difficulties in scaling existing technologies require adapting to and exploiting new technology constraints.

The book attempts a synthesis of recent cache research that has focused on innovations for multi-core processors. It is an excellent starting point for early-stage graduate students, researchers, and practitioners who wish to understand the landscape of recent cache research. The book is suitable as a reference for advanced computer architecture classes as well as for experienced researchers and VLSI engineers.

KEYWORDS

computer architecture, multi-core processors, cache hierarchies, shared and private caches, non-uniform cache access (NUCA), quality-of-service, cache partitions, replacement policies, memory prefetch, on-chip networks, memory cells
