
Computer Architecture Performance Evaluation Methods

Preface
Acknowledgments
Introduction
Structure of computer architecture (r)evolution
Importance of performance evaluation
Book outline
Performance Metrics
Single-threaded workloads
Multi-threaded workloads
Multiprogram workloads
System throughput
Average normalized turnaround time
Comparison to prevalent metrics
STP versus ANTT performance evaluation
Average performance
Harmonic and arithmetic average: Mathematical viewpoint
Geometric average: Statistical viewpoint
Final thought on averages
Partial metrics
Workload Design
From workload space to representative workload
Reference workload
Towards a reduced workload
PCA-based workload design
General framework
Workload characterization
Principal component analysis
Cluster analysis
Applications
Plackett and Burman based workload design
Limitations and discussion
Analytical Performance Modeling
Empirical versus mechanistic modeling
Empirical modeling
Linear regression
Non-linear and spline-based regression
Neural networks
Mechanistic modeling: interval modeling
Interval model fundamentals
Modeling I-cache and I-TLB misses
Modeling branch mispredictions
Modeling short back-end miss events using Little's law
Modeling long back-end miss events
Miss event overlaps
The overall model
Input parameters to the model
Predecessors to interval modeling
Follow-on work
Multiprocessor modeling
Hybrid mechanistic-empirical modeling
Simulation
The computer architect's toolbox
Functional simulation
Alternatives
Operating system effects
Full-system simulation
Specialized trace-driven simulation
Trace-driven simulation
Execution-driven simulation
Taxonomy
Dealing with non-determinism
Modular simulation infrastructure
Need for simulation acceleration
Sampled Simulation
What sampling units to select?
Statistical sampling
Targeted sampling
Comparing design alternatives through sampled simulation
How to initialize architecture state?
Fast-forwarding
Checkpointing
How to initialize microarchitecture state?
Cache state warmup
Predictor warmup
Processor core state
Sampled multiprocessor and multi-threaded processor simulation
Statistical Simulation
Methodology overview
Applications
Single-threaded workloads
Statistical profiling
Synthetic trace generation
Synthetic trace simulation
Multi-program workloads
Multi-threaded workloads
Other work in statistical modeling
Parallel Simulation and Hardware Acceleration
Parallel sampled simulation
Parallel simulation
FPGA-accelerated simulation
Taxonomy
Example projects
Concluding Remarks
Topics that this book did not cover (yet)
Measurement bias
Design space exploration
Simulator validation
Future work in performance evaluation methods
Challenges related to software
Challenges related to hardware
Final comment
Bibliography
Author's Biography
Computer Architecture Performance Evaluation Methods
Copyright © 2010 by Morgan & Claypool

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopy, recording, or any other), except for brief quotations in printed reviews, without the prior permission of the publisher.

Computer Architecture Performance Evaluation Methods
Lieven Eeckhout
www.morganclaypool.com

ISBN: 9781608454679 paperback
ISBN: 9781608454686 ebook

DOI: 10.2200/S00273ED1V01Y201006CAC010

A Publication in the Morgan & Claypool Publishers series
SYNTHESIS LECTURES ON COMPUTER ARCHITECTURE
Lecture #10
Series Editor: Mark D. Hill, University of Wisconsin

Series ISSN
Synthesis Lectures on Computer Architecture
Print 1935-3235  Electronic 1935-3243
Synthesis Lectures on Computer Architecture

Editor
Mark D. Hill, University of Wisconsin

Synthesis Lectures on Computer Architecture publishes 50- to 100-page publications on topics pertaining to the science and art of designing, analyzing, selecting and interconnecting hardware components to create computers that meet functional, performance and cost goals. The scope will largely follow the purview of premier computer architecture conferences, such as ISCA, HPCA, MICRO, and ASPLOS.

Computer Architecture Performance Evaluation Methods
Lieven Eeckhout
2010

Introduction to Reconfigurable Supercomputing
Marco Lanzagorta, Stephen Bique, and Robert Rosenberg
2009

On-Chip Networks
Natalie Enright Jerger and Li-Shiuan Peh
2009

The Memory System: You Can’t Avoid It, You Can’t Ignore It, You Can’t Fake It
Bruce Jacob
2009

Fault Tolerant Computer Architecture
Daniel J. Sorin
2009

The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines
Luiz André Barroso and Urs Hölzle
2009

Computer Architecture Techniques for Power-Efficiency
Stefanos Kaxiras and Margaret Martonosi
2008
Chip Multiprocessor Architecture: Techniques to Improve Throughput and Latency
Kunle Olukotun, Lance Hammond, and James Laudon
2007

Transactional Memory
James R. Larus and Ravi Rajwar
2006

Quantum Computing for Computer Architects
Tzvetan S. Metodi and Frederic T. Chong
2006
Computer Architecture Performance Evaluation Methods
Lieven Eeckhout
Ghent University

SYNTHESIS LECTURES ON COMPUTER ARCHITECTURE #10

Morgan & Claypool Publishers
ABSTRACT

Performance evaluation is at the foundation of computer architecture research and development. Contemporary microprocessors are so complex that architects cannot design systems based on intuition and simple models only. Adequate performance evaluation methods are absolutely crucial to steer the research and development process in the right direction. However, rigorous performance evaluation is non-trivial as there are multiple aspects to performance evaluation, such as picking workloads, selecting an appropriate modeling or simulation approach, running the model and interpreting the results using meaningful metrics. Each of these aspects is equally important, and a performance evaluation method that lacks rigor in any of these crucial aspects may lead to inaccurate performance data and may drive research and development in a wrong direction.

The goal of this book is to present an overview of the current state-of-the-art in computer architecture performance evaluation, with a special emphasis on methods for exploring processor architectures. The book focuses on fundamental concepts and ideas for obtaining accurate performance data. The book covers various topics in performance evaluation, ranging from performance metrics, to workload selection, to various modeling approaches including mechanistic and empirical modeling. And because simulation is by far the most prevalent modeling technique, more than half the book’s content is devoted to simulation. The book provides an overview of the simulation techniques in the computer designer’s toolbox, followed by various simulation acceleration techniques including sampled simulation, statistical simulation, parallel simulation and hardware-accelerated simulation.

KEYWORDS
computer architecture, performance evaluation, performance metrics, workload characterization, analytical modeling, architectural simulation, sampled simulation, statistical simulation, parallel simulation, FPGA-accelerated simulation
Contents

Preface  xi
Acknowledgments  xv

1  Introduction  1
   1.1  Structure of computer architecture (r)evolution  1
   1.2  Importance of performance evaluation  3
   1.3  Book outline  3

2  Performance Metrics  5
   2.1  Single-threaded workloads  5
   2.2  Multi-threaded workloads  6
   2.3  Multiprogram workloads  7
        2.3.1  System throughput  8
        2.3.2  Average normalized turnaround time  9
        2.3.3  Comparison to prevalent metrics  9
        2.3.4  STP versus ANTT performance evaluation  11
   2.4  Average performance  11
        2.4.1  Harmonic and arithmetic average: Mathematical viewpoint  12
        2.4.2  Geometric average: Statistical viewpoint  13
        2.4.3  Final thought on averages  14
   2.5  Partial metrics  14

3  Workload Design  15
   3.1  From workload space to representative workload  15
        3.1.1  Reference workload  16
        3.1.2  Towards a reduced workload  17
   3.2  PCA-based workload design  18
        3.2.1  General framework  19
        3.2.2  Workload characterization  20
        3.2.3  Principal component analysis  23
        3.2.4  Cluster analysis  23
        3.2.5  Applications  24
   3.3  Plackett and Burman based workload design  27
   3.4  Limitations and discussion  29

4  Analytical Performance Modeling  31
   4.1  Empirical versus mechanistic modeling  31
   4.2  Empirical modeling  32
        4.2.1  Linear regression  32
        4.2.2  Non-linear and spline-based regression  35
        4.2.3  Neural networks  36
   4.3  Mechanistic modeling: interval modeling  38
        4.3.1  Interval model fundamentals  38
        4.3.2  Modeling I-cache and I-TLB misses  38
        4.3.3  Modeling branch mispredictions  39
        4.3.4  Modeling short back-end miss events using Little’s law  41
        4.3.5  Modeling long back-end miss events  42
        4.3.6  Miss event overlaps  43
        4.3.7  The overall model  44
        4.3.8  Input parameters to the model  44
        4.3.9  Predecessors to interval modeling  45
        4.3.10 Follow-on work  45
        4.3.11 Multiprocessor modeling  46
   4.4  Hybrid mechanistic-empirical modeling  46

5  Simulation  49
   5.1  The computer architect’s toolbox  49