Heterogeneous Computing with OpenCL 2011.pdf

Published: 2022-06-14 | Uploaded by: admin | Category: Manuals | File size: 5.21M | Format: pdf

Front Cover

Heterogeneous Computing with OpenCL

Contents

Foreword

Preface

Our Heterogeneous World

OpenCL

This Text

Acknowledgments

About the Authors

Chapter 1: Introduction to Parallel Programming

Introduction

OpenCL

The Goals of This Book

Thinking Parallel

Concurrency and Parallel Programming Models

Structure

Reference

Further Reading and Relevant Websites

Index

Heterogeneous Computing with OpenCL

Heterogeneous Computing with OpenCL

Benedict Gaster, Lee Howes, David R. Kaeli, Perhaad Mistry, Dana Schaa

Acquiring Editor: Todd Green
Development Editor: Robyn Day
Project Manager: André Cuello
Designer: Joanne Blank

Morgan Kaufmann is an imprint of Elsevier
225 Wyman Street, Waltham, MA 02451, USA

© 2012 Advanced Micro Devices, Inc. Published by Elsevier Inc. All rights reserved.

No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher’s permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions.

This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein).

Notices

Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods or professional practices may become necessary. Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information or methods described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility.

To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of product liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.

Library of Congress Cataloging-in-Publication Data

Heterogeneous computing with OpenCL / Benedict Gaster ... [et al.].
p. cm.
ISBN 978-0-12-387766-6
1. Parallel programming (Computer science) 2. OpenCL (Computer program language) I. Gaster, Benedict.
QA76.642.H48 2012
005.2’752–dc23
2011020169

British Library Cataloguing-in-Publication Data

A catalogue record for this book is available from the British Library.

ISBN: 978-0-12-387766-6

For information on all MK publications visit our website at www.mkp.com

Printed in the United States of America
12 13 14 15 10 9 8 7 6 5 4 3 2 1

Contents

Foreword
Preface
Acknowledgments
About the Authors
Chapter 1: Introduction to Parallel Programming
Chapter 2: Introduction to OpenCL
Chapter 3: OpenCL Device Architectures
Chapter 4: Basic OpenCL Examples
Chapter 5: Understanding OpenCL’s Concurrency and Execution Model
Chapter 6: Dissecting a CPU/GPU OpenCL Implementation
Chapter 7: OpenCL Case Study: Convolution
Chapter 8: OpenCL Case Study: Video Processing
Chapter 9: OpenCL Case Study: Histogram
Chapter 10: OpenCL Case Study: Mixed Particle Simulation
Chapter 11: OpenCL Extensions
Chapter 12: OpenCL Profiling and Debugging
Chapter 13: WebCL (this special section contributed by Jari Nikara, Tomi Aarnio, Eero Aho, and Janne Pietiäinen)

Foreword

For more than two decades, the computer industry has been inspired and motivated by the observation made by Gordon Moore (a.k.a. “Moore’s law”) that the density of transistors on a die was doubling every 18 months. This observation created the anticipation that the performance a given application achieves on one generation of processors would double within two years, when the next generation of processors was announced. Constant improvement in manufacturing and processor technologies was the main driver of this trend, since it allowed each new processor generation to shrink all of a transistor’s dimensions by the “golden factor”, 0.3 (ideal shrink), and to reduce the power supply accordingly. Thus, each new processor generation could double the density of transistors and gain a 50% speed improvement (frequency) while consuming the same power and keeping the same power density. When better performance was required, computer architects focused on using the extra transistors to push the frequency beyond what the shrink provided, and to add new architectural features aimed mainly at improving performance for existing and new applications.

During the mid-2000s, transistors became so small that the “physics of small devices” started to govern the characterization of the entire chip. Thus, frequency improvement and density increase could no longer be achieved without a significant increase in power consumption and power density. A recent report by the International Technology Roadmap for Semiconductors (ITRS) supports this observation and indicates that this trend will continue for the foreseeable future and will most likely become the most significant factor affecting technology scaling and the future of computer-based systems.
To cope with the expectation of doubling performance every known period of time (no longer two years), two major changes happened. (1) Instead of increasing the frequency, modern processors increase the number of cores on each die. This trend forces the software to change as well: since we can no longer expect the hardware to achieve significantly better performance for a given application, we need to develop new implementations of the same application that take advantage of the multicore architecture. (2) Thermal and power considerations have become first-class citizens in the design of any future architecture.

These trends encourage the community to start looking at heterogeneous solutions: systems assembled from different subsystems, each optimized to achieve a different optimization point or to address a different workload. For example, many systems combine a “traditional” CPU architecture with special-purpose FPGAs or graphics processors (GPUs). Such integration can be done at different levels, e.g., at the system level, at the board level, and recently at the core level.

Developing software for homogeneous parallel and distributed systems is considered a non-trivial task, even though such development uses well-known paradigms and well-established programming languages, development methods, algorithms, debugging tools, etc. Developing software to support general-purpose
