Using the new OpenCL (Open Computing Language) standard, you can write applications that access all available programming resources: CPUs, GPUs, and other processors such as DSPs and the Cell/B.E. processor. Already implemented by Apple, AMD, Intel, IBM, NVIDIA, and other leaders, OpenCL has outstanding potential for PCs, servers, handheld/embedded devices, high performance computing, and even cloud systems. This is the first comprehensive, authoritative, and practical guide to OpenCL 1.1 specifically for working developers and software architects.
Written by five leading OpenCL authorities, OpenCL Programming Guide covers the entire specification. It reviews key use cases, shows how OpenCL can express a wide range of parallel algorithms, and offers complete reference material on both the API and OpenCL C programming language.
Through complete case studies and downloadable code examples, the authors show how to write complex parallel programs that decompose workloads across many different devices. They also present all the essentials of OpenCL software performance optimization, including probing and adapting to hardware. Coverage includes
- Understanding OpenCL's architecture, concepts, terminology, goals, and rationale
- Programming with OpenCL C and the runtime API
- Using buffers, sub-buffers, images, samplers, and events
- Sharing and synchronizing data with OpenGL and Microsoft's Direct3D
- Simplifying development with the C++ Wrapper API
- Using OpenCL Embedded Profiles to support devices ranging from cellphones to supercomputer nodes
- Case studies dealing with physics simulation; image and signal processing, such as image histograms, edge detection filters, Fast Fourier Transforms, and optical flow; math libraries, such as matrix multiplication and high-performance sparse matrix multiplication; and more
- Source code for this book is available at https://code.google.com/p/
opencl-book-samples/
Autorentext
Aaftab Munshi is the spec editor for the OpenGL ES 1.1, OpenGL ES 2.0, and OpenCL specifications and coauthor of the book OpenGL ES 2.0 Programming Guide (with Dan Ginsburg and Dave Shreiner, published by Addison-Wesley, 2008). He currently works at Apple.
Benedict R. Gaster is a software architect working on programming models for next-generation heterogeneous processors, in particular looking at high-level abstractions for parallel programming on the emerging class of processors that contain both CPUs and accelerators such as GPUs. Benedict has contributed extensively to the OpenCL's design and has represented AMD at the Khronos Group open standard consortium. Benedict has a Ph.D. in computer science for his work on type systems for extensible records and variants. He has been working at AMD since 2008.
Timothy G. Mattson is an old-fashioned parallel programmer, having started in the mid-eighties with the Caltech Cosmic Cube and continuing to the present. Along the way, he has worked with most classes of parallel computers (vector supercomputers, SMP, VLIW, NUMA, MPP, clusters, and many-core processors). Tim has published extensively, including the books Patterns for Parallel Programming (with Beverly Sanders and Berna Massingill, published by Addison-Wesley, 2004) and An Introduction to Concurrency in Programming Languages (with Matthew J. Sottile and Craig E. Rasmussen, published by CRC Press, 2009). Tim has a Ph.D. in chemistry for his work on molecular scattering theory. He has been working at Intel since 1993.
James Fung has been developing computer vision on the GPU as it progressed from graphics to general-purpose computation. James has a Ph.D. in electrical and computer engineering from the University of Toronto and numerous IEEE and ACM publications in the areas of parallel GPU Computer Vision and Mediated Reality. He is currently a Developer Technology Engineer at NVIDIA, where he examines computer vision and image processing on graphics hardware.
Dan Ginsburg currently works at Children's Hospital Boston as a Principal Software Architect in the Fetal-Neonatal Neuroimaging and Development Science Center, where he uses OpenCL for accelerating neuroimaging algorithms. Previously, he worked for Still River Systems developing GPU-accelerated image registration software for the Monarch 250 proton beam radiotherapy system. Dan was also Senior Member of Technical Staff at AMD, where he worked for over eight years in a variety of roles, including developing OpenGL drivers, creating desktop and hand-held 3D demos, and leading the development of handheld GPU developer tools. Dan holds a B.S. in computer science from Worcester Polytechnic Institute and an M.B.A. from Bentley University.
Inhalt
Figures xv
Tables xxi
Listings xxv
Foreword xxix
Preface xxxiii
Acknowledgments xli
About the Authors xliii
Part I: The OpenCL 1.1 Language and API 1
Chapter 1: An Introduction to OpenCL 3
What Is OpenCL, or . . . Why You Need This Book 3
Our Many-Core Future: Heterogeneous Platforms 4
Software in a Many-Core World 7
Conceptual Foundations of OpenCL 11
OpenCL and Graphics 29
The Contents of OpenCL 30
The Embedded Profile 35
Learning OpenCL 36
Chapter 2: HelloWorld: An OpenCL Example 39
Building the Examples 40
HelloWorld Example 45
Checking for Errors in OpenCL 57
Chapter 3: Platforms, Contexts, and Devices 63
OpenCL Platforms 63
OpenCL Devices 68
OpenCL Contexts 83
Chapter 4: Programming with OpenCL C 97
Writing a Data-Parallel Kernel Using OpenCL C 97
Scalar Data Types 99
Vector Data Types 102
Other Data Types 108
Derived Types 109
Implicit Type Conversions 110
Explicit Casts 116
Explicit Conversions 117
Reinterpreting Data as Another Type 121
Vector Operators 123
Qualifiers 133
Keywords 141
Preprocessor Directives and Macros 141
Restrictions 146
Chapter 5: OpenCL C Built-In Functions 149
Work-Item Functions 150
Math Functions 153
Integer Functions 168
Common Functions 172
Geometric Functions 175
Relational Functions 175
Vector Data Load and Store Functions 181
Synchronization Functions 190
Async Copy and Prefetch Functions 191
Atomic Functions 195
Miscellaneous Vector Functions 199
Image Read and Write Functions 201
Chapter 6: Programs and Kernels 217
Program and Kernel Object Overview 217
Program Objects 218
Kernel Objects 237
Chapter 7: Buffers and Sub-Buffers 247
Memory Objects, Buffers, and Sub-Buffers Overview 247
Creating Buffers and Sub-Buffers 249
Querying Buffers and Sub-Buffers 257
Reading, Writing, and Copying Buffers and Sub-Buffers 259
Mapping Buffers and Sub-Buffers 276
Chapter 8: Images and Samplers 281
Image and Sampler Object Overview 281
Creating Image Objects 283
Creating Sampler Objects 292
OpenCL C Functions for Working with Images 295
Transferring Image Objects 299
Chapter 9: Events 309
Commands, Queues, and Events Overview 309
Events and Command-Queues 311
Event Objects 317
Generating Events on the Host 321
Events Impacting Execution on the Host 322
Using Events for Profiling 327
Events Inside Kernels 332
Events from Outside OpenCL 333
Chapter 10: Interoperability with OpenGL 335
OpenCL/OpenGL Sharing Overview 335
Querying for the OpenGL Shari…