Designed to be both a comprehensive reference and a practical cookbook, the text is divided into the following three parts: Part I, Overview, gives high-level descriptions of the hardware and software that make CUDA possible.
Part II, Details, provides thorough descriptions of every aspect of CUDA, including
Memory Streams and events Models of execution, including the dynamic parallelism feature, new with CUDA 5.0 and SM 3.5 The streaming multiprocessors, including descriptions of all features through SM 3.5 Programming multiple GPUs Texturing The source code accompanying Part II is presented as reusable microbenchmarks and microdemos, designed to expose specific hardware characteristics or highlight specific use cases.
Part III, Select Applications, details specific families of CUDA applications and key parallel algorithms, including Streaming workloads
Reduction Parallel prefix sum (Scan)
N-body Image Processing These algorithms cover the full range of potential CUDA applications.