Syllabus
Unit 1
Multi-core, Many-core, and GPU Architecture
- Introduction: overview of multi-core and many-core processors and their evolution.
- Introduction to GPU architecture: basics; differences from traditional CPUs; massive parallelism in GPUs and the role of CUDA cores and streaming multiprocessors.
- System calls and system structures: overview of system calls relevant to parallel programming; kernel-level system structures in multicore and GPU systems.
Unit 2
Parallel Computing Models and Programming
- Introduction to parallel computing: theory of parallelization; models of parallel computation (shared memory, distributed memory); design and analysis of parallel algorithms.
- Parallel programming language concepts: introduction to OpenMP, OpenCL, and MPI; introduction to CUDA programming (basic syntax, memory management, execution model).
- Thread, SIMD, and data-parallel programming: concepts of thread-level parallelism; SIMD (Single Instruction, Multiple Data) programming for efficient multicore computing; data-parallel programming techniques for high-performance applications.
- GPU programming with OpenCL and/or CUDA: hands-on introduction to CUDA programming for GPUs; OpenCL programming, including comparison with CUDA and use in heterogeneous systems; memory hierarchy in GPUs and optimization techniques for performance.
Unit 3
Advanced Topics in Parallel Algorithms and Optimization
- Non-blocking synchronization: techniques for non-blocking synchronization in multicore and many-core systems; atomic operations, locks, and lock-free programming.
- Scheduling and operating system issues for multicore: scheduling algorithms in multicore systems; operating system support for parallelism and performance.
- Introduction to heterogeneous multicore and parallel DSP architectures: heterogeneous multicore systems and how they combine CPUs and GPUs; concepts of parallel DSP (Digital Signal Processing) architecture and its applications; programming techniques for heterogeneous multicore systems.
Objectives and Outcomes
Course Objectives
- To introduce the fundamentals of the architecture and organization of multi-core processors and GPUs
- To introduce parallel computing models and programming languages such as CUDA
- To provide knowledge of parallel algorithms, data structures, and performance optimization
Course Outcomes
At the end of the course, the student should be able to:
CO1: understand multi-core and GPU architectures and parallel computing models
CO2: understand parallel programming languages, algorithms, models, and data structures
CO3: optimize performance using different parallel programming patterns and debugging techniques
CO4: work on the scalability and efficiency of systems built on multi-core, GPU, and GPGPU architectures
CO – PO Mapping
CO \ PO/PSO | PO1 | PO2 | PO3 | PO4 | PO5 | PO6 | PO7 | PO8 | PO9 | PO10 | PO11 | PO12 | PSO1 | PSO2 | PSO3 |
CO1         | 3   | 2   |     |     |     |     |     |     |     |      |      | 2    |      |      | 3    |
CO2         | 3   | 2   |     |     |     |     |     |     |     |      |      | 2    |      |      | 3    |
CO3         | 3   | 2   | 2   |     |     |     |     |     |     |      |      | 2    |      |      | 3    |
CO4         | 3   | 2   | 2   |     |     |     |     |     |     |      |      | 2    |      |      | 3    |
*PSO2 applies only to CCE.