Syllabus - Parallel Computing (IT 803 (D))
Branch: Information Technology
Subject: Parallel Computing (IT 803 (D))
Semester: VIII
Unit I
Introduction
The need for parallelism, Forms of parallelism (SISD, SIMD, MISD, MIMD), Moore's Law and Multi-cores, Fundamentals of Parallel Computers, Communication architecture, Message passing architecture, Data parallel architecture, Dataflow architecture, Systolic architecture, Performance Issues
Unit II
Large Cache Design
Shared vs. Private Caches, Centralized vs. Distributed Shared Caches, Snooping-based cache coherence protocols, Directory-based cache coherence protocols, Uniform Cache Access, Non-Uniform Cache Access, D-NUCA, S-NUCA, Inclusion, Exclusion, Difference between transactions and transactional memory, STM, HTM
Unit III
Graphics Processing Unit
GPUs as Parallel Computers, Architecture of a modern GPU, Evolution of Graphics Pipelines, GPGPUs, Scalable GPUs, Architectural characteristics of Future Systems, Implication of Technology and Architecture for users, Vector addition, Applications of GPU
Unit IV
Introduction to Parallel Programming
Strategies, Mechanism, Performance theory, Parallel Programming Patterns: Nesting Pattern, Parallel Control Patterns, Parallel Data Management, Map: Scaled Vector, Mandelbrot; Collectives: Reduce, Fusing Map and Reduce, Scan, Fusing Map and Scan; Data Reorganization: Gather, Scatter, Pack, Stencil and Recurrence, Fork-Join, Pipeline
Unit V
Parallel Programming Languages
Distributed Memory Programming with MPI: trapezoidal rule in MPI, I/O handling, MPI derived datatypes, Collective Communication, Shared Memory Programming with Pthreads: Condition Variables, Read-write locks, Cache handling, Shared Memory Programming with OpenMP: Parallel for directives, Scheduling loops, Thread Safety, CUDA: Parallel programming in CUDA C, Thread management, Constant memory and Events, Graphics Interoperability, Atomics, Streams
Course Objective
To develop an understanding of the fundamental principles and engineering trade-offs involved in designing modern parallel computers, and to develop the programming skills needed to implement programs effectively on parallel architectures
Course Outcome
["To develop an understanding of various basic concepts associated with parallel computing environments", "Understand, appreciate and apply parallel and distributed algorithms in problem solving", "Acquire skills to measure the performance of parallel and distributed programs", "Design parallel programs to enhance machine performance in parallel hardware environment", "Design and implement parallel programs in modern environments such as CUDA, OpenMP, etc"]
Practicals
Reference Books
- D. E. Culler, J. P. Singh, and A. Gupta, “Parallel Computer Architecture”, Morgan Kaufmann, 2004
- Rajeev Balasubramonian, Norman P. Jouppi, and Naveen Muralimanohar, “Multi-Core Cache Hierarchies”, Morgan & Claypool Publishers, 2011
- Peter S. Pacheco, “An Introduction to Parallel Programming”, Elsevier, 2011
- James R. Larus and Ravi Rajwar, “Transactional Memory”, Morgan & Claypool Publishers, 2007
- David B. Kirk, Wen-mei W. Hwu, “Programming Massively Parallel Processors: A Hands-on Approach”, 2010
- Barbara Chapman, F. Desprez, Gerhard R. Joubert, Alain Lichnewsky, Frans Peters, “Parallel Computing: From Multicores and GPU's to Petascale”, 2010
- Michael McCool, James Reinders, Arch Robison, “Structured Parallel Programming: Patterns for Efficient Computation”, 2012
- Jason Sanders, Edward Kandrot, “CUDA by Example: An Introduction to General-Purpose GPU Programming”, 2011