Date and Time
The course will be held online on February 9th, from 9 am to 5 pm.
Prerequisites
- Professional experience programming CUDA C/C++ applications, including the use of the NVCC compiler, kernel launches, grid-stride loops, host-to-device and device-to-host memory transfers, and CUDA error handling.
- Familiarity with the Linux command line.
- Experience using Makefiles to compile C/C++ code.
- A free NVIDIA developer account is required to access the course material. Please register prior to the training at https://courses.nvidia.com/join/.
Learning Objectives
At the conclusion of the workshop, you will be able to:
- Use several methods for writing multi-GPU CUDA C++ applications,
- Use a variety of multi-GPU communication patterns and understand their tradeoffs,
- Write portable, scalable CUDA code with the single-program multiple-data (SPMD) paradigm using CUDA-aware MPI and NVSHMEM,
- Improve multi-GPU SPMD code with NVSHMEM’s symmetric memory model and its ability to perform GPU-initiated data transfers, and
- Get practice with common multi-GPU coding paradigms like domain decomposition and halo exchanges.
Certification
Upon successful completion of all course assessments, participants will receive an NVIDIA DLI certificate to recognize their subject matter competency and support professional career growth.
Structure
Module 1 -- Multi-GPU Programming Paradigms
Survey multiple techniques for programming CUDA C++ applications for multiple GPUs, using a CUDA C++ program that computes a Monte Carlo approximation of Pi as the running example.
- Use CUDA to utilize multiple GPUs.
- Learn how to enable and use direct peer-to-peer memory communication.
- Write an SPMD version with CUDA-aware MPI.
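The first two bullets above can be sketched in a few lines of CUDA. The following is a minimal illustration, not course material: it loops over all visible devices, launches one kernel per GPU, and enables direct peer-to-peer access between devices 0 and 1 so a copy can bypass the host. The kernel name, array size, and device count cap are hypothetical.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__global__ void scale(float *data, int n, float factor) {
    // Grid-stride loop so any launch configuration covers the array.
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n;
         i += gridDim.x * blockDim.x)
        data[i] *= factor;
}

int main() {
    int numGpus = 0;
    cudaGetDeviceCount(&numGpus);

    const int N = 1 << 20;          // elements per GPU (hypothetical size)
    float *buf[16] = {nullptr};     // assumes at most 16 devices

    // One chunk of work per device: select the device, allocate, launch.
    for (int dev = 0; dev < numGpus; ++dev) {
        cudaSetDevice(dev);
        cudaMalloc(&buf[dev], N * sizeof(float));
        scale<<<256, 256>>>(buf[dev], N, 2.0f);
    }

    // Direct peer-to-peer: device 0 reads device 1's memory without
    // staging the data through host memory.
    if (numGpus >= 2) {
        int canAccess = 0;
        cudaDeviceCanAccessPeer(&canAccess, 0, 1);
        if (canAccess) {
            cudaSetDevice(0);
            cudaDeviceEnablePeerAccess(1, 0);
            cudaMemcpyPeer(buf[0], 0, buf[1], 1, N * sizeof(float));
        }
    }

    for (int dev = 0; dev < numGpus; ++dev) {
        cudaSetDevice(dev);
        cudaDeviceSynchronize();
        cudaFree(buf[dev]);
    }
    printf("done on %d GPU(s)\n", numGpus);
    return 0;
}
```

The SPMD/MPI variant covered in the module replaces the device loop with one process per GPU; with a CUDA-aware MPI, device pointers can be passed directly to calls such as MPI_Send and MPI_Recv.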
Module 2 -- Introduction to NVSHMEM
Learn how to write code with NVSHMEM and understand its symmetric memory model.
- Use NVSHMEM to write SPMD code for multiple GPUs.
- Utilize symmetric memory to let all GPUs access data on other GPUs.
- Make GPU-initiated memory transfers.
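As a rough illustration of the symmetric memory model and GPU-initiated transfers (a sketch, not the workshop's code): every PE allocates the same symmetric buffer with nvshmem_malloc, and a kernel on each PE writes its ID directly into the next PE's buffer with a device-side put.

```cuda
#include <cstdio>
#include <nvshmem.h>
#include <nvshmemx.h>

// Each PE (one GPU per process) writes its own ID into the symmetric
// buffer on the next PE, using a GPU-initiated put (nvshmem_int_p).
__global__ void exchange(int *sym_buf, int my_pe, int n_pes) {
    if (threadIdx.x == 0 && blockIdx.x == 0) {
        int peer = (my_pe + 1) % n_pes;
        nvshmem_int_p(sym_buf, my_pe, peer);  // write into peer's memory
    }
}

int main() {
    nvshmem_init();
    int my_pe = nvshmem_my_pe();
    int n_pes = nvshmem_n_pes();
    cudaSetDevice(nvshmem_team_my_pe(NVSHMEMX_TEAM_NODE));

    // Symmetric memory: the same allocation exists on every PE,
    // which is what makes remote puts by PE index possible.
    int *sym_buf = (int *) nvshmem_malloc(sizeof(int));

    exchange<<<1, 1>>>(sym_buf, my_pe, n_pes);
    cudaDeviceSynchronize();
    nvshmem_barrier_all();  // all puts have completed on all PEs

    int received;
    cudaMemcpy(&received, sym_buf, sizeof(int), cudaMemcpyDeviceToHost);
    printf("PE %d received %d\n", my_pe, received);

    nvshmem_free(sym_buf);
    nvshmem_finalize();
    return 0;
}
```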
Module 3 -- Halo Exchanges with NVSHMEM
Practice common coding motifs like halo exchanges and domain decomposition using NVSHMEM, and work on the assessment.
- Write an NVSHMEM implementation of a Laplace equation Jacobi solver.
- Refactor a single GPU 1D wave equation solver with NVSHMEM.
- Complete the assessment and earn a certificate.
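The halo-exchange motif practiced in this module can be outlined as follows. This is a hedged sketch with a hypothetical 1D decomposition and periodic neighbors: each PE owns N interior points plus two halo cells, and a kernel pushes the PE's boundary values into its neighbors' halos with GPU-initiated puts before each stencil update.

```cuda
#include <nvshmem.h>
#include <nvshmemx.h>

#define N 1024  // interior points per PE (hypothetical size)

// u holds N interior points (indices 1..N) plus halo cells u[0] and
// u[N+1]. Push this PE's boundary values into its neighbors' halos.
__global__ void halo_exchange(float *u, int my_pe, int n_pes) {
    if (threadIdx.x == 0 && blockIdx.x == 0) {
        int left  = (my_pe - 1 + n_pes) % n_pes;  // periodic neighbors
        int right = (my_pe + 1) % n_pes;
        nvshmem_float_p(&u[N + 1], u[1], left);   // left neighbor's right halo
        nvshmem_float_p(&u[0],     u[N], right);  // right neighbor's left halo
    }
}

int main() {
    nvshmem_init();
    int my_pe = nvshmem_my_pe();
    int n_pes = nvshmem_n_pes();
    cudaSetDevice(nvshmem_team_my_pe(NVSHMEMX_TEAM_NODE));

    float *u = (float *) nvshmem_malloc((N + 2) * sizeof(float));
    // ... initialize u ...

    // One solver iteration: exchange halos, then update the interior.
    halo_exchange<<<1, 1>>>(u, my_pe, n_pes);
    cudaDeviceSynchronize();
    nvshmem_barrier_all();  // halos are now consistent on every PE
    // jacobi_step<<<...>>>(u);  // stencil may safely read u[0..N+1]

    nvshmem_free(u);
    nvshmem_finalize();
    return 0;
}
```

The Jacobi and wave-equation solvers in the module follow this pattern: decompose the domain across PEs, exchange halos, update, repeat.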
Program
The program can be found here.
Language
The course will be held in English.
Instructor
Dr. Sebastian Kuckuk, certified NVIDIA DLI Ambassador.
The course is co-organised by NHR@FAU and the NVIDIA Deep Learning Institute (DLI).
Prices and Eligibility
The course is internal and only open to members of NHR@FAU and the Chair of Computer Science 10 at FAU.
Withdrawal Policy
Please only register for the course if you are actually going to attend. No-shows will be blacklisted and excluded from future events. To withdraw your registration, please send an e-mail to sebastian.kuckuk@fau.de.
Wait List
To be added to the wait list after the course has reached its maximum number of registrations, send an e-mail to sebastian.kuckuk@fau.de with your name and university affiliation.