[NHR@FAU Internal] [Online] Scaling CUDA C++ Applications to Multiple Nodes

The Zoom link will be provided to registered participants on the day before the event.
Description

Date and Time

The course will be held online on February 9th, from 9 am to 5 pm (all times Europe/Berlin).

 

Prerequisites

  • Professional experience programming CUDA C/C++ applications, including the use of the NVCC compiler, kernel launches, grid-stride loops, host-to-device and device-to-host memory transfers, and CUDA error handling (a short refresher sketch follows this list).
  • Familiarity with the Linux command line.
  • Experience using Makefiles to compile C/C++ code.
  • A free NVIDIA developer account is required to access the course material. Please register prior to the training at https://courses.nvidia.com/join/.
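
For a quick self-check against the first prerequisite, the following minimal sketch (not part of the course material; all names are illustrative) combines the listed techniques: a grid-stride loop kernel, host-to-device and device-to-host transfers, and CUDA error handling. It should compile with a plain nvcc invocation.

    // Illustrative refresher only -- not course code.
    #include <cstdio>
    #include <cstdlib>
    #include <cuda_runtime.h>

    // Standard error-checking macro around CUDA runtime calls.
    #define CUDA_CHECK(call)                                            \
        do {                                                            \
            cudaError_t err_ = (call);                                  \
            if (err_ != cudaSuccess) {                                  \
                fprintf(stderr, "CUDA error %s at %s:%d\n",             \
                        cudaGetErrorString(err_), __FILE__, __LINE__);  \
                exit(EXIT_FAILURE);                                     \
            }                                                           \
        } while (0)

    // Grid-stride loop: each thread processes multiple elements, so the
    // kernel works for any array size and launch configuration.
    __global__ void scale(float *x, float a, int n) {
        for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n;
             i += gridDim.x * blockDim.x)
            x[i] *= a;
    }

    int main() {
        const int n = 1 << 20;
        float *h = (float *)malloc(n * sizeof(float));
        for (int i = 0; i < n; ++i) h[i] = 1.0f;

        float *d;
        CUDA_CHECK(cudaMalloc(&d, n * sizeof(float)));
        CUDA_CHECK(cudaMemcpy(d, h, n * sizeof(float), cudaMemcpyHostToDevice));
        scale<<<256, 256>>>(d, 2.0f, n);
        CUDA_CHECK(cudaGetLastError());
        CUDA_CHECK(cudaMemcpy(h, d, n * sizeof(float), cudaMemcpyDeviceToHost));
        printf("h[0] = %f (expected 2.0)\n", h[0]);
        CUDA_CHECK(cudaFree(d));
        free(h);
        return 0;
    }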

 

Learning Objectives

At the conclusion of the workshop, you will be able to:

  • Use several methods for writing multi-GPU CUDA C++ applications,
  • Use a variety of multi-GPU communication patterns and understand their tradeoffs,
  • Write portable, scalable CUDA code with the single-program multiple-data (SPMD) paradigm using CUDA-aware MPI and NVSHMEM (a minimal CUDA-aware MPI sketch follows this list),
  • Improve multi-GPU SPMD code with NVSHMEM’s symmetric memory model and its ability to perform GPU-initiated data transfers, and
  • Get practice with common multi-GPU coding paradigms like domain decomposition and halo exchanges.
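
As a concrete preview of the SPMD objective, here is a hedged sketch of the pattern: one MPI rank per GPU, with device pointers passed directly to MPI calls. This requires an MPI library built with CUDA support; the ring exchange and buffer size are illustrative choices, not course code.

    #include <cstdio>
    #include <mpi.h>
    #include <cuda_runtime.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);
        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        // Simplified one-GPU-per-rank mapping (real codes map by local rank).
        int num_gpus;
        cudaGetDeviceCount(&num_gpus);
        cudaSetDevice(rank % num_gpus);

        const int n = 1024;
        float *d_send, *d_recv;
        cudaMalloc(&d_send, n * sizeof(float));
        cudaMalloc(&d_recv, n * sizeof(float));
        cudaMemset(d_send, 0, n * sizeof(float));

        // Ring exchange: a CUDA-aware MPI accepts the device pointers
        // directly, with no staging through host memory.
        int next = (rank + 1) % size, prev = (rank + size - 1) % size;
        MPI_Sendrecv(d_send, n, MPI_FLOAT, next, 0,
                     d_recv, n, MPI_FLOAT, prev, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);

        printf("rank %d exchanged %d floats\n", rank, n);
        cudaFree(d_send);
        cudaFree(d_recv);
        MPI_Finalize();
        return 0;
    }

Built with the usual MPI compiler wrapper plus nvcc and launched with mpirun, every rank executes the same program -- the essence of SPMD.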

 

Certification

Upon successful completion of all course assessments, participants will receive an NVIDIA DLI certificate to recognize their subject matter competency and support professional career growth.

 

Structure

Module 1 -- Multi-GPU Programming Paradigms

Survey multiple techniques for programming multi-GPU CUDA C++ applications, using a CUDA C++ Monte Carlo approximation of Pi as the running example (a peer-to-peer sketch follows the list below).

  • Use CUDA to utilize multiple GPUs.
  • Learn how to enable and use direct peer-to-peer memory communication.
  • Write an SPMD version with CUDA-aware MPI.
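
To illustrate the peer-to-peer bullet, a minimal single-process sketch follows (illustrative only; it assumes GPUs 0 and 1 are P2P-capable, e.g. connected via NVLink or a common PCIe root):

    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
        // Check whether the two devices can access each other directly.
        int can01 = 0, can10 = 0;
        cudaDeviceCanAccessPeer(&can01, 0, 1);
        cudaDeviceCanAccessPeer(&can10, 1, 0);
        if (!can01 || !can10) {
            printf("no P2P path between GPU 0 and GPU 1\n");
            return 0;
        }

        // Enable peer access in both directions.
        cudaSetDevice(0);
        cudaDeviceEnablePeerAccess(1, 0);
        cudaSetDevice(1);
        cudaDeviceEnablePeerAccess(0, 0);

        const size_t bytes = 1 << 20;
        float *d0, *d1;
        cudaSetDevice(0); cudaMalloc(&d0, bytes);
        cudaSetDevice(1); cudaMalloc(&d1, bytes);

        // With peer access enabled, this copy moves data directly from
        // GPU 0 to GPU 1 without staging through host memory.
        cudaMemcpyPeer(d1, 1, d0, 0, bytes);
        cudaDeviceSynchronize();
        printf("copied %zu bytes GPU 0 -> GPU 1\n", bytes);

        cudaFree(d1);
        cudaSetDevice(0);
        cudaFree(d0);
        return 0;
    }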

Module 2 -- Introduction to NVSHMEM

Learn how to write code with NVSHMEM and understand its symmetric memory model (a minimal sketch follows the list below).

  • Use NVSHMEM to write SPMD code for multiple GPUs.
  • Utilize symmetric memory to let all GPUs access data on other GPUs.
  • Make GPU-initiated memory transfers.
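
The following hedged sketch shows both ideas at once, assuming a recent NVSHMEM release and the simplified one-GPU-per-PE mapping used in NVSHMEM's own samples: nvshmem_malloc returns a symmetric allocation of the same size on every PE, and a kernel performs a GPU-initiated put into the next PE's copy.

    #include <cstdio>
    #include <cuda_runtime.h>
    #include <nvshmem.h>
    #include <nvshmemx.h>

    __global__ void put_my_pe(int *dest, int mype, int npes) {
        // GPU-initiated transfer: write my PE id into the symmetric
        // buffer of the next PE, with no host involvement.
        if (threadIdx.x == 0)
            nvshmem_int_p(dest, mype, (mype + 1) % npes);
    }

    int main() {
        nvshmem_init();
        int mype = nvshmem_my_pe();
        int npes = nvshmem_n_pes();
        // One GPU per PE on each node.
        cudaSetDevice(nvshmem_team_my_pe(NVSHMEMX_TEAM_NODE));

        // Symmetric allocation: same size on every PE, remotely accessible.
        int *dest = (int *)nvshmem_malloc(sizeof(int));

        put_my_pe<<<1, 32>>>(dest, mype, npes);
        cudaDeviceSynchronize();
        nvshmem_barrier_all();  // all puts are complete after this

        int received;
        cudaMemcpy(&received, dest, sizeof(int), cudaMemcpyDeviceToHost);
        printf("PE %d received %d\n", mype, received);

        nvshmem_free(dest);
        nvshmem_finalize();
        return 0;
    }

Such programs are typically compiled with nvcc -rdc=true, linked against the NVSHMEM library, and launched with nvshmrun or an MPI launcher.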

Module 3 -- Halo Exchanges with NVSHMEM

Practice common coding motifs like halo exchanges and domain decomposition using NVSHMEM, and work on the assessment (a halo-exchange sketch follows this list).

  • Write an NVSHMEM implementation of a Laplace equation Jacobi solver.
  • Refactor a single-GPU 1D wave equation solver to use NVSHMEM.
  • Complete the assessment and earn a certificate.
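
As a preview of the halo-exchange motif, here is a hedged 1D sketch (illustrative only; periodic boundaries and the single-cell halo are simplifying assumptions): each PE owns a block of the domain plus one halo cell per side and pushes its boundary values into its neighbours' halo cells after every sweep.

    #include <cstdio>
    #include <cuda_runtime.h>
    #include <nvshmem.h>
    #include <nvshmemx.h>

    int main() {
        nvshmem_init();
        int mype = nvshmem_my_pe();
        int npes = nvshmem_n_pes();
        cudaSetDevice(nvshmem_team_my_pe(NVSHMEMX_TEAM_NODE));

        // Layout per PE: [left halo | n_local interior cells | right halo]
        const int n_local = 1024;
        float *u = (float *)nvshmem_malloc((n_local + 2) * sizeof(float));
        cudaMemset(u, 0, (n_local + 2) * sizeof(float));

        int left  = (mype - 1 + npes) % npes;
        int right = (mype + 1) % npes;

        for (int iter = 0; iter < 100; ++iter) {
            // ... launch the Jacobi update kernel on u[1..n_local] here ...
            cudaDeviceSynchronize();

            // Push boundary cells into the neighbours' halo slots:
            // my first interior cell becomes the left neighbour's right halo,
            // my last interior cell becomes the right neighbour's left halo.
            nvshmem_float_put(&u[n_local + 1], &u[1], 1, left);
            nvshmem_float_put(&u[0], &u[n_local], 1, right);
            nvshmem_barrier_all();  // halos are in place before the next sweep
        }

        nvshmem_free(u);
        nvshmem_finalize();
        return 0;
    }

The exercises in this module apply the same pattern to the Laplace and wave equation solvers listed above.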

 

Program

The detailed timetable can be found at the end of this page.

 

Language

The course will be held in English.

 

Instructor

Dr. Sebastian Kuckuk, certified NVIDIA DLI Ambassador.

The course is co-organised by NHR@FAU and the NVIDIA Deep Learning Institute (DLI).

 

Prices and Eligibility

The course is internal and only open to members of NHR@FAU and of the Chair of Computer Science 10 at FAU.

 

Withdrawal Policy

Please register for the course only if you actually plan to attend. No-shows will be blacklisted and excluded from future events. To withdraw your registration, please send an e-mail to sebastian.kuckuk@fau.de.

 

Wait List

To be added to the wait list after the course has reached its maximum number of registrations, please send an e-mail to sebastian.kuckuk@fau.de with your name and university affiliation.

Timetable

    • 9:00 AM – 9:15 AM    Welcome and Introduction (15m)
    • 9:15 AM – 10:15 AM   Module 1 -- Multi-GPU Programming Paradigms (1h)
    • 10:15 AM – 10:30 AM  Coffee Break (15m)
    • 10:30 AM – 11:30 AM  Module 1 continued (1h)
    • 11:30 AM – 12:30 PM  Module 2 -- Introduction to NVSHMEM (1h)
    • 12:30 PM – 1:30 PM   Lunch Break (1h)
    • 1:30 PM – 2:30 PM    Module 2 continued (1h)
    • 2:30 PM – 2:45 PM    Coffee Break (15m)
    • 2:45 PM – 4:45 PM    Module 3 -- Halo Exchanges with NVSHMEM (2h)
    • 4:45 PM – 5:00 PM    Closing (15m)