[EUMaster4HPC] [Online] Fundamentals of Accelerated Computing with CUDA C/C++

Name: [EUMaster4HPC] [Online] Fundamentals of Accelerated Computing with CUDA C/C++
Start: 2024-03-04T09:00:00+01:00
End: 2024-03-05T13:00:00+01:00
Location: Online via Zoom

Mar 4, 2024, 9:00 AM → Mar 5, 2024, 1:00 PM Europe/Berlin

The course will be held online via Zoom. The participation link will be sent by e-mail to registered participants the day before the course.

Description

Date and Time

The course will be held online on March 4 and March 5, 2024, from 9:00 AM to 1:00 PM (Europe/Berlin) on both days.

Prerequisites

Basic C/C++ competency, including familiarity with variable types, loops, conditional statements, functions, and array manipulations
No previous knowledge of CUDA programming is assumed
A free NVIDIA developer account is required to access the course material. Please register before the training at https://courses.nvidia.com/join/.

Learning Objectives

At the conclusion of the workshop, participants will have an understanding of the fundamental tools and techniques for GPU-accelerating C/C++ applications with CUDA and be able to:

Write code to be executed by a GPU accelerator
Expose and express data and instruction-level parallelism in C/C++ applications using CUDA
Utilize CUDA-managed memory and optimize memory migration using asynchronous prefetching
Leverage command-line and visual profilers to guide your work
Utilize concurrent streams for instruction-level parallelism
Write GPU-accelerated CUDA C/C++ applications, or refactor existing CPU-only applications, using a profile-driven approach

Certification

Upon successful completion of the assessment at the end of the second day, participants will receive an NVIDIA DLI certificate to recognize their subject matter competency and support professional career growth.

Structure

Module 1 -- Accelerating Applications with CUDA C/C++

Writing, compiling, and running GPU code
Controlling the parallel thread hierarchy
Allocating and freeing memory for the GPU
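
The three Module 1 topics can be illustrated together in a short program. The following is only a minimal sketch, not taken from the course material: the kernel name `doubleElements`, the helper `runDoubleElements`, and the problem size are hypothetical choices made for illustration. It assumes a CUDA-capable GPU and the `nvcc` compiler.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Kernel: runs on the GPU. The grid-stride loop lets any launch
// configuration cover all n elements.
__global__ void doubleElements(int *a, int n)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    int stride = gridDim.x * blockDim.x;              // total threads in the grid
    for (int i = idx; i < n; i += stride)
        a[i] *= 2;
}

// Hypothetical helper: returns the number of wrong elements,
// or -1 if a CUDA call failed.
int runDoubleElements(int n)
{
    int *a = nullptr;

    // Allocate memory that is accessible from both CPU and GPU.
    if (cudaMallocManaged(&a, n * sizeof(int)) != cudaSuccess)
        return -1;

    for (int i = 0; i < n; ++i)
        a[i] = i;

    // Thread hierarchy: enough blocks of 256 threads to cover n elements.
    const int threadsPerBlock = 256;
    const int numBlocks = (n + threadsPerBlock - 1) / threadsPerBlock;
    doubleElements<<<numBlocks, threadsPerBlock>>>(a, n);

    // Kernel launches are asynchronous: check the launch, then wait for it.
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess)
        err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        cudaFree(a);
        return -1;
    }

    // Verify the result on the CPU.
    int errors = 0;
    for (int i = 0; i < n; ++i)
        if (a[i] != 2 * i)
            ++errors;

    cudaFree(a);  // free the managed allocation
    return errors;
}

int main()
{
    int errors = runDoubleElements(1 << 20);
    printf("mismatches: %d\n", errors);
    return errors == 0 ? 0 : 1;
}
```

Assuming the file is saved as `double.cu` (a hypothetical name), it could be compiled and run with `nvcc -o double double.cu && ./double`.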

Module 2 -- Managing Accelerated Application Memory with CUDA C/C++

Profiling CUDA code with the command-line profiler
Details on unified memory
Optimizing unified memory management
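
As a rough illustration of optimizing unified memory management, the sketch below prefetches a managed allocation to the GPU before a kernel runs and back to the CPU before the result is read. It is not taken from the course material: the names `addOne` and `runPrefetchExample` are hypothetical, and it assumes a Linux system, a Pascal-or-newer GPU, and the four-argument form of `cudaMemPrefetchAsync` found in CUDA 11.x and 12.x toolkits (newer toolkits may use a different signature).

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Kernel: adds 1 to every element using a grid-stride loop.
__global__ void addOne(float *x, int n)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    int stride = gridDim.x * blockDim.x;
    for (int i = idx; i < n; i += stride)
        x[i] += 1.0f;
}

// Hypothetical helper: returns the number of wrong elements,
// or -1 if a CUDA call failed.
int runPrefetchExample(int n)
{
    int device = 0;
    if (cudaGetDevice(&device) != cudaSuccess)
        return -1;

    float *x = nullptr;
    const size_t bytes = static_cast<size_t>(n) * sizeof(float);
    if (cudaMallocManaged(&x, bytes) != cudaSuccess)
        return -1;

    // First touch on the CPU: the pages initially reside in host memory.
    for (int i = 0; i < n; ++i)
        x[i] = 0.0f;

    // Migrate the data to the GPU up front instead of relying on page
    // faults inside the kernel. Prefetching is an optimization hint, so
    // its return value is deliberately ignored in this sketch.
    cudaMemPrefetchAsync(x, bytes, device, 0);

    const int threadsPerBlock = 256;
    const int numBlocks = (n + threadsPerBlock - 1) / threadsPerBlock;
    addOne<<<numBlocks, threadsPerBlock>>>(x, n);

    // Migrate the data back to the host before the CPU reads it.
    cudaMemPrefetchAsync(x, bytes, cudaCpuDeviceId, 0);

    if (cudaDeviceSynchronize() != cudaSuccess) {
        cudaFree(x);
        return -1;
    }

    int errors = 0;
    for (int i = 0; i < n; ++i)
        if (x[i] != 1.0f)
            ++errors;

    cudaFree(x);
    return errors;
}

int main()
{
    int errors = runPrefetchExample(1 << 20);
    printf("mismatches: %d\n", errors);
    return errors == 0 ? 0 : 1;
}
```

To see the effect of the prefetches, the compiled program could be run under the Nsight Systems command-line profiler, for example with `nsys profile --stats=true ./prefetch` (the executable name is hypothetical), once with and once without the two prefetch calls.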

Module 3 -- Asynchronous Streaming and Visual Profiling for Accelerated Applications with CUDA C/C++

Profiling CUDA code with NVIDIA Nsight Systems
Using concurrent CUDA streams
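
Concurrent CUDA streams can be sketched as follows: several kernels are launched into separate non-default streams so that the GPU is allowed to overlap them. This is only an illustrative sketch, not the course's own exercise. The names `fill` and `runStreamsExample`, the number of streams, and the chunk size are hypothetical, and whether the kernels actually overlap depends on the GPU and on how many resources each kernel occupies.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Kernel: writes `value` into every element of a chunk.
__global__ void fill(int *a, int n, int value)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    int stride = gridDim.x * blockDim.x;
    for (int i = idx; i < n; i += stride)
        a[i] = value;
}

// Hypothetical helper: fills numStreams chunks, one kernel per stream.
// Returns the number of wrong elements, or -1 if a CUDA call failed.
int runStreamsExample(int numStreams, int chunk)
{
    int *a = nullptr;
    const size_t bytes = static_cast<size_t>(numStreams) * chunk * sizeof(int);
    if (cudaMallocManaged(&a, bytes) != cudaSuccess)
        return -1;

    // Create one non-default stream per chunk.
    std::vector<cudaStream_t> streams(numStreams);
    for (int s = 0; s < numStreams; ++s) {
        if (cudaStreamCreate(&streams[s]) != cudaSuccess) {
            for (int t = 0; t < s; ++t)
                cudaStreamDestroy(streams[t]);
            cudaFree(a);
            return -1;
        }
    }

    // The fourth launch parameter selects the stream. Kernels in
    // different streams have no ordering between them and may overlap.
    const int threadsPerBlock = 256;
    const int numBlocks = (chunk + threadsPerBlock - 1) / threadsPerBlock;
    for (int s = 0; s < numStreams; ++s)
        fill<<<numBlocks, threadsPerBlock, 0, streams[s]>>>(
            a + static_cast<size_t>(s) * chunk, chunk, s);

    // Wait for the work in all streams to finish.
    cudaError_t err = cudaDeviceSynchronize();

    int errors = 0;
    if (err == cudaSuccess) {
        for (int s = 0; s < numStreams; ++s)
            for (int i = 0; i < chunk; ++i)
                if (a[static_cast<size_t>(s) * chunk + i] != s)
                    ++errors;
    }

    for (int s = 0; s < numStreams; ++s)
        cudaStreamDestroy(streams[s]);
    cudaFree(a);
    return err == cudaSuccess ? errors : -1;
}

int main()
{
    int errors = runStreamsExample(4, 1 << 18);
    printf("mismatches: %d\n", errors);
    return errors == 0 ? 0 : 1;
}
```

A timeline view in NVIDIA Nsight Systems is one way to check whether the kernels in the different streams actually ran concurrently.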

Program

The program can be found in the timetable for this event.

Language

The course will be held in English.

Instructor

Dr. Sebastian Kuckuk, certified NVIDIA DLI Ambassador.

The course is co-organized by NHR@FAU and the NVIDIA Deep Learning Institute (DLI).

Prices and Eligibility

The course is co-organized by EUMaster4HPC. Students of the EUMaster4HPC program at a participating university are given priority.

Remaining seats are open to other students and members of universities participating in the EUMaster4HPC program.

Withdrawal Policy

Please register for the course only if you intend to attend. No-shows will be blacklisted and excluded from future events. If you want to withdraw your registration, please send an e-mail to sebastian.kuckuk@fau.de.

Monday, March 4
- 1: Welcome and Introduction
- 2: Module 1 -- Accelerating Applications with CUDA C/C++
- 10:15 AM: Coffee Break
- 3: Module 1 continued
- 11:30 AM: Coffee Break
- 4: Module 2 -- Managing Accelerated Application Memory
Tuesday, March 5
- 5: Module 2 continued
- 10:15 AM: Coffee Break
- 6: Module 3 -- Asynchronous Streaming and Visual Profiling
- 11:30 AM: Coffee Break
- 7: Module 3 continued
- 8: Closing
