Description
Learn the significance of stochastic gradient descent when training on multiple GPUs
- Understand the issues with sequential, single-threaded data processing and the theory behind speeding up applications with parallel processing.
- Understand loss function, gradient descent, and stochastic gradient descent (SGD).
- Understand the effect of batch size on accuracy and training time, with an eye toward its use on multi-GPU systems (see the sketch after this list).
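As a concrete point of reference for the last two objectives, here is a minimal mini-batch SGD sketch in plain NumPy. It is illustrative only and not the course's actual material: the linear model, synthetic data, learning rate, and epoch count are all assumptions chosen for clarity. It shows the mean-squared-error loss, the gradient-descent update, and how `batch_size` trades noisier-but-more-frequent updates against smoother-but-fewer ones, the trade-off that matters when scaling batches across GPUs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic linear data: y = 3x + 1 + noise (illustrative only)
X = rng.uniform(-1.0, 1.0, size=1024)
y = 3.0 * X + 1.0 + 0.1 * rng.standard_normal(1024)

def mse_loss(w, b):
    """Mean-squared-error loss over the full dataset."""
    return np.mean((X * w + b - y) ** 2)

def sgd(batch_size, lr=0.1, epochs=20):
    """Mini-batch SGD: each update estimates the gradient
    from `batch_size` randomly drawn samples."""
    w, b = 0.0, 0.0
    n = len(X)
    for _ in range(epochs):
        idx = rng.permutation(n)            # reshuffle each epoch
        for start in range(0, n, batch_size):
            batch = idx[start:start + batch_size]
            xb, yb = X[batch], y[batch]
            err = xb * w + b - yb           # prediction error on the batch
            grad_w = 2.0 * np.mean(err * xb)  # d(loss)/dw
            grad_b = 2.0 * np.mean(err)       # d(loss)/db
            w -= lr * grad_w                  # gradient-descent step
            b -= lr * grad_b
    return w, b

# Larger batches mean fewer (though less noisy) updates per epoch,
# so for a fixed epoch budget the final loss typically differs --
# the core batch-size effect the course examines on multi-GPU systems.
for bs in (8, 64, 512):
    w, b = sgd(batch_size=bs)
    print(f"batch_size={bs:4d}  w={w:.3f}  b={b:.3f}  loss={mse_loss(w, b):.5f}")
```

Running the loop at several batch sizes makes the trade-off visible: with the step count fixed by the epoch budget, the largest batch takes far fewer gradient steps and lands at a higher loss, which is why large-batch multi-GPU training usually requires compensating adjustments such as a rescaled learning rate.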