Aug 26 – 30, 2019
KIT, Campus North, FTU
Europe/Berlin timezone

Introduction to HTCondor

Aug 28, 2019, 1:15 PM
4h 45m
156 (FTU)

Hands-On Tutorial Tutorials

Speaker

Oliver Freyermuth (University of Bonn)

Description

How to distribute your compute tasks and get results with high performance, keeping machines and site admins joyful

HTCondor is an open source workload management system for High Throughput Computing designed to collect many different resources (servers from different computing centres, desktops or cloud services) into one common computing environment. These resources are transparently exposed to the users.
HTCondor is used not only in the High Energy Physics community and the CERN batch services, but is also widely adopted in other areas of science and in industry. It integrates support for several container runtimes, which makes it possible to use software stacks defined by the user or offered by a site or the community.

Compared to other well-known workload managers, it does not make use of the concept of different queues or partitions, but applies a fair-share algorithm to distribute resources dynamically according to the users' requests.
A flexible mechanism called "ClassAds", which is used to represent the characteristics and constraints of machines and jobs, allows for very dynamic configuration by both users and administrators.
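
The two-sided matching idea behind ClassAds can be illustrated with a toy sketch in Python. This is not HTCondor's ClassAd language or API: plain dictionaries stand in for the ads, the `matches` helper is hypothetical, and the attribute names are chosen only to resemble typical ones. The point is that a job and a machine are paired only if each side's requirements accept the other side's advertised attributes.

```python
# Toy illustration only: real ClassAds are a richer expression language.
# Each "ad" is a dict of attributes plus a Requirements callable that
# receives its own ad (my) and the candidate ad (target).

def matches(job_ad, machine_ad):
    """Two-sided match: each ad's requirements must accept the other ad."""
    return (job_ad["Requirements"](job_ad, machine_ad)
            and machine_ad["Requirements"](machine_ad, job_ad))

# Hypothetical machine: accepts jobs that ask for no more CPUs than it has.
machine = {
    "Memory": 8192,
    "Cpus": 4,
    "OpSys": "LINUX",
    "Requirements": lambda my, target: target["RequestCpus"] <= my["Cpus"],
}

# Hypothetical job: needs a Linux machine with enough memory.
job = {
    "RequestMemory": 2048,
    "RequestCpus": 1,
    "Requirements": lambda my, target: (
        target["OpSys"] == "LINUX" and target["Memory"] >= my["RequestMemory"]
    ),
}

# Same job, but asking for more CPUs than the machine offers.
big_job = dict(job, RequestCpus=16)

print(matches(job, machine))
print(matches(big_job, machine))
```

Because both sides express constraints, an administrator can restrict which jobs a machine accepts without users having to pick a queue by hand.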

In this tutorial, we will start with simple job submissions to illustrate how jobs are matched and how data is transferred by HTCondor, continue with more complex batch submission examples, and also discuss DAGs, which can be used to express complex inter-job dependencies and full analysis workflows.
Care will be taken to illustrate different models of how HTCondor may be operated at various sites and how to use it in a well-performing way depending on that. We will also briefly discuss how containers can simplify or complicate your workflow in the context of HTCondor.
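
As a rough picture of what expressing inter-job dependencies as a DAG means, the following Python sketch orders the jobs of a hypothetical workflow so that every job runs only after the jobs it depends on. It uses the standard library's `graphlib` (Python 3.9 or newer) and does not involve HTCondor itself; the job names and the `run_order` helper are assumptions made for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each job maps to the set of jobs it depends on.
dependencies = {
    "preprocess": set(),
    "analyse_a": {"preprocess"},
    "analyse_b": {"preprocess"},
    "merge": {"analyse_a", "analyse_b"},
}

def run_order(deps):
    """Return one valid execution order that respects all dependencies."""
    return list(TopologicalSorter(deps).static_order())

order = run_order(dependencies)
print(order)
```

In this sketch the two analysis jobs depend only on `preprocess`, so a scheduler would be free to run them in parallel before `merge` starts.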

Participants are assumed to be familiar with Linux already. Prior experience with analysis workflows or a local computing cluster is also very welcome.
