GridKa School 2017 - make science && run

Europe/Berlin
KIT, Campus North, FTU

Description

The International GridKa School 2017 is one of the leading summer schools for advanced computing techniques in Europe. The school provides a forum for scientists and technology leaders, experts, and novices to share knowledge and exchange information. The target audience includes graduate and PhD students, advanced users, and IT administrators. GridKa School is hosted by the Steinbuch Centre for Computing (SCC) of Karlsruhe Institute of Technology (KIT). It is organized by KIT and the HGF Alliance "Physics at the Terascale".

Participants
  • Achim Streit
  • Aleksander Paravac
  • Alexander Schug
  • Amin Parvizi
  • Andreas Albert
  • Andreas Bartschat
  • Andreas Heiss
  • Andreas Herten
  • Andreas Petzold
  • André Schneider
  • Anna Lehner
  • Asfaw Yohannes
  • Bas Wegh
  • Ben Jones
  • Benedikt Hegner
  • Benjamin Stein
  • Bernd Wiebelt
  • Christian Schober
  • Christian Voss
  • Christoph Genster
  • Christoph Heidecker
  • Christoph Petrausch
  • Christoph-Erdmann Pfeiler
  • Christopher Jung
  • Christopher Stihl
  • Corina Mitrohin
  • Daniel Savoiu
  • Daniel Schmid
  • Daniel Troendle
  • Daniela Piccioni Koch
  • Dario Mapelli
  • David Walz
  • Dennis Hoppe
  • Diana Gudu
  • Diego Ciangottini
  • Doris Wochele
  • Eileen Kühn
  • Elnaz Azmi
  • Elvin Sindrilaru
  • Ewan Barr
  • Gabriel Zachmann
  • Genrich Ivaska
  • Geonmo Ryu
  • Gianfranco Sciacca
  • Gino Marchetti
  • Graeme Stewart
  • Hartmut Häfner
  • Haykuhi Musheghyan
  • Holger Kluck
  • Ingrid Schäffner
  • Iurii Sorokin
  • Ivan Kondov
  • Jakob Rosenbauer
  • Jens Reinhardt
  • Joeran Stettner
  • Johannes Möller
  • Johannes Scheuermann
  • Johannes Stegmaier
  • Julia Stoll
  • Kai Krings
  • Kiran Adhikari
  • Lars Franke
  • Linda Neubrand
  • Lisa Schumacher
  • Lukas Burgey
  • Lutz Schimpf
  • Manuel Giffels
  • Marco Berghoff
  • Marco Langer
  • Marcus Strobl
  • Mario Lassnig
  • Martin Heck
  • Martin Kohn
  • Martin Stahlberg
  • Maryam Salehi
  • Matthias Huber
  • Matthias Schnepf
  • Max Fischer
  • Maximilian Reininghaus
  • Mehari Bayou Zerihun
  • Mehmet Soysal
  • Meisam Booshehri
  • Melanie Ernst
  • Michael Bontenackels
  • Michael Eliachevitch
  • Michael Waßmer
  • Miriam Künzel
  • Mirko Kämpf
  • Modan Liu
  • Mohamed Khafagy
  • Mohammad Mirkazemi
  • Momin Ahmad
  • Morris Riedel
  • Nicholas Tan Jerome
  • Nico Madysa
  • Nico Struckmann
  • Oliver Langen
  • Oliver Ricken
  • Oskar Taubert
  • Pavel Weber
  • Peer Hasselmeyer
  • Philipp Kemkes
  • Preslav Konstantinov
  • Rainer Blatt
  • Raphael Friese
  • Robin Hahn
  • Samuel Ambroj Pérez
  • Sebastian Racs
  • Sebastian Wozniewski
  • Sebastien Binet
  • Shiraz Memon
  • Stefan Wunsch
  • Sébastien Gadrat
  • Theo Glauch
  • Thomas Keck
  • Timo Bingmann
  • Tobias Böwing
  • Ugur Cayoglu
  • Vincent Kitali
  • Vytautas Jancauskas
  • Wonqook Choi
  • Yannick Nesselhauf
    • 12:00 PM 2:00 PM
      Registration Foyer (FTU)
    • 2:00 PM 3:30 PM
      Plenary Aula (FTU)
      • 2:00 PM
        Welcome to KIT 30m
        Speaker: Prof. Achim Streit (KIT)
        Slides
      • 2:30 PM
        Welcome to GridKa School 20m
        Speaker: Dr Manuel Giffels (Karlsruhe Institute of Technology)
        Slides
      • 2:50 PM
        WikiToLearn: What it is and how to use it 40m
        WikiToLearn is a platform for the collaborative creation of textbooks, which are released under the CC-BY-SA license. The talk will cover the main characteristics of the platform and how to add, edit and organise content into a textbook.
        Speaker: Dario Mapelli (WikiToLearn)
    • 3:30 PM 4:00 PM
      Coffee Break 30m Foyer (FTU)
    • 4:00 PM 6:00 PM
      Plenary Aula (FTU)
      • 4:00 PM
        Collaborative Software Development 40m
        Speaker: Dr Benedikt Hegner (CERN)
        Slides
      • 4:40 PM
        Beeswax: An optimized tool to enhance the performance of HIVE 40m
        The Beeswax tool improves the storage and processing of Big Data in HIVE through a set of modules that (1) translate advanced SQL queries to HIVEQL, (2) optimize multi-JOIN queries in Map-Reduce jobs, (3) reduce resource consumption by avoiding long shuffling times, (4) reuse intermediate results, and (5) perform multi-query optimization.
        Speaker: Dr Mohammed Khafagy (Fayoum University)
        Slides
      • 5:20 PM
        Statistical Analysis in Biomolecular Evolution 40m
        Methods based on statistical analysis of large data sets have led to groundbreaking discoveries in diverse scientific fields. These range from physics, by explaining phenomenological results, to applications in chemistry and the life sciences, but also in, e.g., the social sciences or economics. In the past two decades, as one result of the human genome project, the amount of genomic sequence data has grown exponentially. This wealth of genomic data has boosted research in analyzing molecular evolution. One striking example is tracing residue co-evolution in biomolecules to predict spatial adjacencies. These can be exploited in biomolecular structure prediction, even on large scales or for systems that are poorly accessible experimentally. Apart from the structural insight, the statistical model can also be interpreted in terms of fitness landscapes, which allows predictions on, e.g., antibiotic resistance, drug design, biological signaling, epistatic effects or protein-protein interactions. Another impressive example is the analysis of molecular evolution with the explicit aim of better understanding disease, such as HIV-1 viral evolution or immune system strategies to recognize and fight pathogens. Overall, understanding molecular evolution is a prime example of how knowledge from statistical analysis enables paradigm change in the life sciences.
        Speaker: Dr Alexander Schug
    • 9:00 AM 10:20 AM
      LSDMA Symposium Aula (FTU)
      • 9:00 AM
        Welcome to LSDMA Symposium 20m
        Speaker: Prof. Michael Decker (KIT)
        Slides
      • 9:20 AM
        The NFFA Europe Information and Data Repository Platform 30m
        Speaker: Stefano Cozzini (CNR-IOM DEMOCRITOS)
        Slides
      • 9:50 AM
        Towards the Fenix Infrastructure 30m
        In the context of the Human Brain Project a set of European supercomputing centres have committed themselves to develop and deploy a set of services that will be federated across the involved sites. This effort currently involves five centres from five different countries, namely BSC (Spain), CEA (France), CINECA (Italy), CSCS (Switzerland) and JSC (Germany). The resulting infrastructure will comprise scalable compute resources, data services as well as interactive compute services. While the infrastructure will be made available to several research communities, the Human Brain Project is currently the prioritised driver for the Fenix infrastructure design and implementation.

        In this talk we give an overview of the design principles and discuss selected use cases that the infrastructure needs to support. We will provide an overview of the envisaged architecture and the strategies for addressing key architectural and technical challenges.

        Speaker: Prof. Dirk Pleiter (University of Regensburg)
        Slides
    • 10:20 AM 10:50 AM
      Coffee Break 30m Foyer (FTU)
    • 10:50 AM 11:50 AM
      LSDMA Symposium Aula (FTU)
      • 10:50 AM
        AI-driven decision automation in physics research and enterprises 30m
        A neural network algorithm originally written for physics analyses at CERN is the foundation of many successful projects and products of Blue Yonder, one of the very few AI companies that already deliver huge value to their customers. With its recent focus on building supply chain and pricing products for retailers, strategic management decisions are broken down to tens of millions of automated operational decisions with superhuman quality. The principle and some examples from research and industry are presented.
        Speaker: Prof. Michael Feindt (Blue Yonder)
        Slides
      • 11:20 AM
        Helix Nebula Science Cloud 30m
        The work of Helix Nebula [1] has shown that it is feasible to interoperate in-house IT resources of research organisations and publicly funded e-infrastructures, such as EGI [2] and GEANT [3], with commercial cloud services. Such hybrid clouds are in the interest of the users and funding agencies because they provide greater “freedom and choice” over the type of computing resources to be consumed and the manner in which they can be obtained.

        Propelled by the growing IT needs of the Large Hadron Collider, CERN is leading an H2020 Pre-Commercial Procurement activity that brings together a group of 10 of Europe’s leading research organisations to procure innovative IaaS-level cloud services for a range of scientific disciplines [4]. HNSciCloud is divided into three phases: Design, Prototype and Pilot. The successful designs were selected by the end of 2016. In 2017, the project entered the Prototype phase.

        This talk will cover the current status of the prototypes and the next steps going into the Pilot phase.

        [1] http://www.helix-nebula.eu/

        [2] http://www.egi.eu/

        [3] http://www.geant.net/

        [4] http://www.hnscicloud.eu

        Speaker: João Fernandes (CERN)
        Slides
    • 11:50 AM 12:50 PM
      Lunch Break 1h Foyer (FTU)
    • 12:50 PM 2:00 PM
      LSDMA Symposium Aula (FTU)
      • 12:50 PM
        Accelerating Storage System Research Through a Common Framework 30m
        JULEA is a flexible storage framework that contains all the necessary building blocks for storage research. It runs completely in user space, which eases development and debugging. The framework allows offering arbitrary client interfaces to applications; its data and metadata backends can be freely accessed and thus allow rapid prototyping of new approaches.
        Speaker: Dr Michael Kuhn (Uni Hamburg)
        Slides
      • 1:20 PM
        Research Data and Management in Environmental Sciences - Characteristics, Status and Challenges 30m
        Research in environmental sciences is widely 'digitized', and a plethora of approaches and initiatives for environmental (research) data management and infrastructures have been established. However, a number of issues still hinder seamless data integration for environmental researchers, and reproducible research is far from being a reality.

        This contribution will give an overview of the current status and list some promising approaches as well as current R&D challenges.

        Speaker: Prof. Lars Bernard (Technische Universität Dresden)
        Slides
      • 1:50 PM
        Conclusions 10m
        Speaker: Prof. Achim Streit (KIT)
    • 2:15 PM 6:15 PM
      Tutorials
      • 2:15 PM
        An Introduction to Using HTCondor 4h Room 157 (FTU)
        HTCondor is a batch computing system designed for high throughput computing. It is widely used in the High Energy Physics community for both experiment and computing site workflows, as well as in other sciences and industry. This tutorial will introduce how it works and how to use it. From simple job submission to complex DAGs, HTCondor is a useful tool for many different workflows.
        Speaker: Mr Ben Jones (CERN)
      • 2:15 PM
        C++ for Beginners 4h Room 162 (FTU)
        This course aims to equip people with limited knowledge of C++ with both a better understanding of the scope of C++ and practical, immediately useful hints to improve code with respect to correctness, readability and maintainability. The discussed topics are mostly adapted from the celebrated "Effective ..." book series by Scott Meyers.
        Speaker: Dr Martin Heck (KIT)
      • 2:15 PM
        Elasticsearch, Logstash, Kibana hands-on (Canceled) 4h Room 162 (FTU)
        The aim of this tutorial is to prepare participants to deploy their own ELK stack for log collection and analysis.

        Participants will be taken step by step through the installation, configuration and use of all tools needed for modern log analysis.

        You will need an SSH-capable computer with a web browser and basic knowledge of Unix and the command line.

        Speaker: Alexandr Mikula (Czech Academy of Sciences)
      • 2:15 PM
        Introduction to Erlang 4h Room 164 (FTU)
        This tutorial explains the basic concepts of the Erlang programming language and tries them out in a simple game written in Erlang. As this course will be done on Linux systems, you should be able to use the command line and have some basic programming understanding.
        Speaker: Bas Wegh (KIT)
      • 2:15 PM
        Introduction to the SciPy stack and IPython Notebooks 4h Room 156 (FTU)
        Python provides a rich ecosystem of open-source software for mathematics, science, and engineering. This tutorial will introduce you to the fundamental packages of the SciPy stack.

        You will learn how to perform fast numerical calculations in N dimensions using NumPy, analyse your data using Pandas, and visualize the results using Matplotlib. The exercises will be performed in the Jupyter Notebook environment, which you can access through your web browser.

        You will need a tablet or a laptop and basic knowledge of the Python programming language.

        Speaker: Thomas Keck (KIT)
      • 2:15 PM
        Supervised Machine Learning with Deep Neural Networks 4h Aula (FTU)
        Machine learning with deep neural networks has seen tremendous advances in the last few years and is now the state-of-the-art method in a broad range of fields, including computer vision and natural language processing. Deep learning shines when dealing with large bodies of high-dimensional, complex data and is thus well suited for pushing the limits in high-energy particle and astroparticle physics.

        This tutorial will introduce you to the fundamental concepts and some advanced techniques in deep learning and give you a hands-on introduction to designing, training and evaluating neural networks in supervised classification and regression tasks. As deep learning framework we will use TensorFlow via the Python interface (some familiarity with Python is assumed). The exercises will be performed on the VISPA platform which provides an analysis environment and access to a GPU cluster through your web browser. Hence all you will need is a tablet or laptop.

        Speaker: Dr David Walz (RWTH Aachen)
    • 6:30 PM 10:00 PM
      Tarte Flambée Evening SCC
    • 9:00 AM 10:20 AM
      Plenary Aula (FTU)
      • 9:00 AM
        Big data challenges in radio astronomy 40m
        James Clerk Maxwell once wrote: “In every branch of knowledge the progress is proportional to the amount of facts on which to build, and therefore to the facility of obtaining data”. This maxim is particularly true in astronomy, where to probe deeper into the Universe we must continually seek to build more powerful telescopes that produce ever more data. In the case of radio astronomy, “powerful” implies larger collecting areas, broader bandwidths and wider fields-of-view. Interferometers, networks of telescopes unified via high performance computing, are the means by which we may satisfy this trio of requirements. In this talk I will discuss the largest of all the currently planned interferometers, the Square Kilometre Array (SKA), an instrument that will produce petabytes of data per second and require exascale computation. I will look at the challenges this instrument poses and the tools, techniques, prototypes and pathfinders in development to ensure that these challenges are met successfully.
        Speaker: Dr Ewan Barr (University of Bonn)
        Slides
      • 9:40 AM
        Augmented Reality - History, Challenges and Applications 40m
        The presentation gives an overview of the historical development of Augmented Reality. The focus is on different technologies for the development of AR applications. Based on example applications from different areas, the presentation will show in which areas AR is already successfully used and how AR can change the digital world.
        Speaker: Jens Reinhardt (HTW)
        Slides
    • 10:20 AM 10:40 AM
      Coffee Break 20m Foyer (FTU)
    • 10:40 AM 12:00 PM
      Plenary Aula (FTU)
      • 10:40 AM
        Cloud Federation 40m
        Cloud federation is receiving increasing attention due to the benefits of resilience and locality it brings to cloud providers and users. I present an approach to cloud network federation based on direct cloud-to-cloud agreements. The solution uses multi-path proxies able to utilize the multiple uplinks cloud data centers typically have. Bandwidth of interconnection and resilience are improved without the need for additional fail-over mechanisms.
        Speaker: Dr Peer Hasselmeyer (NEC)
        Slides
      • 11:20 AM
        Architecture and Principles of Kubernetes 40m
        In 2014, Google open-sourced the Kubernetes project, a container orchestration platform, to simplify container scheduling and orchestration. Kubernetes builds on the experience gained with Borg (the internal scheduler at Google) and Omega. Kubernetes is easy to extend for custom use cases on top of a solid API-based platform. Two years ago, Google published Kubernetes release 1.0. Since then, many big companies like IBM, CoreOS, RedHat and Microsoft have been building on Kubernetes and contributing to the project. This talk will give an overview of the architecture and principles of Kubernetes and of how Kubernetes supports you in building a modern infrastructure.
        Speaker: Johannes Scheuermann (Inovex GmbH)
        Slides
    • 12:00 PM 1:00 PM
      Lunch Break 1h Canteen
    • 1:00 PM 6:00 PM
      Tutorials
      • 1:00 PM
        Advanced Python Software Development 5h Aula (FTU)
        Python has been widely adopted in academia, science and beyond. As the language is easy to pick up, many people use it for scripting, configuration, and prototyping. At the same time, its flexibility, breadth of application and huge ecosystem make it a powerful tool even for large projects.

        This course focuses on software development with Python beyond simple scripting and prototyping. Topics range from best practices for programming small and large projects, to organising and packaging frameworks as well as developing high performance applications. Each topic is presented as a mixture of general lectures and hands-on exercises.

        The course targets intermediate Python developers who are familiar with the language itself. You should feel comfortable writing small scripts and applications, using functions, classes and existing libraries. We highly recommend using your own laptop (Linux, MacOS, Cygwin) for the exercises.

        The course is co-organised with the Collaborative Software course, and participants will benefit from taking both courses. In addition, we recommend participating in the "Introduction to Jupyter Notebooks (Python)" course if you would like to refresh your basic knowledge of the language.

        Speakers: Eileen Kühn (Karlsruhe Institute of Technology), Dr Max Fischer (Karlsruhe Institute of Technology)
      • 1:00 PM
        Apache Spark 5h Room 157 (FTU)
        Speaker: Dr Mirko Kämpf (Cloudera, Inc.)
      • 1:00 PM
        Docker 5h Room 156 (FTU)
        To get the most out of the Docker tutorial, there are some prerequisites to consider. First of all, you should be comfortable working with the Linux terminal, installing packages from the command line, using the ssh client to connect to a remote machine and, last but not least, editing files using one of the common Linux editors: emacs, vi, nano etc.

        The Docker tutorial will walk you through the basic steps of setting up a Docker environment on your machine. There will be a series of exercises that detail the various concepts presented during the plenary talk, which are critical to understand for the later part of the tutorial. The final goal of the tutorial is to build and deploy a couple of containers that replicate the usual analysis workflow in High Energy Physics: you will have a container running an XRootD server providing the storage for the data and a different container that runs the ROOT framework where you will do your analysis. The tutorial will discuss in depth the concepts of port forwarding, volumes and resource management in the context of containers, with a focus on understanding the advantages of containers over traditional virtual machines.

        Speaker: Elvin Sindrilaru (CERN)
      • 1:00 PM
        Hands on Kubernetes 5h Room 163 (FTU)
        This workshop covers the architecture as well as the concepts of Kubernetes.

        You are taken on a journey from your first container startup to more complex setups inside the Kubernetes cluster.

        A basic understanding of Unix and the CLI is required, as well as an SSH-capable computer, since you need to connect via SSH to our learning environment.

        Speakers: Benjamin Stein (INOVEX), Christoph Petrausch (INOVEX)
      • 1:00 PM
        Parallel Programming with MPI and OpenMP 5h Room 164 (FTU)
        Speaker: Hartmut Häfner (KIT)
    • 6:30 PM 8:00 PM
      Evening Lecture Aula (FTU)
      • 6:30 PM
        Welcome Reception 30m
      • 7:00 PM
        The Quantum Way of Doing Computations (Evening Lecture) 1h
        Since the mid-1990s, it has become apparent that one of the century's most important technological inventions, the computer, and many of its applications can be further enhanced by using operations based on quantum physics. This is timely since the classical roadmap for the development of computational devices, commonly known as Moore’s law, will cease to be applicable within the next decade. This is due to the ever-smaller sizes of electronic components that soon will enter the quantum physics realm. Computations, whether they happen in our heads or with any computational device, always rely on real physical processes, which are data input, data representation in a memory, data manipulation using algorithms and, finally, the data output. Building a quantum computer then requires the implementation of quantum bits (qubits) as storage sites for quantum information, quantum registers and quantum gates for data handling and processing, and the development of quantum algorithms. In this talk, the basic functional principle of a quantum computer will be reviewed. It will be shown how strings of trapped ions can be used to build a quantum information processor and how basic computations can be performed using quantum techniques. Routes towards a scalable quantum computer will be discussed.
        Speaker: Prof. Rainer Blatt (Universität Innsbruck)
    • 9:00 AM 10:20 AM
      Plenary Aula (FTU)
      • 9:00 AM
        Developments in data protection from a technical perspective 40m
        Speaker: Julia Stoll (Der Hessische Datenschutzbeauftragte)
        Slides
      • 9:40 AM
        Grow concurrent programs with grace and Go 40m
        Speaker: Dr Sebastien Binet (LPC)
        Slides
    • 10:20 AM 10:40 AM
      Coffee Break 20m Foyer (FTU)
    • 10:40 AM 12:00 PM
      Plenary Aula (FTU)
      • 10:40 AM
        Concurrency and Scientific Programming 40m
        Modern CPU architectures for scientific computing are characterised by having multiple CPU cores and wide vector registers. Effective use of these features requires different design patterns from those that ran on the single core systems of the past. Instead of serial processing, parallel processing needs to be at the heart of the new model, with many operations proceeding concurrently. In this talk I discuss what that challenge means for scientific code and which programming patterns are most likely to be useful to scientists. I will look at some common libraries used to safely implement concurrency and give some examples from various fields. I will also emphasise the role of memory layout in achieving good throughput in high performance scientific code.
        Speaker: Dr Graeme Stewart (University of Glasgow)
        Slides
      • 11:20 AM
        GPU Programming 101 40m
        GPUs, Graphics Processing Units, offer a large amount of processing power by providing a platform for massively parallel computing. They can greatly increase the performance of scientific applications on a single workstation computer, and they also power the fastest supercomputers in the world. This talk will give an overview of the specifics of GPU computing and the underlying techniques and will introduce different programming models (mainly CUDA and OpenACC).
        Speaker: Dr Andreas Herten (FZ Jülich)
        Slides
    • 12:00 PM 1:00 PM
      Lunch Break 1h Canteen
    • 1:00 PM 6:00 PM
      Tutorials
      • 1:00 PM
        Collaborative Software Development 5h Room 156 (FTU)
        Writing maintainable software is a prerequisite in many fields. Especially when working in projects with many members it is essential to

        • write readable software and documentation,
        • enable versioning of software,
        • ensure correctness of software,
        • enable automated tests of software, and
        • enable agile workflows based on issue tracking.
        However, the goals of maintainable software are not only relevant when working in teams, but also in private projects. This makes the topic relevant for anybody that needs to write and maintain software.

        Based on experience from projects in academia and industry, this tutorial introduces tools and concepts to enable maintainable software projects in collaborative environments. While we try to give a broad overview of different topics, we also flexibly provide in-depth information depending on your feedback during the course. We cover topics such as version control and organisation of software with git, concepts of unit testing and test-driven development, tools supporting continuous integration, as well as the integration into wikis and ticket systems.
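
        As a small illustration of the unit-testing idea mentioned above, the following sketch tests a hypothetical helper, `mean`, with Python's standard `unittest` module. The function, the test names and the `run_tests` wrapper are made up for this example and are not taken from the course material.

```python
import unittest


def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    if not values:
        raise ValueError("mean() requires at least one value")
    return sum(values) / len(values)


class MeanTest(unittest.TestCase):
    def test_typical_values(self):
        self.assertEqual(mean([1, 2, 3, 4]), 2.5)

    def test_empty_input_is_rejected(self):
        with self.assertRaises(ValueError):
            mean([])


def run_tests():
    """Run the test case programmatically and return the unittest result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(MeanTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_tests()
```

        In a test-driven workflow, the two test methods would typically be written first and the body of `mean` filled in until both pass.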

        Throughout this tutorial you will learn how to efficiently integrate different tools and concepts to enable maintainable software. After the course, you will have a basic setup that can be adapted to your specific needs.

        This course is a hands-on tutorial and requires basic knowledge of Python programming. For the best learning experience and a broad overview of software development processes, we suggest taking both the Advanced Python Software Development and the Collaborative Software Development workshops.

        Speakers: Eileen Kühn (Karlsruhe Institute of Technology), Dr Max Fischer (Karlsruhe Institute of Technology)
      • 1:00 PM
        Concurrent Programming in C++ 5h Room 163 (FTU)
        In this course we will introduce how to program for concurrency in C++, taking advantage of modern CPUs' ability to run multi-threaded programs on different CPU cores. We will briefly review the native C++ concurrency features for asynchronous execution, thread spawning and locking, as well as a few other features useful for concurrent programming. The tutorial will show you how to use Intel's Threading Building Blocks (TBB) library as a much higher-level abstraction over concurrency that allows concurrent applications to be developed much more quickly. We will examine TBB's basic templates for parallel programming, controlling loops and reductions in 1 and 2 dimensions. Then we will see how TBB's graph execution facilities allow more sophisticated parallel workflows to be run as a DAG. Finally, we will look at the TBB task manager, which allows arbitrary workloads to be executed in parallel when injected into the system by a higher-level component.

        Students should be familiar with C++ and the standard template library. Some familiarity with makefiles and/or CMake would be useful.

        Speaker: Dr Graeme Stewart (University of Glasgow)
      • 1:00 PM
        Databases for Big Data Analytics and Machine Learning 5h Aula (FTU)
        In this workshop, students will learn (a) how to efficiently use relational and non-relational databases, and (b) how to create database workflows suitable for analytics and machine learning.

        First, the focus of the workshop is to teach efficient, safe, and fault-tolerant principles when dealing with high-volume and high-throughput database scenarios. This includes, but is not limited to, systems such as PostgreSQL, Redis or ElasticSearch. Topics include query planning and performance analysis, transactional safety, SQL injection, and competitive locking.
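
        Two of the topics above, SQL injection and transactional safety, can be sketched in a few lines. The example below is an illustration only: it uses an in-memory SQLite database from the Python standard library instead of PostgreSQL, and the `users` table and all function names are hypothetical.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a small, made-up 'users' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT PRIMARY KEY, role TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "admin"), ("bob", "user")],
    )
    conn.commit()
    return conn


def find_unsafe(conn, name):
    # Vulnerable: user input is pasted into the SQL text.
    query = "SELECT name FROM users WHERE name = '%s'" % name
    return [row[0] for row in conn.execute(query)]


def find_safe(conn, name):
    # Safe: the driver passes the value separately from the SQL text.
    return [row[0] for row in conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,))]


def add_users_atomically(conn, rows):
    """Insert all rows or none: the connection context manager commits on
    success and rolls back if an exception is raised."""
    try:
        with conn:
            conn.executemany("INSERT INTO users VALUES (?, ?)", rows)
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    db = make_db()
    attack = "nobody' OR '1'='1"
    print(find_unsafe(db, attack))  # the injected condition matches every row
    print(find_safe(db, attack))    # the whole string is treated as one name
```

        The placeholder style differs between drivers (SQLite uses `?`), but the principle of keeping values out of the SQL text is the same.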

        Second, we focus on how to actually prepare data from these databases to be usable for analytics and machine learning frameworks such as Keras. Topics include recommended workflows for data selection, data cleaning, model training, model running, error checking, and output archival.

        An intermediate understanding of Python, SQL, and Linux shell scripting is recommended to follow this course. An understanding of machine learning principles is not required.

        Speaker: Dr Mario Lassnig (CERN)
      • 1:00 PM
        Introduction to Go 5h Room 157 (FTU)
        Introduction

        In this workshop, we will introduce the basics of programming in Go and then work our way up to concurrent programming with this relatively new language.

        We'll start with the usual "Hello World" program, introduce functions, variables, packages and then interfaces. Then, we will tackle the two main tools at the disposal of the Go programmer (colloquially known as a gopher): the channels and the goroutines. This will be done by implementing a small peer to peer application transmitting text messages over the network.

        The workshop wraps up with a whirlwind tour of scientific and non-scientific libraries readily available, and prospects/news about the next Go version.

        References

        • https://golang.org
        • https://tour.golang.org
        • https://talks.golang.org

        People will have to install the Go compiler on their laptop. The instructions to do so for their favorite operating system are detailed at: https://golang.org/doc/install

        To get a taste of what Go looks like and get their feet wet, people can also follow the interactive, browser-based, installation-free tour at: https://tour.golang.org

        Speaker: Dr Sebastien Binet (LPC Clermont Ferrand)
        Slides
      • 1:00 PM
        Practical Introduction to GPU Programming with OpenACC 5h Room 164 (FTU)
        OpenACC is a directive-based programming model for highly parallel systems, which allows for automated generation of portable GPU code. In this tutorial, we will get to know the programming model with examples, learn how to use the associated tools environment, and incorporate first strategies for performance optimization into our programs.
        Speaker: Dr Andreas Herten (FZ Jülich)
      • 1:00 PM
        Scientific workflows with FireWorks 5h Room 162 (FTU)
        Scientific workflows are an important technique used in many simulation and data analysis applications. In particular, workflows automate high-throughput / high-complexity computing applications, enable code and data reuse and provenance, provide methods for validation and error tracking, and exploit application concurrency using distributed computing resources. The goal of this tutorial is to learn how to compose and run workflow applications using the FireWorks workflow environment (https://hackingmaterials.lbl.gov/fireworks). In the first part, after an introduction to the concept of workflows, to state-of-the-art workflow systems and to FireWorks, the participants will learn to construct workflows using a library of existing Firetasks. The composed workflows will be verified, visualized and then executed and monitored. The exercises will include managing data and control flow dynamically using FWAction. The second part will focus on writing custom Firetasks to match more specific application requirements. Basic knowledge of using the bash shell is required. For the second part, basic knowledge of Python is required.
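
        To make the workflow concept concrete, here is a minimal sketch of running tasks in dependency order. It does not use the FireWorks API; `run_workflow`, the task names and the three-step example are hypothetical, and the ordering is a plain topological sort (Kahn's algorithm).

```python
from collections import deque


def run_workflow(tasks, depends_on):
    """Run every task after all of its prerequisites (Kahn's algorithm).

    tasks:      dict mapping a task name to a function taking the results dict
    depends_on: dict mapping a task name to the names it must wait for
    """
    remaining = {name: len(depends_on.get(name, ())) for name in tasks}
    children = {name: [] for name in tasks}
    for name, parents in depends_on.items():
        for parent in parents:
            children[parent].append(name)

    ready = deque(name for name, count in remaining.items() if count == 0)
    results, order = {}, []
    while ready:
        name = ready.popleft()
        results[name] = tasks[name](results)
        order.append(name)
        for child in children[name]:
            remaining[child] -= 1
            if remaining[child] == 0:
                ready.append(child)

    if len(order) != len(tasks):
        raise ValueError("workflow contains a dependency cycle")
    return order, results


# Hypothetical three-step workflow: generate data, analyse it, report.
tasks = {
    "generate": lambda r: [1, 2, 3, 4],
    "analyse": lambda r: sum(r["generate"]) / len(r["generate"]),
    "report": lambda r: "mean = %.1f" % r["analyse"],
}
depends_on = {"analyse": ["generate"], "report": ["analyse"]}

if __name__ == "__main__":
    order, results = run_workflow(tasks, depends_on)
    print(order, results["report"])
```

        A real workflow system adds what this sketch leaves out, such as persistent state, error tracking and distributing the ready tasks over several machines.
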
        Speaker: Dr Ivan Kondov (KIT)
        Slides
    • 8:00 PM 11:00 PM
      School Dinner Leonardo Hotel
    • 9:00 AM 10:20 AM
      Plenary Aula (FTU)
      • 9:00 AM
        Thrill: High-Performance Algorithmic Distributed Batch Data Processing with C++ 40m
        We present on-going work on a new distributed Big Data processing framework called Thrill. It is a C++ framework consisting of a set of basic scalable algorithmic primitives like mapping, reducing, sorting, merging, joining, and additional MPI-like collectives. This set of primitives goes beyond traditional Map/Reduce and can be combined into larger more complex algorithms, such as WordCount, PageRank, k-means clustering, and suffix sorting. These complex algorithms can then be run on very large inputs using a distributed computing cluster.
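
        The way such primitives combine into an algorithm like WordCount can be sketched in a single process. The code below is not Thrill's C++ API; `flat_map`, `reduce_by_key` and `word_count` are hypothetical Python stand-ins that only show the map and reduce-by-key steps.

```python
from collections import defaultdict
from functools import reduce


def flat_map(func, items):
    """Apply func to each item and concatenate the resulting lists."""
    return [out for item in items for out in func(item)]


def reduce_by_key(pairs, combine):
    """Group (key, value) pairs by key and fold each group with combine."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {key: reduce(combine, values) for key, values in groups.items()}


def word_count(lines):
    # Map step: emit one (word, 1) pair per word.
    pairs = flat_map(lambda line: [(w.lower(), 1) for w in line.split()], lines)
    # Reduce step: add up the ones for each distinct word.
    return reduce_by_key(pairs, lambda a, b: a + b)


if __name__ == "__main__":
    print(word_count(["to be or not to be", "To thrill"]))
```

        In a distributed framework, the pairs would be partitioned by key across the cluster before the reduce step, which is the part this sketch omits.
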
        Speaker: Timo Bingmann (KIT)
        Slides
      • 9:40 AM
        Deep Learning Applications in Science using Transfer Learning 40m
        Deep learning models like convolutional neural networks (CNNs) deliver highly accurate results in classification tasks but require sufficiently large data sets and good corresponding labels. However, one key problem in science and engineering is that data sets often contain only a limited amount of labeled data. Using CNNs with such data sets can be problematic because it can lead to enormous overfitting, thus losing much of the generalization capability of the model. The talk presents research methods that use generic representations from deep learning networks to facilitate transfer learning between different domains in cases where only a limited amount of labeled data is available. Examples are given from the scientific domain of remote sensing, where labels are scarce and acquiring them would involve extensive effort and cost, such as performing ground truth campaigns.
        Speaker: Prof. Morris Riedel (FZ Jülich)
    • 10:20 AM 10:40 AM
      Coffee Break 20m Foyer (FTU)
    • 10:40 AM 12:00 PM
      Plenary Aula (FTU)
      • 10:40 AM
        ViCE 40m
        Speaker: Bernd Wiebelt (University of Freiburg)
        Slides
      • 11:20 AM
        Conclusions 40m
        Speaker: Dr Manuel Giffels (Karlsruhe Institute of Technology)
        Slides