After the success of the first collaboration workshop, RL4AA'23, last year and the founding of the Reinforcement Learning for Autonomous Accelerators community, the second collaboration workshop will take place in Salzburg in February 2024. Please visit the workshop's website for more information: https://rl4aa.github.io/RL4AA24/
This talk covers the challenges and best practices for designing and running real-world reinforcement learning (RL) experiments.
The idea is to walk through the different steps of RL experimentation (task design, choosing the right algorithm, implementing safety layers) and also provide practical advice on how to run experiments and troubleshoot common problems.
Slides are also online: https://araffin.github.io/slides/design-real-rl-experiments/
Reinforcement Learning (RL) has demonstrated its effectiveness in solving control problems in particle accelerators. A challenging application is the control of the microbunching instability (MBI) in synchrotron light sources. Here the interaction of an electron bunch with its emitted coherent synchrotron radiation leads to complex non-linear dynamics and pronounced fluctuations.
Addressing the control of intricate dynamics necessitates meeting stringent microsecond-level real-time constraints. To achieve this, RL algorithms must be deployed on a high-performance electronics platform. The KINGFISHER system, utilizing the AMD-Xilinx Versal family of heterogeneous computing devices, has been specifically designed at KIT to tackle these demanding conditions. The system implements an experience accumulator architecture to perform online learning purely through interaction with the accelerator while still satisfying strong real-time constraints.
The preliminary results of this innovative control paradigm at the Karlsruhe Research Accelerator (KARA) will be presented. Notably, this represents the first experimental attempt to control the MBI with RL using online training only and running on hardware.
The complexity of the GSI/FAIR accelerator facility demands a high level of automation in order to maximize time for physics experiments. This talk will give an overview of different optimization problems at GSI, from transfer lines to synchrotrons to the fragment separator. Starting with a summary of previous successful automation, the talk will focus on the latest developments in recent months, such as the optimization of multi-turn injection in the SIS18 synchrotron. The introduction of a Python bridge to the settings management system LSA and the integration of GeOFF (Generic Optimization Framework & Frontend) enabled and facilitated beam-based optimization with numerical algorithms and machine learning. GeOFF is an open-source framework that harmonizes access to a number of automation techniques and simplifies the transition towards and between them.
DESY has many years of experience in the optimization and control of particle accelerators. Reinforcement learning has been explored over the last three years. In this talk, the results of this investigation are summarized and an outlook is given. Further control and optimization challenges for operation are also presented and discussed.
In order to improve BESSY's experimental environment, several ML-based applications are used at HZB. These efforts cover challenges arising at the accelerator, the beamlines and the detectors at the experiment. This talk provides an overview of these activities, focusing on RL and providing insights into the optimization of a beamline, the tuning of an e-gun, and electron beam positioning in BESSY's storage ring. The limitations of RL and the reasons to also use other ML techniques are discussed and illustrated with various examples.
CERN has a long tradition of model-based feedforward control with a high level of abstraction. With the recently approved project “Efficient Particle Accelerators”, the CERN management commits to going one step further and investing heavily in automation on all fronts. The initiative will therefore also further push data-driven surrogate models, sample-efficient optimisation and continuous control algorithms into the current control system. Reinforcement Learning became part of the CERN algorithm suite before many numerical optimisation algorithms. However, the decades-old CERN machines do not easily lend themselves to RL; black-box optimisation algorithms are more easily integrated. This contribution will summarise the RL controllers in the making at CERN and will mainly focus on CERN’s RL vision: offline RL, the importance of being able to deal with partially observable systems, and the necessity for continuously learning controllers.
The Quanser Aero2 system is an advanced laboratory experiment designed for exploring aerospace control systems, featuring two motor-driven fans on a pivot beam for precise control. Its capability to lock axes individually offers both single degree of freedom (DOF) and two DOF operation. The system’s non-linear characteristics and adaptability to multivariable configurations make it especially interesting for control theory research.
In this study, we use Reinforcement Learning (RL) to control the Aero2 system. To keep complexity low in a first step, this work focuses on the single-DOF setup. An RL agent is trained in simulation to develop a policy for orienting the beam to a specific tilt, using the target tilt deviation and velocity as the state space. To further reduce complexity, the second motor is driven with the reversed-polarity voltage of the first motor, resulting in a single action and enabling an in-depth analysis of the learning behaviour of the employed agents.
Even after reducing the action space to one dimension by exploiting the symmetry of the two rotors, the balancing task could not be solved with the default configuration of the Proximal Policy Optimization (PPO) agent used. We identified that reducing the number of units in each hidden fully connected layer of the agent's networks is necessary to solve the task. However, detailed visualisations of the development of the policy over time revealed a transition from stable to volatile action choices in the long term, which is unexpected given the current state of the literature. Future research will focus on the underlying causes of the observed volatility, giving insights into the dynamic nature of RL.
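A minimal sketch of such a setup is shown below, assuming a toy single-DOF environment and the Stable-Baselines3 PPO implementation; the environment name `Aero1DofEnv`, the dynamics constants, the reward shaping, and the reduced `net_arch=[16, 16]` layer sizes are illustrative choices, not the configuration used in the study.

```python
# Hypothetical sketch: 1-DOF Aero2-style balancing environment trained with PPO.
# Dynamics constants and reward shaping are placeholders, not the authors' model.
import numpy as np
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3 import PPO

class Aero1DofEnv(gym.Env):
    """Observation: (tilt error, tilt velocity); action: one voltage command."""
    def __init__(self, target=0.3, dt=0.02):
        self.observation_space = spaces.Box(-np.inf, np.inf, shape=(2,), dtype=np.float32)
        self.action_space = spaces.Box(-1.0, 1.0, shape=(1,), dtype=np.float32)
        self.target, self.dt = target, dt

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.theta, self.theta_dot, self.t = 0.0, 0.0, 0
        return self._obs(), {}

    def _obs(self):
        return np.array([self.target - self.theta, self.theta_dot], dtype=np.float32)

    def step(self, action):
        u = float(np.clip(action[0], -1.0, 1.0))
        # toy linear pitch dynamics; the second rotor is assumed to receive -u (reversed polarity)
        theta_ddot = 2.0 * u - 0.5 * self.theta_dot - 1.0 * self.theta
        self.theta_dot += theta_ddot * self.dt
        self.theta += self.theta_dot * self.dt
        self.t += 1
        reward = -(self.target - self.theta) ** 2 - 0.01 * self.theta_dot ** 2
        return self._obs(), reward, False, self.t >= 500, {}

# Smaller hidden layers than the SB3 default of (64, 64), as suggested by the study.
model = PPO("MlpPolicy", Aero1DofEnv(), policy_kwargs=dict(net_arch=[16, 16]), verbose=0)
model.learn(total_timesteps=50_000)
```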
The success and fast pace of Machine Learning (ML) in the past decade was also enabled by modern gradient descent optimizers embedded into ML frameworks such as TensorFlow. In the context of a doctoral research project, we investigate how these optimizers can be utilized directly, outside of the scope of neural networks. This approach holds the potential of optimizing explainable models with only a few model parameters, allowing properties such as velocity, acceleration or jerk to be derived for direct physical explanation and interpretation. This is highly beneficial for use in the field of mechatronics. However, while modern gradient descent optimizers shipped with ML frameworks perform well in neural nets, results show that most optimizers have limited capabilities when applied directly to piecewise polynomial (PP) models. Domain-specific model requirements like C^k-continuity, acceleration or jerk limitation, as well as spectral or energy optimization pose the need for developing appropriate loss functions, novel algorithms and regularization techniques in order to improve optimizer performance.
In this context, we investigate piecewise polynomial models as they occur (and are required) in 1D trajectory planning tasks in mechatronics. Utilizing TensorFlow optimizers, we optimize our PP model towards multi-target loss functions suitable for fitting C^k-continuous PP functions which can be deployed in an electronic cam approximation setting. We enhance the capabilities of our PP base model by utilizing an orthogonal Chebyshev basis along with a novel regularization method improving convergence of the approximation and continuity optimization targets. We see a possible application of this approach in Deep Reinforcement Learning applied to control theory: by exchanging the black box that is a neural network with an explainable PP model, we foster the utility of Reinforcement Learning in designing cyber-physical control systems.
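To illustrate the general idea (not the project's actual implementation), the following sketch fits two Chebyshev-basis polynomial segments with a TensorFlow optimizer, using a multi-target loss that combines a data term with C^0/C^1 continuity penalties at the joint; the degree, penalty weights, and data are made-up assumptions.

```python
# Minimal sketch: gradient-descent fitting of a two-segment Chebyshev PP model with
# continuity penalties. Segment widths, degree, and weights are illustrative only.
import numpy as np
import tensorflow as tf
from numpy.polynomial import chebyshev as C

DEG = 5
# Noisy target curve split into two segments over [-1, 0] and [0, 1].
x = np.linspace(-1.0, 1.0, 400)
y = np.sin(3.0 * x) + 0.02 * np.random.randn(x.size)
left, right = x < 0.0, x >= 0.0
# Map each segment to its local coordinate t in [-1, 1] and precompute the Chebyshev basis.
t_left, t_right = 2.0 * x[left] + 1.0, 2.0 * x[right] - 1.0
B_left = tf.constant(C.chebvander(t_left, DEG), tf.float64)
B_right = tf.constant(C.chebvander(t_right, DEG), tf.float64)

def basis_and_deriv(t0):
    """Chebyshev basis values and first derivatives at a single local coordinate t0."""
    vals = C.chebvander(np.array([t0]), DEG)[0]
    ders = np.array([C.chebval(t0, C.chebder(np.eye(DEG + 1)[k])) for k in range(DEG + 1)])
    return tf.constant(vals, tf.float64), tf.constant(ders, tf.float64)

b1, db1 = basis_and_deriv(1.0)    # right end of the left segment
b2, db2 = basis_and_deriv(-1.0)   # left end of the right segment

c_left = tf.Variable(tf.zeros(DEG + 1, tf.float64))
c_right = tf.Variable(tf.zeros(DEG + 1, tf.float64))
y_left, y_right = tf.constant(y[left]), tf.constant(y[right])
opt = tf.keras.optimizers.Adam(learning_rate=0.05)

for step in range(2000):
    with tf.GradientTape() as tape:
        data_loss = (tf.reduce_mean((tf.linalg.matvec(B_left, c_left) - y_left) ** 2)
                     + tf.reduce_mean((tf.linalg.matvec(B_right, c_right) - y_right) ** 2))
        c0_pen = (tf.tensordot(b1, c_left, 1) - tf.tensordot(b2, c_right, 1)) ** 2    # value match
        c1_pen = (tf.tensordot(db1, c_left, 1) - tf.tensordot(db2, c_right, 1)) ** 2  # slope match
        loss = data_loss + 10.0 * c0_pen + 10.0 * c1_pen
    grads = tape.gradient(loss, [c_left, c_right])
    opt.apply_gradients(zip(grads, [c_left, c_right]))
```

Because both segments here have equal width, matching the local-coordinate derivatives at the joint is equivalent to matching the global ones; unequal segments would need the chain-rule scale factors.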
Safety guarantees for Gaussian processes require the assumption that the true hyperparameters are known. However, this assumption usually does not hold in practice. In this talk, a method is introduced to overcome this issue by estimating confidence intervals of the hyperparameters from their posterior distributions. Finally, it can be shown that, via appropriate scaling, safety can be robustly guaranteed with high probability.
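As a rough illustration of the idea (not the method presented in the talk), the sketch below approximates the posterior of an RBF-kernel lengthscale on a grid via the log marginal likelihood, selects a conservative lengthscale from a credible interval, and widens the predictive bound by a factor beta; the grid, the flat prior, and the value of beta are assumptions.

```python
# Hedged sketch: robustify a GP confidence bound against lengthscale misspecification.
import numpy as np

def rbf(x1, x2, ell, sigma_f=1.0):
    d = x1[:, None] - x2[None, :]
    return sigma_f**2 * np.exp(-0.5 * (d / ell) ** 2)

def log_marginal_likelihood(x, y, ell, noise=0.05):
    K = rbf(x, x, ell) + noise**2 * np.eye(len(x))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return -0.5 * y @ alpha - np.log(np.diag(L)).sum() - 0.5 * len(x) * np.log(2 * np.pi)

rng = np.random.default_rng(0)
x_train = rng.uniform(-3, 3, 25)
y_train = np.sin(x_train) + 0.05 * rng.standard_normal(25)

# Posterior over the lengthscale on a grid (flat prior), then a conservative quantile.
grid = np.linspace(0.1, 3.0, 100)
logp = np.array([log_marginal_likelihood(x_train, y_train, ell) for ell in grid])
p = np.exp(logp - logp.max()); p /= p.sum()
ell_lo = grid[np.searchsorted(np.cumsum(p), 0.025)]   # short lengthscale = cautious model

def predict(x_star, ell, noise=0.05):
    K = rbf(x_train, x_train, ell) + noise**2 * np.eye(len(x_train))
    K_s = rbf(x_train, x_star, ell)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    v = np.linalg.solve(L, K_s)
    mean = K_s.T @ alpha
    var = np.diag(rbf(x_star, x_star, ell)) - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

beta = 3.0  # wider than the nominal 2-sigma factor to absorb remaining hyperparameter uncertainty
x_star = np.linspace(-3, 3, 200)
mean, std = predict(x_star, ell_lo)
upper_safe_bound = mean + beta * std
```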
Reinforcement Learning (RL) is a rising subject of Machine Learning (ML). Especially Multi-Agent RL (MARL), where more than one agent interacts with an environment by learning to solve a task, can model many real-world problems. Unfortunately, the multi-agent case introduces additional difficulties to the already challenging field of Reinforcement Learning, such as scalability issues, non-stationarity and non-unique learning goals.
To better understand these problems, we compare Single-Agent RL with MARL in the simple board game Tic-Tac-Toe. This game is a two-player zero-sum game in which two adversarial players compete against each other by placing their marks (x or o) on a 3x3 board. If one player has three of their marks in one line (vertical, horizontal or diagonal), this player wins and ends the game. If neither of the players gets three marks in a line before all fields of the 3x3 board are filled, the game ends in a draw.
We study the learning of a Single- and a Multi-Agent system playing Tic-Tac-Toe, using a Q-learning algorithm that describes the learning of the agent in one formula. As is typical in RL, the agent interacts with an environment during learning, which is, in this case, the 3x3 board of the Tic-Tac-Toe game. The two playing agents place their marks one after another. At the end of each game, every agent receives a reward based on the game's outcome. During learning, the agent tries to maximize its reward, which leads to a well-playing strategy, namely the policy.
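A minimal sketch of the tabular Q-learning update in this setting is shown below; the board encoding, reward values, and hyperparameters are illustrative, not the study's settings (in the multi-agent case, each player would hold its own Q-table and be updated from its own rewards).

```python
# Minimal tabular Q-learning sketch for Tic-Tac-Toe; values are placeholders.
import random
from collections import defaultdict

Q = defaultdict(float)            # Q[(state, action)] -> value; state = board as a tuple of 9 cells
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1

def choose_action(state, legal_actions):
    """Epsilon-greedy action selection over the empty cells."""
    if random.random() < EPS:
        return random.choice(legal_actions)
    return max(legal_actions, key=lambda a: Q[(state, a)])

def q_update(state, action, reward, next_state, next_legal_actions, done):
    """One-step Q-learning: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    target = reward
    if not done:
        target += GAMMA * max(Q[(next_state, a)] for a in next_legal_actions)
    Q[(state, action)] += ALPHA * (target - Q[(state, action)])
```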
We show that a Single-Agent RL agent only performs as well as the opponent against whom it is trained, while the agents in the MARL scenario learn an optimal strategy against every possible opponent. Additionally, the agents in the MARL setting learn more quickly than the ones in the Single-Agent case.
We will use these results to set up a MARL setting in network communications. In this scenario, all communicating electronic devices are different agents that should communicate in a reliable way, using as many resources as possible for quick communication without disturbing the communication of the other devices.
In the tutorial, we will look at meta reinforcement learning and model-based RL techniques for the AWAKE trajectory tuning task.
https://github.com/RL4AA/rl4aa24-tutorial
Synchrotron light source storage rings aim to maintain a continuous beam current without observable beam motion during injection. One element that paves the way to this target is the non-linear kicker (NLK). The field distribution it generates poses challenges for optimising the topping-up operation.
Within this study, a reinforcement learning agent was developed and trained to optimise the NLK operation parameters. We present the models employed, the optimisation process, and the achieved results.
The Sonobot Unmanned Surface Vehicle (USV), developed by EvoLogics, is a system platform tailored for hydrographic surveying in inland waters. Despite its integrated GPS and autopilot system for autonomous mission execution, the Sonobot lacks a collision avoidance system, necessitating constant operator monitoring and significantly limiting its autonomy.
Recognizing the untapped potential of USVs for integrating advancements in autonomous vehicles, machine learning, and control theory, we propose a two-layered system: a perception layer for object detection and an algorithmic layer for collision-free path selection. The novelty of our perception layer lies in the integration of a stereo camera, LiDAR, and front-looking sonar for robust obstacle detection.
For the algorithmic layer, we engineered a simple yet powerful cost function. Our preliminary results demonstrate the ability to calculate a collision-free trajectory for the Sonobot using this cost function in conjunction with a Model Predictive Controller (MPC).
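The sketch below illustrates the kind of receding-horizon cost this describes, assuming a point-mass model, Gaussian-shaped obstacle penalties, and a generic optimizer; the weights, horizon, and dynamics are placeholders rather than the actual Sonobot controller.

```python
# Hedged MPC-style sketch: penalize goal distance, obstacle proximity, and control effort.
import numpy as np
from scipy.optimize import minimize

DT, HORIZON = 0.5, 10
GOAL = np.array([10.0, 5.0])
OBSTACLES = [np.array([4.0, 2.0]), np.array([7.0, 4.0])]   # detected obstacle positions

def rollout(x0, controls):
    """Simple point-mass model: state (x, y), control (vx, vy) per step."""
    states, x = [], np.array(x0, dtype=float)
    for u in controls.reshape(HORIZON, 2):
        x = x + DT * u
        states.append(x.copy())
    return np.array(states)

def cost(controls, x0):
    states = rollout(x0, controls)
    goal_cost = np.sum(np.linalg.norm(states - GOAL, axis=1))            # drive toward the goal
    obstacle_cost = sum(np.sum(np.exp(-np.linalg.norm(states - o, axis=1) ** 2))
                        for o in OBSTACLES)                              # penalize proximity
    effort_cost = np.sum(controls ** 2)                                  # keep commands small
    return goal_cost + 50.0 * obstacle_cost + 0.1 * effort_cost

x0 = np.zeros(2)
res = minimize(cost, np.zeros(HORIZON * 2), args=(x0,), method="L-BFGS-B",
               bounds=[(-1.0, 1.0)] * (HORIZON * 2))
first_action = res.x[:2]   # receding horizon: apply only the first control, then re-plan
```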
We invite discussion on the potential of testing the MPC against Reinforcement Learning and the possibility of combining MPC and RL to further enhance the autonomy and efficiency of USVs.
As a critical radiological facility, the International Fusion Materials Irradiation Facility - DEMO Oriented Neutron Source (IFMIF-DONES) will implement effective measures to ensure the safety of its personnel and the environment. To enable the proper implementation of these measures, the ISO 17873 standard has been adopted throughout the design process of the facility. The proposed dynamic confinement measures outlined in this standard require a thorough design of the nuclear Heating, Ventilation and Air Conditioning (HVAC) system to ensure effective containment barriers, stable pressure levels and proper treatment of effluents. However, the design and control of such a critical system presents several challenges, as numerous factors influence pressure stability within the facility.
Despite these challenges, recent advances in Deep Reinforcement Learning (DRL) algorithms have demonstrated their effectiveness in solving complex continuous control problems in a variety of domains. In this work, we evaluate the performance of DRL algorithms in controlling the nuclear HVAC system of IFMIF-DONES. For this purpose, we use a MELCOR simulation model of the particle accelerator facility as a training environment and adapt the functionalities of this simulator to enable the continuous control of the air inlet flow rates.
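A hedged sketch of how such a simulator could be exposed as a continuous-control environment is given below; the simulator interface (`reset`, `advance`), the `DummySimulator` stand-in, and the quadratic pressure-deviation reward are hypothetical placeholders, not the actual MELCOR coupling.

```python
# Illustrative only: wrapping an external plant simulator as a Gymnasium environment
# for continuous air-flow control. All plant details here are made up.
import numpy as np
import gymnasium as gym
from gymnasium import spaces

class HvacEnv(gym.Env):
    """Observation: room pressures; action: normalized air inlet flow rates."""
    def __init__(self, simulator, setpoints, n_rooms=4, dt=1.0):
        self.sim, self.setpoints, self.dt = simulator, np.asarray(setpoints), dt
        self.observation_space = spaces.Box(-np.inf, np.inf, shape=(n_rooms,), dtype=np.float32)
        self.action_space = spaces.Box(0.0, 1.0, shape=(n_rooms,), dtype=np.float32)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        return np.asarray(self.sim.reset(), dtype=np.float32), {}

    def step(self, action):
        pressures = self.sim.advance(flow_rates=action, dt=self.dt)   # hypothetical simulator call
        reward = -float(np.mean((np.asarray(pressures) - self.setpoints) ** 2))
        return np.asarray(pressures, dtype=np.float32), reward, False, False, {}

class DummySimulator:
    """Trivial stand-in plant: pressures relax toward a level set by the inlet flows."""
    def __init__(self, n_rooms=4):
        self.p = np.zeros(n_rooms)
    def reset(self):
        self.p = np.zeros_like(self.p)
        return self.p
    def advance(self, flow_rates, dt):
        self.p += dt * (200.0 * np.asarray(flow_rates) - 0.5 * self.p)
        return self.p

env = HvacEnv(DummySimulator(), setpoints=[100.0, 80.0, 60.0, 40.0])
obs, _ = env.reset()
```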
RadiaSoft is developing machine learning methods to improve the operation and control of industrial accelerators. Because industrial systems typically suffer from a lack of instrumentation and a noisier environment, advancements in control methods are critical for optimizing their performance. In particular, our recent work has focused on the development of pulse-to-pulse feedback algorithms for use in dose optimization for FLASH radiotherapy. The PHASER (pluridirectional high-energy agile scanning electronic radiotherapy) system is of particular interest due to the need to synchronize 16 different accelerators, each with its own noise characteristics. This presentation will provide an overview of the challenges associated with dose optimization for a PHASER-like system, a description of the toy model used to evaluate different control schemes, and our initial results using RL for dose delivery optimization.
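As a toy illustration of pulse-to-pulse feedback (not RadiaSoft's algorithm), the sketch below applies an integral correction to 16 channels with channel-specific noise so that each converges to a common per-pulse dose target; the gain, noise levels, and linear dose model are assumptions.

```python
# Toy pulse-to-pulse integral feedback over 16 accelerator channels; all values illustrative.
import numpy as np

N_CHANNELS, N_PULSES, TARGET_DOSE, GAIN = 16, 200, 1.0, 0.3
rng = np.random.default_rng(1)
noise_sigma = rng.uniform(0.01, 0.05, N_CHANNELS)   # each accelerator has its own noise level
settings = np.full(N_CHANNELS, 0.8)                 # initial per-channel amplitude settings

for pulse in range(N_PULSES):
    # Measured dose per channel: linear in the setting plus channel-specific jitter.
    dose = settings + noise_sigma * rng.standard_normal(N_CHANNELS)
    error = TARGET_DOSE - dose
    settings += GAIN * error                        # correction applied to the next shot
```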
Despite the spread of Reinforcement Learning (RL) applications for optimizing the performance of particle accelerators, this approach is not always the best choice. Indeed, not all problems are suitable to be solved via RL. Before diving into such techniques, a good knowledge of the problem, the available resources, and the existing solutions is recommended. An example of the complexities related to RL solutions is the automatic setup of controlled longitudinal emittance blow-up in the CERN SPS. Several critical issues, such as limited data availability and the growing dimensionality of the problem, limited the development of an operational tool based on RL. Therefore, the released software relies on generic optimizers only, even though promising results with Bayesian optimization were achieved.
Reinforcement Learning (RL) has been successfully applied to a wide range of problems. When the environment to control does not exhibit stringent real-time constraints, currently available techniques and computational infrastructures are sufficient. At particle accelerators, however, one often encounters stringent requirements on the time available for choosing an action, which in some extreme cases can fall in the microsecond range.
These challenging conditions also present some benefits. For instance, the data throughput of the real-world environment can be orders of magnitude greater compared to a simulation. This opens the possibility of online training without the issues linked to transferring a simulation-trained agent to the real world.
In this contribution, real-time constraints and how they affect RL algorithms will be introduced, followed by a description of FPGAs and heterogeneous hardware platforms. This is then used to motivate the architecture of the state-of-the-art KINGFISHER RL system. Finally, an in-depth discussion of the use-cases where this approach can be beneficial will be provided, together with basic guidelines for structuring RL problems in a more hardware-friendly way.
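A conceptual, software-only sketch of an experience-accumulator pattern is shown below: a fast actor loop gathers transitions into a buffer while a slower learner consumes batches and publishes updated parameters. The linear policy, toy update rule, and threading model are illustrative stand-ins for the FPGA/host split, not the KINGFISHER implementation.

```python
# Conceptual sketch of an experience-accumulator pattern (plain Python, not the FPGA design).
import queue
import threading
import numpy as np

experience_queue = queue.Queue(maxsize=10_000)
policy_weights = np.zeros(4)                      # placeholder linear policy parameters
weights_lock = threading.Lock()

def actor_loop(n_steps=5_000):
    """Tight interaction loop: act with the current weights, push transitions, never block on learning."""
    obs = np.random.randn(4)
    for _ in range(n_steps):
        with weights_lock:
            action = float(policy_weights @ obs)          # cheap inference, real-time friendly
        next_obs = np.random.randn(4)                     # stand-in for the accelerator response
        reward = -abs(action - next_obs[0])
        experience_queue.put((obs, action, reward, next_obs))
        obs = next_obs

def learner_loop(batch_size=64, n_updates=50):
    """Slower loop: accumulate a batch of experience, update the policy, publish new weights."""
    global policy_weights
    for _ in range(n_updates):
        batch = [experience_queue.get() for _ in range(batch_size)]
        grad = np.mean([r * o for o, a, r, o2 in batch], axis=0)   # toy update rule
        with weights_lock:
            policy_weights = policy_weights + 0.01 * grad

threading.Thread(target=actor_loop, daemon=True).start()
learner_loop()
```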
Reinforcement learning (RL), a subfield of machine learning, has gained recognition for its astonishing success in complex games; however, it has yet to show similar success in real-world scenarios. In principle, RL's ability to generalise past experience, act in real time, and remain resilient to new states makes it particularly attractive as robust decision-making support for real-world scenarios. However, such scenarios bring unique challenges that are not present in game-like domains, such as complex and contradictory reward functions and the necessity for explainability. In this presentation, we will discuss some of these challenges in the context of using RL for automotive powertrain control. We will discuss the problem setup, including the reward definition, as well as one approach to explainability: first learning a neural-network-based policy (which can learn effectively and efficiently) and then extracting a rule-based policy (which is easier to interpret and can be directly implemented in current control software). The results are benchmarked against an optimised MATLAB policy, using a Simulink simulation.
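A small sketch of the distillation step described here is given below, assuming a stand-in `nn_policy` over a two-dimensional powertrain state; a shallow decision tree is fitted to the network's actions and printed as human-readable rules. The state variables and thresholds are hypothetical.

```python
# Hedged sketch: distill a (stand-in) neural-network policy into an interpretable decision tree.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

def nn_policy(states):
    """Stand-in for a trained NN policy: maps (speed, torque demand) to a discrete mode choice."""
    return (states[:, 0] * 0.5 + states[:, 1] > 1.0).astype(int)

# Sample the operating region, label it with the NN policy, and fit an interpretable tree.
rng = np.random.default_rng(0)
states = rng.uniform(0.0, 2.0, size=(5_000, 2))
actions = nn_policy(states)
tree = DecisionTreeClassifier(max_depth=3).fit(states, actions)
print(export_text(tree, feature_names=["speed", "torque_demand"]))
```

The resulting rules can then be reviewed by engineers and, if acceptable, transcribed into existing control software.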
Free energy-based reinforcement learning (FERL) using clamped quantum Boltzmann machines (QBM) has demonstrated remarkable improvements in learning efficiency, surpassing classical Q-learning algorithms by orders of magnitude. This work extends the FERL approach to multi-dimensional optimisation problems and eliminates the restriction to discrete action-space environments, opening doors for a broader range of real-world applications. We will discuss the results obtained with quantum annealing, employing both a simulator and D-Wave quantum annealing hardware, as well as a comparison to classical RL methods. We will cover how the algorithms are evaluated for control problems at CERN, such as the AWAKE electron beam line, and for classical RL benchmarks of varying degrees of complexity.