3rd collaboration workshop on Reinforcement Learning for Autonomous Accelerators (RL4AA'25)

Name: 3rd collaboration workshop on Reinforcement Learning for Autonomous Accelerators (RL4AA'25)
Start: 2025-04-02T08:30:00+02:00
End: 2025-04-04T16:30:00+02:00
Location: DESY

Apr 2 – 4, 2025

DESY

Europe/Berlin timezone

Contact

rl4aa@desy.de

Toward a Structured Reinforcement Learning Pipeline for Real-World Systems

Not scheduled

20m

CSSB Building 15 - Lecture Hall (DESY)

Notkestraße 85, 22607 Hamburg, Germany

Poster (Poster session)

Georg Schäfer

Recent advances in reinforcement learning (RL) have shown great potential for managing complex systems in robotics, manufacturing, and beyond. However, translating RL successes from controlled experiments to real-world scenarios remains a significant challenge because there is no standardized engineering pipeline that prioritizes thorough problem formulation. While data science and control engineering benefit from robust workflows, such as CRISP-DM and standard control-design processes, current RL practice often focuses narrowly on hyperparameter tuning and overlooks foundational tasks like environment design and reward specification.

In this work, we introduce an RL engineering pipeline that bridges existing gaps by integrating several key components. First, our approach begins with systematic problem identification and the formalization of the Markov Decision Process (MDP): defining states, actions, and rewards in alignment with physical constraints. Next, we carefully set up the environment, optimization objectives, and training procedures so that they respect the inherent nature of the optimization problem. These steps are followed by iterative agent training, hyperparameter optimization, evaluation, and eventual deployment. By adapting best practices from both CRISP-DM and classical control, our methodology enhances reproducibility and efficiency in RL development.
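As an illustration of the first pipeline step, the following minimal Python sketch shows one way an MDP could be written down explicitly before any training starts. It is not the authors' implementation: the class names `Interval` and `MDPSpec`, the choice of state components, the reward, and all numeric limits are hypothetical and only serve to show how states, actions, and rewards can be tied to physical constraints.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed interval used to encode a physical limit (hypothetical helper)."""
    low: float
    high: float

    def __post_init__(self) -> None:
        if self.low >= self.high:
            raise ValueError("interval must satisfy low < high")

    def clip(self, value: float) -> float:
        """Project a value onto the interval."""
        return max(self.low, min(self.high, value))


@dataclass(frozen=True)
class MDPSpec:
    """Explicit problem formulation: states, actions, reward, horizon."""
    angle: Interval       # state component (rad), assumed limit
    velocity: Interval    # state component (rad/s), assumed limit
    action: Interval      # actuator command, assumed limit
    target_angle: float   # set point the reward refers to
    horizon: int          # number of steps per episode

    def __post_init__(self) -> None:
        # Reject formulations that contradict the physical constraints.
        if not (self.angle.low <= self.target_angle <= self.angle.high):
            raise ValueError("target must lie within the angle limits")
        if self.horizon <= 0:
            raise ValueError("horizon must be positive")

    def reward(self, angle: float) -> float:
        """Negative absolute tracking error; zero is the best value."""
        return -abs(angle - self.target_angle)


# Example values are placeholders, not measurements from a real testbed.
spec = MDPSpec(
    angle=Interval(-0.7, 0.7),
    velocity=Interval(-2.0, 2.0),
    action=Interval(-1.0, 1.0),
    target_angle=0.2,
    horizon=200,
)
```

Writing the formulation as a validated object means that an inconsistent problem statement, such as a target outside the mechanical range, fails at construction time rather than during training.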

We validate our framework through a case study on a 1-degree-of-freedom (1-DoF) helicopter testbed. Our experiments indicate that targeted modifications, such as normalizing observations, randomizing initial conditions, extending episode horizons, and incorporating action penalties, yield measurable improvements in sample efficiency and training stability in both simulation and hardware. Moreover, our results suggest that this pipeline can be generalized to more complex systems, paving the way for more robust real-world RL applications.
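The four modifications named above can be sketched in a small, self-contained Python environment. This is a toy model under stated assumptions, not the helicopter testbed or the authors' code: the class `OneDofEnv`, its pendulum-like dynamics, and every constant (limits, sample time, gains, penalty weight) are hypothetical and chosen only to make the example runnable.

```python
import math
import random


class OneDofEnv:
    """Toy 1-DoF plant with pendulum-like dynamics (assumed, not the real testbed)."""

    MAX_ANGLE = 1.0     # rad, hypothetical mechanical stop
    MAX_VELOCITY = 5.0  # rad/s, hypothetical bound used for scaling
    DT = 0.02           # s, hypothetical sample time

    def __init__(self, horizon=500, action_penalty=0.01,
                 randomize_start=True, seed=None):
        self.horizon = horizon                  # longer horizon = more steps per episode
        self.action_penalty = action_penalty    # weight of the action penalty
        self.randomize_start = randomize_start  # randomized initial conditions
        self.rng = random.Random(seed)
        self.angle = 0.0
        self.velocity = 0.0
        self.steps = 0

    @staticmethod
    def _clip(value, limit):
        return max(-limit, min(limit, value))

    def _observe(self):
        # Normalized observation: each component is scaled into [-1, 1].
        return (self.angle / self.MAX_ANGLE, self.velocity / self.MAX_VELOCITY)

    def reset(self):
        if self.randomize_start:
            # Start anywhere in the inner half of the angle range.
            self.angle = self.rng.uniform(-0.5, 0.5) * self.MAX_ANGLE
        else:
            self.angle = 0.0
        self.velocity = 0.0
        self.steps = 0
        return self._observe()

    def step(self, action, target=0.0):
        action = self._clip(action, 1.0)
        # Assumed dynamics: actuator torque, gravity-like term, viscous damping.
        accel = 10.0 * action - 2.0 * math.sin(self.angle) - 0.5 * self.velocity
        self.velocity = self._clip(self.velocity + accel * self.DT, self.MAX_VELOCITY)
        self.angle = self._clip(self.angle + self.velocity * self.DT, self.MAX_ANGLE)
        self.steps += 1
        # Tracking error plus a quadratic penalty on the action magnitude.
        reward = -abs(self.angle - target) - self.action_penalty * action ** 2
        done = self.steps >= self.horizon
        return self._observe(), reward, done


# Roll out one episode with a constant zero action.
env = OneDofEnv(horizon=500, action_penalty=0.01, seed=0)
obs = env.reset()
done = False
total_reward = 0.0
while not done:
    obs, reward, done = env.step(0.0)
    total_reward += reward
```

Each modification maps to one constructor argument or method, so its effect on training can be switched on and off in isolation, which is the kind of controlled comparison a structured pipeline calls for.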

Overall, our framework not only clarifies design decisions in RL but also offers a promising pathway to overcome long-standing challenges in deploying RL solutions outside of controlled experimental settings.

Primary authors: Georg Schäfer; Dr Jakob Rehrl (Salzburg University of Applied Sciences); Simon Hirlaender (PLUS University Salzburg); Dr Stefan Huber (Salzburg University of Applied Sciences)

