Do you really think about consequences? Bridging Classical Control and Reinforcement Learning for Delayed Outcome Optimisation

Not scheduled
20m
CSSB Building 15 - Lecture Hall (DESY)

Notkestraße 85, 22607 Hamburg, Germany

Speaker

Olga Mironova (PLUS University Salzburg)

Description

This study explores advanced strategies for optimal control in systems with delayed consequences, using beam steering in the AWAKE electron line at CERN as a benchmark. We formulate the task as a constrained optimization problem within a continuous, primarily linear Markov Decision Process (MDP), incorporating measured system parameters and realistic termination criteria. A wide range of approaches is implemented and compared, including classical response matrix inversion, control-theoretic methods, reinforcement learning, and structured model-based techniques.
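As a rough illustration of the formulation described above, the following minimal sketch steps a small linear MDP with bounded actions and termination criteria. The 3x3 response matrix, the action bound, the thresholds, and the names `RESPONSE`, `step`, and `rms` are all hypothetical and chosen for readability; they are not the measured AWAKE parameters or the authors' implementation. The lower-triangular matrix is one assumed way to express that a corrector only influences monitors downstream of it.

```python
import math

# Hypothetical toy line: 3 monitors, 3 correctors (illustrative values only).
# Lower-triangular: each corrector affects only monitors downstream of it.
RESPONSE = [
    [1.0, 0.0, 0.0],
    [0.5, 1.0, 0.0],
    [0.2, 0.5, 1.0],
]
ACTION_LIMIT = 1.0   # assumed bound on each corrector change (the constraint)
SUCCESS_RMS = 0.05   # assumed success threshold on the RMS trajectory error
FAILURE_ABS = 5.0    # assumed safety limit on any single monitor reading
MAX_STEPS = 20       # assumed cap on episode length


def rms(values):
    """Root-mean-square of a list of monitor readings."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def step(state, action, t):
    """Advance the linear dynamics s' = s + R a and evaluate termination."""
    # Enforce the action constraint by clipping each component.
    clipped = [max(-ACTION_LIMIT, min(ACTION_LIMIT, a)) for a in action]
    next_state = [
        s + sum(r * a for r, a in zip(row, clipped))
        for s, row in zip(state, RESPONSE)
    ]
    reward = -rms(next_state)  # smaller trajectory error means higher reward
    success = rms(next_state) < SUCCESS_RMS
    failure = max(abs(v) for v in next_state) > FAILURE_ABS
    done = success or failure or t + 1 >= MAX_STEPS
    return next_state, reward, done
```

Calling `step([0.0, 0.0, 0.0], [10, 0, 0], 0)` in this sketch clips the first action to the assumed limit before the dynamics are applied, so the episode continues rather than terminating.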

Classical methods such as response matrix inversion converge accurately but do not account for delayed effects and are sensitive to noise. Control-theoretic approaches, such as Model Predictive Control (MPC), leverage known dynamics and handle delays effectively when models are available. Data-driven methods, including Proximal Policy Optimization (PPO), adapt to uncertainty and non-linearities but require large amounts of data. Structured GP-MPC bridges both paradigms by learning system dynamics using Gaussian Processes while respecting the problem’s causal structure, significantly improving robustness and sample efficiency.
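To make the noise sensitivity of matrix inversion concrete, here is a minimal sketch under stated assumptions. It assumes a hypothetical lower-triangular response matrix, so the inverse can be applied by forward substitution; a general response matrix would typically need a pseudo-inverse instead. The names `invert_lower_triangular`, `correction`, and `apply_correction`, along with all numeric values, are illustrative and not taken from the study.

```python
# Hypothetical response matrix (illustrative values, not measured data).
RESPONSE = [
    [1.0, 0.0, 0.0],
    [0.5, 1.0, 0.0],
    [0.2, 0.5, 1.0],
]


def invert_lower_triangular(response, target):
    """Solve response @ action = target by forward substitution."""
    action = []
    for i, row in enumerate(response):
        known = sum(row[j] * action[j] for j in range(i))
        action.append((target[i] - known) / row[i])
    return action


def correction(response, measured, gain=1.0):
    """Corrector change that cancels the measured offsets, scaled by gain."""
    return invert_lower_triangular(response, [-gain * m for m in measured])


def apply_correction(response, state, action):
    """Linear update s' = s + R a of the true (noise-free) trajectory."""
    return [
        s + sum(r * a for r, a in zip(row, action))
        for s, row in zip(state, response)
    ]


true_state = [0.5, -0.2, 0.1]
noise = [0.03, -0.01, 0.02]  # assumed measurement error on each monitor

# Noise-free measurement: one inversion step removes the offsets.
clean = apply_correction(RESPONSE, true_state, correction(RESPONSE, true_state))

# Noisy measurement: the measurement error is written into the trajectory.
measured = [s + n for s, n in zip(true_state, noise)]
noisy = apply_correction(RESPONSE, true_state, correction(RESPONSE, measured))
```

In this toy setting `clean` is zero up to rounding, while `noisy` equals the negated measurement noise: the inversion trusts every reading fully, which is one way to see why a gain below 1 or a model-based method can be preferable when readings are noisy.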

Our experiments highlight key performance differences, particularly in how each method handles delayed outcomes, noise, and structural assumptions. We find that exploiting the causal structure of the problem provides a notable advantage, and that method choice ultimately involves trade-offs between adaptability, data efficiency, and computational cost. These findings offer guidance for applying advanced control strategies in high-dimensional, partially structured environments.

Primary authors

Olga Mironova (PLUS University Salzburg), Simon Hirlaender (PLUS University Salzburg)

Co-authors

Lorenz Fischl (PLUS University Salzburg), Thomas Gallien (JOANNEUM RESEARCH)
