Speakers
Description
In typical reinforcement learning applications for accelerators, the system dynamics often vary,
which degrades the performance of trained agents. In some scenarios the degradation is severe
enough to require retraining. Meta-reinforcement learning, combined with an appropriate
simulation, can instead enable an agent to adapt rapidly to such changes. We illustrate this
by meta-training an agent in a simulated environment that replicates the electron line of
CERN’s AWAKE experiment. The task is to steer the electron beam onto a target trajectory.
During the simulation, the quadrupoles of the segment are varied randomly, and action masking is
used to mimic magnetic control faults. Our findings show that the agent adapts to a specific
system configuration within a small number of steps. The methodology could be applied to any
Partially Observable Markov Decision Process (POMDP) characterised by slowly evolving
hidden parameters.
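
The two simulation ingredients named above, randomly varied optics and action masking, can be sketched with a toy environment. This is only an illustrative sketch under loud assumptions, not the AWAKE simulation or the authors' code: the class `ToySteeringTask`, the linear response matrix, the multiplicative jitter standing in for quadrupole changes, and the `spread` and `fault_prob` parameters are all hypothetical.

```python
import numpy as np


class ToySteeringTask:
    """Hypothetical linear stand-in for a trajectory-steering task."""

    def __init__(self, n_correctors=4, n_monitors=4, seed=0):
        self.rng = np.random.default_rng(seed)
        self.n_correctors = n_correctors
        self.n_monitors = n_monitors
        # Assumed nominal response: a corrector only moves downstream monitors.
        self.nominal = np.tril(np.ones((n_monitors, n_correctors)))
        self.sample_task()

    def sample_task(self, spread=0.3, fault_prob=0.2):
        """Draw new hidden parameters: a perturbed response and a fault mask."""
        # Assumption: random scaling of the response mimics varied quadrupoles.
        jitter = self.rng.uniform(1.0 - spread, 1.0 + spread,
                                  size=self.nominal.shape)
        self.response = self.nominal * jitter
        # Mask entry 0.0 means the corrector ignores commands (a control fault).
        self.mask = (self.rng.random(self.n_correctors) > fault_prob).astype(float)
        if self.mask.sum() == 0:
            # Keep at least one working corrector so the task stays solvable.
            self.mask[self.rng.integers(self.n_correctors)] = 1.0
        return self.reset()

    def reset(self):
        """Start an episode with zeroed correctors and a random trajectory offset."""
        self.settings = np.zeros(self.n_correctors)
        self.offset = self.rng.normal(0.0, 1.0, size=self.n_monitors)
        return self._observe()

    def _observe(self):
        # The agent sees monitor readings only; response and mask stay hidden.
        return self.offset + self.response @ self.settings

    def step(self, action):
        """Apply a masked, clipped corrector change and return (obs, reward)."""
        action = np.clip(np.asarray(action, dtype=float), -1.0, 1.0)
        self.settings += self.mask * action
        obs = self._observe()
        # Reward is the negative RMS deviation from the (zero) target trajectory.
        reward = -float(np.sqrt(np.mean(obs ** 2)))
        return obs, reward
```

In a meta-training loop, `sample_task()` would be called once per task so that the agent meets a different response matrix and fault mask each time. Because neither is part of the observation, the agent has to infer them from how the readings react to its actions, which is what makes the toy problem partially observable.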
Possible contributed talk | No |
---|---|
Are you a student? | No |