KIT - Karlsruher Institut für Technologie
Reinforcement Learning

Our group works on reinforcement learning (RL) algorithms, where the goal is to optimize the parameters of a policy purely based on environment interaction. This learning paradigm is especially interesting for robotics because the robot can potentially learn to solve problems autonomously. Our group focuses on several aspects of reinforcement learning.

One key question is how to represent the policy in RL so that it can acquire skills that solve complex tasks. Advances in supervised learning show that we need reinforcement learning algorithms that can train complex policy representations. These representations include diffusion- and flow-based policies, which have proven very successful in supervised learning and are often part of large models such as Vision-Language-Action models (VLAs), but training them with reinforcement learning is not straightforward.

Training these policies also requires improved exploration behavior in order to generate high-quality data points during training. One promising option is to generate time-correlated actions during the exploration phase using action chunking or motion primitives. The underlying policies have a drastically larger action space, which generally makes the optimization more complex and requires specific RL algorithms for efficient training.
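As a rough illustration of time-correlated exploration, the following minimal sketch contrasts independent per-step action noise with a chunk of actions that is sampled once and then executed over several steps. It is not taken from the group's code: all names (`sample_step_actions`, `sample_chunk`, `roughness`, `ACTION_DIM`, `HORIZON`) are hypothetical, and the straight-line interpolation is only a stand-in for a real motion primitive.

```python
import random

ACTION_DIM = 2   # assumed action dimensionality (illustrative)
HORIZON = 5      # assumed chunk length H (illustrative)


def sample_step_actions(rng, steps):
    """Step-based exploration: independent Gaussian noise at every time step."""
    return [[rng.gauss(0.0, 1.0) for _ in range(ACTION_DIM)] for _ in range(steps)]


def sample_chunk(rng, horizon=HORIZON):
    """Chunked exploration: sample two end points once, then interpolate.

    The linear interpolation is a stand-in for a motion primitive. Noise
    enters only through the two sampled end points, so the actions inside
    a chunk are correlated over time.
    """
    start = [rng.gauss(0.0, 1.0) for _ in range(ACTION_DIM)]
    goal = [rng.gauss(0.0, 1.0) for _ in range(ACTION_DIM)]
    chunk = []
    for t in range(horizon):
        w = t / (horizon - 1)
        chunk.append([(1 - w) * s + w * g for s, g in zip(start, goal)])
    return chunk


def roughness(actions):
    """Sum of squared differences between consecutive actions."""
    return sum(
        (b - a) ** 2
        for prev, nxt in zip(actions, actions[1:])
        for a, b in zip(prev, nxt)
    )


rng = random.Random(0)
step_actions = sample_step_actions(rng, HORIZON)
chunk_actions = sample_chunk(rng)

# A policy that emits whole chunks outputs HORIZON * ACTION_DIM numbers per decision.
chunk_output_dim = HORIZON * ACTION_DIM
```

In this sketch, each decision of the chunked variant has `HORIZON * ACTION_DIM` outputs instead of `ACTION_DIM`, which is the enlarged action space mentioned above. Comparing `roughness(step_actions)` with `roughness(chunk_actions)` over several seeds gives a feel for how much smoother the time-correlated samples tend to be.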

A very promising approach towards generalist policies is fine-tuning existing foundation-model-based policy representations. Vision-Language-Action (VLA) policies have the potential to solve tasks where the instruction is text-based, easing communication with humans. We believe that RL is a key approach in this field: it enables robots to adapt their generalist policies so that they can learn to solve tasks that are not part of their supervised training data set.

Key Areas

  • Complex policy representations for reinforcement learning
  • Time-correlated action selection using action chunking and motion primitives
  • Fine-tuning of foundation models using reinforcement learning

Members

Onur Celik

Reinforcement Learning

Tai Hoang

Reinforcement Learning, Physical Modelling with Graphs

Max Nagy

FZI

Reinforcement Learning, Real World Robotics, Heavy Machines

Emiliyan Gospodinov

Vision-Language-Action Models, Reinforcement Learning

Weiran Liao

Reinforcement Learning, Imitation Learning

Serge Thilges

Reinforcement Learning, Diffusion Models, Dexterous Hands

Andreas Boltres

SAP

Computer Networking, Geometric Deep Learning, Multi-Agent Reinforcement Learning

Huy Le

Bosch

Robot Manipulation, Diffusion Policies, and Massively Parallel Reinforcement Learning

Publications

2025

Scaffolding Dexterous Manipulation with Vision-Language Models

Vincent de Bakker, Joey Hejna, Tyler Ga Wei Lum, Onur Celik, Aleksandar Taranovic, Denis Blessing, Gerhard Neumann, Jeannette Bohg, Dorsa Sadigh
Preprint · 2025
TROLL: Trust Regions improve Reinforcement Learning for Large Language Models

Philipp Becker, Niklas Freymuth, Serge Thilges, Fabian Otto, Gerhard Neumann
Preprint · 2025
DIME: Diffusion-Based Maximum Entropy Reinforcement Learning

Onur Celik, Zechu Li, Denis Blessing, Ge Li, Daniel Palenicek, Jan Peters, Georgia Chalvatzaki, Gerhard Neumann
ICML · 2025
Geometry-aware RL for Manipulation of Varying Shapes and Deformable Objects

Tai Hoang, Huy Le, Philipp Becker, Ngo Anh Vien, Gerhard Neumann
ICLR · 2025 Oral
Enhancing Exploration With Diffusion Policies in Hybrid Off-Policy RL: Application to Non-Prehensile Manipulation

Huy Le, Tai Hoang, Miroslav Gabriel, Gerhard Neumann, Ngo Anh Vien
IEEE Robotics and Automation Letters · 2025
Vlearn: Off-Policy Learning with Efficient State-Value Function Estimation

Fabian Otto, Philipp Becker, Ngo Anh Vien, Gerhard Neumann
ICLR · 2025
Chunking the Critic: A Transformer-based Soft Actor-Critic with N-Step Returns

Dong Tian, Ge Li, Hongyi Zhou, Onur Celik, Gerhard Neumann
Preprint · 2025

2024

Beyond Shortest-Paths: A Benchmark for Reinforcement Learning on Traffic Engineering

Andreas Boltres, Niklas Freymuth, Patrick Jahnke, Gerhard Neumann
2024
Learning Sub-Second Routing Optimization in Computer Networks requires Packet-Level Dynamics

Andreas Boltres, Niklas Freymuth, Patrick Jahnke, Holger Karl, Gerhard Neumann
Preprint · 2024
Acquiring Diverse Skills using Curriculum Reinforcement Learning with Mixture of Experts

Onur Celik, Aleksandar Taranovic, Gerhard Neumann
ICML · 2024
PointPatchRL - Masked Reconstruction Improves Reinforcement Learning on Point Clouds

Balázs Gyenes, Nikolai Franke, Philipp Becker, Gerhard Neumann
CoRL · 2024
Robust Black-Box Optimization for Stochastic Search and Episodic Reinforcement Learning

Maximilian Hüttenrauch, Gerhard Neumann
JMLR · 2024
Open the Black Box: Step-based Policy Updates for Temporally-Correlated Episodic Reinforcement Learning

Ge Li, Hongyi Zhou, Dominik Roth, Serge Thilges, Fabian Otto, Rudolf Lioutikov, Gerhard Neumann
ICLR · 2024
TOP-ERL: Transformer-based Off-Policy Episodic Reinforcement Learning

Ge Li, Dong Tian, Hongyi Zhou, Xinkai Jiang, Rudolf Lioutikov, Gerhard Neumann
Preprint · 2024
Efficient Off-Policy Learning for High-Dimensional Action Spaces

Fabian Otto, Philipp Becker, Ngo Anh Vien, Gerhard Neumann
Preprint · 2024

2023

Reinforcement Learning from Multiple Sensors via Joint Representations

Philipp Becker, Sebastian Markgraf, Fabian Otto, Gerhard Neumann
Reinforcement Learning Conference (RLC) · 2023
Swarm Reinforcement Learning for Adaptive Mesh Refinement

Niklas Freymuth, Philipp Dahlinger, Tobias Daniel Würth, Simon Reisch, Luise Kärger, Gerhard Neumann
NeurIPS · 2023
MP3: Movement Primitive-Based (Re-) Planning Policy

Fabian Otto, Hongyi Zhou, Onur Celik, Ge Li, Rudolf Lioutikov, Gerhard Neumann
Preprint · 2023

2022

On Uncertainty in Deep State Space Models for Model-Based Reinforcement Learning

Philipp Becker, Gerhard Neumann
TMLR · 2022
Specializing Versatile Skill Libraries using Local Mixture of Experts

Onur Celik, Dongzhuoran Zhou, Ge Li, Philipp Becker, Gerhard Neumann
CoRL · 2022
Deep Black-Box Reinforcement Learning with Movement Primitives

Fabian Otto, Onur Celik, Hongyi Zhou, Hanna Ziesche, Ngo Anh Vien, Gerhard Neumann
CoRL · 2022
Push-to-See: Learning Non-Prehensile Manipulation to Enhance Instance Segmentation via Deep Q-Learning

Baris Serhan, Harit Pandya, Ayse Kucukyilmaz, Gerhard Neumann
ICRA · 2022

2021

Differentiable Trust Region Layers for Deep Reinforcement Learning

Fabian Otto, Philipp Becker, Ngo Anh Vien, Hanna Carolin Ziesche, Gerhard Neumann
ICLR · 2021
Residual Feedback Learning for Contact-Rich Manipulation Tasks with Uncertainty

Alireza Ranjbar, Ngo Anh Vien, Hanna Ziesche, Joschka Boedecker, Gerhard Neumann
Preprint · 2021