PRL @ ICAPS 2025
ICAPS’25
Melbourne, Victoria, Australia - Melbourne Connect - Room 1
Date: November 10, 2025
prl.theworkshop@gmail.com
Aim and Scope
While AI Planning and Reinforcement Learning communities focus on similar sequential decision-making problems, these communities remain somewhat unaware of each other on specific problems, techniques, methodologies, and evaluations.
This workshop aims to encourage discussion and collaboration between researchers in the fields of AI planning and (reinforcement) learning. We aim to bridge the gap between the two communities, facilitate the discussion of differences and similarities in existing techniques, and encourage collaboration across the fields. We solicit interest from AI researchers that work in the intersection of planning and (reinforcement) learning, in particular, those that focus on intelligent decision-making. This is the ninth edition of the PRL workshop series that started at ICAPS 2020.
PRL aims to coordinate with the workshops Reliability In Planning and Learning (RIPL) and Language Models for Planning (LM4Plan), with PRL covering the general intersection of learning and planning, RIPL covering the reliability-related aspects and LM4Plan covering the language model-related aspects of these areas. Joint sessions across workshops are a possibility that we will evaluate depending on submissions and workshop timing.
Topics of Interest
We invite submissions at the intersection of AI Planning and (reinforcement) Learning. The topics of interest include, but are not limited to, the following:
- Reinforcement learning (model-based, Bayesian, deep, hierarchical, etc.)
- Learning for planning (L4P)
- Generalized planning
- Monte Carlo planning
- Model representation
- Model learning
- Planning using approximated/uncertain (learned) models
- Learning search heuristics for planner guidance
- Theoretical aspects of planning and reinforcement learning
- Datasets and benchmarks across planning and RL
- Action policy analysis or certification
- Reinforcement learning and planning competition(s)
- Multi-agent planning and learning
- Applications of both (reinforcement) learning and planning
Important Dates
- Paper submission deadline: ~~August 1~~ August 8, AOE
- Paper acceptance notification: August 31, AOE
ICAPS will be in-person this year. Authors of accepted workshop papers are expected to physically attend the conference and present in person.
Schedule
| Time (Melbourne) | Title |
|---|---|
| 08:30 | Opening Remarks |
| 08:30 - 09:30 | Keynote Marcus Hutter |
| 09:30 - 10:00 | Session I |
| | Srinivas Nedunuri - Synthesis of Shields for Safe Reinforcement Learning in Discrete and Continuous State and Action Spaces |
| | Sukai Huang, Shu-Wei Liu, Nir Lipovetzky, Trevor Cohn - The Dark Side of Rich Rewards: Understanding and Mitigating Noise in VLM Rewards |
| 10:00 - 10:30 | Coffee Break |
| 10:30 - 11:15 | Session II |
| | Dillon Ze Chen, Till Hofmann, Toryn Q. Klassen, Sheila A. McIlraith - MOOSE: Satisficing and Optimal Generalised Planning via Goal Regression |
| | Simon Ståhlberg, Hector Geffner - First-Order Representation Languages for Goal-Conditioned RL |
| | Nicola J. Müller, Moritz Oster, Timo P. Gros - Learning Per-Domain Generalizing Policies Using Offline Reinforcement Learning |
| 11:15 - 12:00 | Session III |
| | Forest Agostinelli, Shahaf S. Shperberg - Learning to Learn from Search |
| | Gal Hadar, Forest Agostinelli, Shahaf S. Shperberg - Beyond Single-Step Updates: Reinforcement Learning of Heuristics with Limited-Horizon Search |
| | Rojina Panta, Vedant Khandelwal, Celeste Veronese, Amit Sheth, Daniele Meli, Forest Agostinelli - Inductive Logic Programming for Heuristic Search |
| 12:00 - 13:30 | Lunch Break |
| 13:30 - 14:30 | Keynote Charles Gretton: How I Learned to Stop Worrying and Trust the ML |
| 14:30 - 15:00 | Session IV |
| | Jonas Ehrhardt, Johannes Schmidt, René Heesch, Oliver Niggemann - Using Gradient-based Optimization for Planning with Deep Q-Networks in Parametrized Action Spaces |
| | Viraj Parimi, Brian C. Williams - Risk-Bounded Multi-Agent Visual Navigation via Dynamic Budget Allocation |
| 15:00 - 15:30 | Coffee Break |
| 15:30 - 16:30 | Poster Session |
| 16:30 - 17:00 | Session V |
| | Daniel Höller - Learning Heuristic Functions for HTN Planning |
| | Ian Turner, Peng Fu, Forest Agostinelli - Quantum Circuit Synthesis with Deep Reinforcement Learning and Heuristic Search |
| 17:00 | Closing Remarks |
Program
Keynotes
Marcus Hutter - Title tba
Marcus Hutter
Senior Researcher, DeepMind and Professor, Australian National University
Biography
Marcus Hutter is Senior Researcher at DeepMind and Professor in the RSCS at the Australian National University. He received his PhD and BSc in physics from the LMU in Munich and a Habilitation, MSc, and BSc in informatics from the TU Munich. Since 2000, his research at IDSIA, ANU, and DeepMind has centered around the information-theoretic foundations of inductive reasoning and reinforcement learning, which has resulted in 200+ publications and several awards. His books on Universal Artificial Intelligence develop the first sound and complete theory of super-intelligent machines (ASI). He also runs the Human Knowledge Compression Contest (500’000€ H-prize).
Abstract
tba
Charles Gretton - How I Learned to Stop Worrying and Trust the ML
Charles Gretton
Associate Professor, Australian National University
Biography
Charles Gretton is an Associate Professor in the School of Computing at the Australian National University (ANU) and the Director of Attention and Innovation for the Integrated AI Network. A researcher in Artificial Intelligence for over 20 years, he has contributed algorithms and analysis in the fields of automated planning, automated reasoning, and machine learning. His research has been applied to a wide range of fields, including astronomy, mobile robotics, and large-scale business optimisation. In 2015, he co-founded the AI company HIVERY, at which he developed retail business optimisation solutions used in the USA, Japan, and Australia. He currently leads a research partnership with the Australian Institute of Health and Welfare, cybersecurity-related projects with Australian Commonwealth Government organisations, and is the Chief Scientific Investigator on a research agreement between ANU and the International Atomic Energy Agency. Released software tools associated with his recent research include: the HPC SAT/#SAT tool Dagster, the related property directed reachability system parallel-pdr, and the state estimation tool for terrestrial instrumentation.
Abstract
This talk presents a journey through research at the nexus of reinforcement learning and artificial intelligence planning. It explores the enduring challenge of creating general policies for sequential decision-making, particularly in problems with non-Markovian characteristics. We begin with foundational work in relational reinforcement learning that directly sought to bridge the gap between symbolic AI and statistical learning. This early research demonstrates how generalised policies can be learned over symbolic state representations, using techniques such as gradient-based policy optimisation and first-order logical regression to automate feature discovery.
The talk then pivots to contemporary challenges, where deep learning has become indispensable for reasoning under uncertainty in complex applications. This shift is illustrated through a case study in state estimation for terrestrial astronomy, where a deep neural network is trained to infer a latent, unobserved state of the world from sensor data. This technique has been employed as a state representation within a reinforcement learning agent. We then address the critical issue of model trustworthiness. We will present recent work on the formal verification of learned models, specifically using model checking techniques to prove properties of Physics-Informed Neural Networks (PINNs) that represent state with complex physical dynamics. There are compelling applications of planning to the operation of complex systems, and the speaker argues that the best approach is to build a full AI stack, starting with robust and performant state estimation.
Talks
Select accepted papers are given a slot in the program: 11 minutes for content + 4 minutes for questions.
Poster Sessions
The program includes a poster session. All accepted papers must present a poster in the poster session.
List of Accepted Papers
- [talk+poster] First-Order Representation Languages for Goal-Conditioned RL, Simon Ståhlberg, Hector Geffner
- [talk+poster] Inductive Logic Programming for Heuristic Search, Rojina Panta, Vedant Khandelwal, Celeste Veronese, Amit Sheth, Daniele Meli, Forest Agostinelli
- [talk+poster] The Dark Side of Rich Rewards: Understanding and Mitigating Noise in VLM Rewards, Sukai Huang, Shu-Wei Liu, Nir Lipovetzky, Trevor Cohn
- [poster] Deep Reinforcement Learning for Rapid Spacecraft Science Operations Scheduling to Maximize Science Return, Alex M. Zhang, Lara Waldrop
- [talk+poster] Learning Heuristic Functions for HTN Planning, Daniel Höller
- [talk+poster] Risk-Bounded Multi-Agent Visual Navigation via Dynamic Budget Allocation, Viraj Parimi, Brian C. Williams
- [talk+poster] Beyond Single-Step Updates: Reinforcement Learning of Heuristics with Limited-Horizon Search, Gal Hadar, Forest Agostinelli, Shahaf S. Shperberg
- [talk+poster] MOOSE: Satisficing and Optimal Generalised Planning via Goal Regression, Dillon Ze Chen, Till Hofmann, Toryn Q. Klassen, Sheila A. McIlraith
- [talk+poster] Quantum Circuit Synthesis with Deep Reinforcement Learning and Heuristic Search, Ian Turner, Peng Fu, Forest Agostinelli
- [talk+poster] Learning Per-Domain Generalizing Policies Using Offline Reinforcement Learning, Nicola J. Müller, Moritz Oster, Timo P. Gros
- [poster] Multi-Agent Deep Reinforcement Learning for UAV Flocking and Collision Avoidance in Challenging Environments, Mohammad Reza Rezaee, Nor Asilah Wati Abdul Hamid
- [talk+poster] Using Gradient-based Optimization for Planning with Deep Q-Networks in Parametrized Action Spaces, Jonas Ehrhardt, Johannes Schmidt, René Heesch, Oliver Niggemann
- [talk+poster] Learning to Learn from Search, Forest Agostinelli, Shahaf S. Shperberg
- [talk+poster] Synthesis of Shields for Safe Reinforcement Learning in Discrete and Continuous State and Action Spaces, Srinivas Nedunuri
Submission Details
We solicit workshop paper submissions relevant to the above call of the following types:
- Long papers – up to 8 pages + unlimited references / appendices
- Short papers – up to 4 pages + unlimited references / appendices
- Extended abstracts – up to 2 pages + unlimited references / appendices
Please format submissions in ICAPS style (see instructions in the Author Kit). If you are submitting a paper that was rejected from another conference, please do your utmost to address the reviewers’ comments. Please do not submit papers that are already accepted for the main ICAPS conference to the workshop. As this workshop is non-archival, you may submit papers already accepted at other conferences if they fit the workshop’s scope.
Some accepted long papers will be invited for contributed talks and potentially also a slot in the poster presentation session. All other accepted papers (long and short) and accepted extended abstracts will be given a slot in the poster presentation session. Extended abstracts are intended as brief summaries of already published papers, preliminary work, position papers, or challenges that might help bridge the gap.
As the main purpose of this workshop is to solicit discussion, the authors are invited to use the appendix of their submissions for that purpose.
Paper submissions should be made through OpenReview.
We do not insist on papers being submitted anonymously initially; this decision is left to the discretion of the author. If a paper is simultaneously being considered at a venue where anonymity is required, you have the option to submit it without author details, considering the possibility of a shared reviewer pool. However, please be aware that upon acceptance, the paper will be publicly posted on the PRL website with full author information.
Organizing Committee
- Zlatan Ajanović, RWTH Aachen University, Aachen, Germany.
- Forest Agostinelli, University of South Carolina, Columbia, USA.
- Dillon Ze Chen, Laboratory for Analysis and Architecture of Systems (LAAS-CNRS), Toulouse, France.
- Floris den Hengst, Vrije Universiteit, Amsterdam, Netherlands.
- Timo P. Gros, German Research Center for Artificial Intelligence (DFKI), Saarbrücken, Germany.
- Ayal Taitler, Ben-Gurion University, Be’er Sheva, Israel.
Please send your inquiries to prl.theworkshop@gmail.com