PRL @ AAAI 2025
AAAI’25
Philadelphia, Pennsylvania, USA
Date: March 4, 2025, 8:30 AM–6:00 PM
Room: 120C
prl.theworkshop@gmail.com
❗Important Announcement
Note that PRL will be a full-day workshop, starting at 8:30 AM and ending at 6:00 PM.
Aim and Scope
While the AI Planning and Reinforcement Learning communities focus on similar sequential decision-making problems, they remain somewhat unaware of each other's specific problems, techniques, methodologies, and evaluations.
This workshop aims to encourage discussion and collaboration between researchers in the fields of AI planning and reinforcement learning: to bridge the gap between the two communities and to facilitate discussion of the differences and similarities among existing techniques. We welcome AI researchers who work at the intersection of planning and reinforcement learning, in particular those who focus on intelligent decision-making. This is the eighth edition of the PRL workshop series, which started at ICAPS 2020.
Program
The workshop location is Room 120C of the Pennsylvania Convention Center in Philadelphia, Pennsylvania.
Schedule
| Time (Philadelphia) | Title |
|---|---|
| 8:30–10:30 | Session I |
| | Opening Remarks |
| | AI Planning: A Primer and Survey (Preliminary Report). Dillon Ze Chen, Pulkit Verma, Siddharth Srivastava, Michael Katz, and Sylvie Thiébaux |
| | A Benchmark for Hierarchical Parameterized Action Markov Decision Process. Dengxian Yang, Neil Michael Dundon, Elizabeth J. Rizor, Scott T. Grafton, and Linda Ruth Petzold |
| | ⭐ Keynote, Marlos Machado: Representation-Driven Option Discovery in RL: Model-Free Success & Model-Based Challenges ⭐ |
| | Contextual Bandits for Maximizing Stimulated Word-of-Mouth Rewards. Ahmed Sayeed Faruk and Elena Zheleva |
| | Planning with Temporally-Extended Actions. Palash Chatterjee and Roni Khardon |
| 10:30–11:00 | Coffee break |
| 11:00–12:30 | Poster Session |
| 12:30–14:00 | Lunch |
| 14:00–15:30 | Session II |
| | ⭐ Keynote, Anders Jonsson: Exploiting Symbolic Structure and Hierarchy in Reinforcement Learning ⭐ |
| | Active Teacher Selection for Reinforcement Learning from Human Feedback. Rachel Freedman, Justin Svegliato, Kyle Hollins Wray, and Stuart Russell |
| | HDDLGym: A Tool for Studying Multi-Agent Hierarchical Problems Defined in HDDL with OpenAI Gym. Ngoc La and Ruaridh Mon-Williams |
| 15:30–16:00 | Coffee break |
| 16:00–18:00 | Session III |
| | ⭐ Keynote, George Konidaris: Signal to Symbol (via Skills) ⭐ |
| | Overcoming Slow Decision Frequencies in Continuous Control: Model-Based Sequence Reinforcement Learning for Model-Free Control. Devdhar Patel and Hava T. Siegelmann |
| | Controller Synthesis from Deep Reinforcement Learning Policies. Florent Delgrange, Guy Avni, Anna Lukina, Christian Schilling, Ann Nowé, and Guillermo Pérez |
| | Intrinsic Self-Correction Enhancement in Monte Carlo Tree Search Boosted Reasoning via Iterative Preference Learning. Huchen Jiang, Yangyang Ma, Chaofan Ding, Kexin Luan, and Xinhan Di |
| | Exploring Explainable Multi-player MCTS-minimax Hybrids in Board Game Using Process Mining. Yiyu Qian, Tim Miller, and Liyuan Zhao |
| | Closing Remarks |
Keynotes
Exploiting Symbolic Structure and Hierarchy in Reinforcement Learning
Anders Jonsson
Full Professor, Universitat Pompeu Fabra
Abstract
A well-known limitation of reinforcement learning is its high sample complexity, which causes learning to be slow in complex tasks. The situation is even worse in problems with non-Markovian dynamics, i.e., when the ability to predict the future depends on the entire interaction history. In this talk, I will describe two recent lines of work that improve the sample complexity of reinforcement learning by exploiting the symbolic structure of tasks. The first line of work trains a set of policies for solving subtasks in an existing hierarchy, then combines these policies for fast or zero-shot learning of more complex, global tasks. The second line of work learns finite-state automata for non-Markovian decision processes, providing symbolic information that compactly captures the interaction history.
Representation-Driven Option Discovery in RL: Model-Free Success & Model-Based Challenges
Marlos Machado
Assistant Professor, University of Alberta
Abstract
The ability to reason at multiple levels of temporal abstraction is a fundamental aspect of intelligence. In reinforcement learning, this attribute is often modelled through temporally extended courses of action called options. Despite their popularity as a research topic, options are rarely a core component of traditional RL solutions. In this talk, I will introduce a general framework for option discovery that leverages the agent's representation to identify useful options. By using these options to generate a rich stream of experience, the agent can improve its representations and learn more effectively in model-free settings across diverse environments with varying topologies and observation spaces. However, options are far less common, and often less effective, in planning. I will also present insights into making options more useful for planning, and the challenges posed by function approximation in this setting.
Signal to Symbol (via Skills)
George Konidaris
Associate Professor, Brown University
Abstract
I will address the question of how an RL agent with a rich sensorimotor space can learn abstract, task-specific representations of a particular task, and the conditions under which such representations match classical planning paradigms. I will take a constructivist approach, where the computation the representation is required to support (here, planning using a given set of motor skills) is precisely defined, and its properties are then used to build a representation that is capable of supporting that computation by construction. The result is a formal link between the skills available to a robot and the symbols it should use to plan with them. I will present an example of a robot autonomously learning a (sound and complete) abstract representation directly from sensorimotor data, and then using it to plan. I will also argue that this re-representation step is a critical component of solving the general AI problem.
Accepted Papers
Oral Only
- Exploring Explainable Multi-player MCTS-minimax Hybrids in Board Game Using Process Mining, Yiyu Qian, Tim Miller, Liyuan Zhao
- HDDLGym: A Tool for Studying Multi-Agent Hierarchical Problems Defined in HDDL with OpenAI Gym, Ngoc La, Ruaridh Mon-Williams
- Planning with Temporally-Extended Actions, Palash Chatterjee, Roni Khardon
- Contextual Bandits for Maximizing Stimulated Word-of-Mouth Rewards, Ahmed Sayeed Faruk, Elena Zheleva
- Active Teacher Selection for Reinforcement Learning from Human Feedback, Rachel Freedman, Justin Svegliato, Kyle Hollins Wray, Stuart Russell
- AI Planning: A Primer and Survey (Preliminary Report), Dillon Ze Chen, Pulkit Verma, Siddharth Srivastava, Michael Katz, Sylvie Thiébaux
- Intrinsic Self-Correction Enhancement in Monte Carlo Tree Search Boosted Reasoning via Iterative Preference Learning, Huchen Jiang, Yangyang Ma, Chaofan Ding, Kexin Luan, Xinhan Di
- A Benchmark for Hierarchical Parameterized Action Markov Decision Process, Dengxian Yang, Neil Michael Dundon, Elizabeth J. Rizor, Scott T. Grafton, Linda Ruth Petzold
- Overcoming Slow Decision Frequencies in Continuous Control: Model-Based Sequence Reinforcement Learning for Model-Free Control, Devdhar Patel, Hava T. Siegelmann
- Controller Synthesis from Deep Reinforcement Learning Policies, Florent Delgrange, Guy Avni, Anna Lukina, Christian Schilling, Ann Nowé, Guillermo Pérez
Poster Only
- Liner Shipping Network Design with Reinforcement Learning, Utsav Dutta, Yifan Lin, Zhaoyang Larry Jin
- ContextFormer: Stitching via Expert Calibration, Ziqi Zhang, Jingzehua Xu, Jinxin Liu, Zifeng Zhuang, Donglin Wang, Miao Liu, Shuai Zhang
- RELAX: Reinforcement Learning Enabled 2D-LiDAR based Autonomous System for Parsimonious UAVs, Guanlin Wu, Zhuokai Zhao, Huan Chen, Jinyi Zhao, Yangke Zhang, Yutao He
- Neurosymbolic Reinforcement Learning With Sequential Guarantees, Lennert De Smet, Gabriele Venturato, Luc De Raedt, Giuseppe Marra
- SPRIG: Stackelberg Perception-Reinforcement Learning with Internal Game Dynamics, Fernando Martinez, Juntao Chen, Yingdong Lu
- Networked Restless Multi-Arm Bandits with Reinforcement Learning, Hanmo Zhang, Kai Wang
- Average-Reward Reinforcement Learning with Entropy Regularization, Jacob Adamczyk, Volodymyr Makarenko, Stas Tiomkin, Rahul V. Kulkarni
Organizing Committee
- Zlatan Ajanović, RWTH Aachen, Aachen, Germany
- Timo P. Gros, German Research Center for Artificial Intelligence (DFKI), Saarbrücken, Germany
- Floris den Hengst, University of Amsterdam, Amsterdam, Netherlands
- Daniel Höller, Saarland University, Saarbrücken, Germany
- Harsha Kokel, IBM Research, San Jose, USA
- Ayal Taitler, Ben-Gurion University, Be’er Sheva, Israel
Please send your inquiries to prl.theworkshop@gmail.com