PRL Workshop Series

Bridging the Gap Between AI Planning and Reinforcement Learning

PRL @ AAAI 2025

AAAI’25
Philadelphia, Pennsylvania, USA
Date: March 4, 2025, 8:30 AM–6:00 PM
Room: 120C
prl.theworkshop@gmail.com

❗Important Announcement

Note that PRL will be a full-day workshop, starting at 8:30 AM and ending at 6:00 PM.

Aim and Scope

While the AI Planning and Reinforcement Learning communities focus on similar sequential decision-making problems, they remain largely unaware of each other's specific problems, techniques, methodologies, and evaluation practices.

This workshop aims to encourage discussion and collaboration between researchers in the fields of AI planning and reinforcement learning. We aim to bridge the gap between the two communities, facilitate the discussion of differences and similarities in existing techniques, and encourage collaboration across the fields. We solicit interest from AI researchers who work at the intersection of planning and reinforcement learning, in particular those who focus on intelligent decision-making. This is the eighth edition of the PRL workshop series, which started at ICAPS 2020.

Program

The workshop location is Room 120C of the Pennsylvania Convention Center in Philadelphia, Pennsylvania.

Schedule

Time (Philadelphia) Title
8:30 - 10:30 Session I
  Opening Remarks
  AI Planning: A Primer and Survey (Preliminary Report). Dillon Ze Chen, Pulkit Verma, Siddharth Srivastava, Michael Katz, and Sylvie Thiebaux.
  A Benchmark for Hierarchical Parameterized Action Markov Decision Process. Dengxian Yang, Neil Michael Dundon, Elizabeth J Rizor, Scott T. Grafton, and Linda Ruth Petzold.
  ⭐ Keynote (Marlos Machado): Representation-Driven Option Discovery in RL: Model-Free Success & Model-Based Challenges
  Contextual Bandits for Maximizing Stimulated Word-of-Mouth Rewards. Ahmed Sayeed Faruk and Elena Zheleva.
  Planning with Temporally-Extended Actions. Palash Chatterjee and Roni Khardon.
10:30 - 11:00 Coffee break
11:00 - 12:30 Poster Session
12:30 - 14:00 Lunch
14:00 - 15:30 Session II
  ⭐ Keynote (Anders Jonsson): Exploiting Symbolic Structure and Hierarchy in Reinforcement Learning
  Active Teacher Selection for Reinforcement Learning from Human Feedback. Rachel Freedman, Justin Svegliato, Kyle Hollins Wray, and Stuart Russell.
  HDDLGym: A Tool for Studying Multi-Agent Hierarchical Problems Defined in HDDL with OpenAI Gym. Ngoc La and Ruaridh Mon-Williams.
15:30 - 16:00 Coffee break
16:00 - 18:00 Session III
  ⭐ Keynote (George Konidaris): Signal to Symbol (via Skills)
  Overcoming Slow Decision Frequencies in Continuous Control: Model-Based Sequence Reinforcement Learning for Model-Free Control. Devdhar Patel and Hava T Siegelmann.
  Controller Synthesis from Deep Reinforcement Learning Policies. Florent Delgrange, Guy Avni, Anna Lukina, Christian Schilling, Ann Nowe, and Guillermo Perez.
  Intrinsic Self-Correction Enhancement in Monte Carlo Tree Search Boosted Reasoning via Iterative Preference Learning. Huchen Jiang, Yangyang Ma, Chaofan Ding, Kexin Luan, and Xinhan Di.
  Exploring Explainable Multi-player MCTS-minimax Hybrids in Board Game Using Process Mining. Yiyu Qian, Tim Miller, and Liyuan Zhao.
  Closing Remarks

Keynotes

Exploiting Symbolic Structure and Hierarchy in Reinforcement Learning

Anders Jonsson

Full Professor, Universitat Pompeu Fabra

Abstract

A well-known limitation of reinforcement learning is its high sample complexity, which causes learning to be slow in complex tasks. The situation is even worse in problems with non-Markovian dynamics, i.e., when the ability to predict the future depends on the entire interaction history. In this talk, I will describe two recent lines of work that improve the sample complexity of reinforcement learning by exploiting the symbolic structure of tasks. The first line of work is to train a set of policies for solving subtasks in an existing hierarchy, and combine these policies for fast or zero-shot learning of more complex, global tasks. The second line of work is to learn finite-state automata for non-Markovian decision processes, providing symbolic information that compactly captures the interaction history.
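To make the second line of work concrete, here is a minimal illustrative sketch, not code from the talk, of how a finite-state automaton can compactly capture interaction history: once the history is compressed into an automaton state, the pair (environment state, automaton state) is Markovian again and standard RL machinery applies. The task, events, and state names below are all hypothetical.

```python
# Hypothetical task: "observe event A, then event B, while never observing C."
TRANSITIONS = {
    ("u0", "A"): "u1",    # first subgoal achieved
    ("u1", "B"): "u2",    # second subgoal achieved (accepting state)
    ("u0", "C"): "fail",  # violation
    ("u1", "C"): "fail",
}

def step_automaton(u: str, event: str) -> str:
    """Advance the automaton on one symbolic event; unlisted events self-loop."""
    return TRANSITIONS.get((u, event), u)

def reward(u: str) -> float:
    """The task's non-Markovian reward, now a function of the automaton state alone."""
    return 1.0 if u == "u2" else 0.0

u = "u0"
for event in ["A", "D", "B"]:  # a trace that satisfies the task ("D" is noise)
    u = step_automaton(u, event)
print(u, reward(u))  # -> u2 1.0
```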

Representation-Driven Option Discovery in RL: Model-Free Success & Model-Based Challenges

Marlos Machado

Assistant Professor, University of Alberta

Abstract

The ability to reason at multiple levels of temporal abstraction is a fundamental aspect of intelligence. In reinforcement learning, this attribute is often modelled through temporally extended courses of action called options. Despite their popularity as a research topic, options are rarely a core component of traditional RL solutions. In this talk, I will introduce a general framework for option discovery that leverages the agent’s representation to identify useful options. By using these options to generate a rich stream of experience, the agent can improve its representations and learn more effectively in model-free settings across diverse environments with varying topologies and observation spaces. However, options are far less common, and often less effective, in planning. I will also present insights into making options more useful for planning and the challenges posed by function approximation in this setting.
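For readers unfamiliar with the options formalism referenced above, the following minimal Python sketch, illustrative only and not material from the talk, shows the standard triple of initiation set, intra-option policy, and termination condition. The toy corridor environment and the "walk right" option are hypothetical stand-ins for what a representation-driven discovery method might produce.

```python
import random
from dataclasses import dataclass
from typing import Callable

@dataclass
class Option:
    """The standard options triple (I, pi, beta)."""
    can_start: Callable[[int], bool]     # I: states where the option may begin
    policy: Callable[[int], int]         # pi: intra-option action selection
    termination: Callable[[int], float]  # beta: probability of stopping

def run_option(env_step: Callable[[int, int], int], state: int, option: Option):
    """Execute an option until it terminates; return the final state and duration."""
    assert option.can_start(state), "option not available in this state"
    steps = 0
    while True:
        state = env_step(state, option.policy(state))
        steps += 1
        if random.random() < option.termination(state):
            return state, steps

# Toy 1-D corridor with dynamics s' = s + a; the option walks right until a
# hypothetical bottleneck state (5) identified by option discovery.
walk_right = Option(
    can_start=lambda s: s < 5,
    policy=lambda s: +1,
    termination=lambda s: 1.0 if s >= 5 else 0.0,
)
print(run_option(lambda s, a: s + a, 0, walk_right))  # -> (5, 5)
```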

Signal to Symbol (via Skills)

George Konidaris

Associate Professor, Brown University

Abstract

I will address the question of how an RL agent with a rich sensorimotor space can learn abstract, task-specific representations of a particular task, and the conditions under which such representations match classical planning paradigms. I will take a constructivist approach, where the computation the representation is required to support (here, planning using a given set of motor skills) is precisely defined, and then its properties are used to build the representation so that it is capable of doing so by construction. The result is a formal link between the skills available to a robot and the symbols it should use to plan with them. I will present an example of a robot autonomously learning a (sound and complete) abstract representation directly from sensorimotor data, and then using it to plan. I will also argue that this re-representation step is a critical component of solving the general AI problem.
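As a rough illustration of this skills-to-symbols idea, the sketch below (our own simplified example, not the speaker's code) treats each skill's precondition set and effect set as the symbols needed for planning, and checks whether a skill sequence is executable by chaining effect sets into precondition sets. All skill and state names are hypothetical.

```python
# Hypothetical skills: "pre" is the set of abstract states where the skill
# can run; "eff" is the set of abstract states it terminates in.
SKILLS = {
    "walk_to_table": {"pre": {"at_door"},     "eff": {"at_table"}},
    "grasp_cup":     {"pre": {"at_table"},    "eff": {"holding_cup"}},
    "hand_over":     {"pre": {"holding_cup"}, "eff": {"delivered"}},
}

def plan_is_executable(plan, start_states):
    """A skill sequence is executable iff every state the previous skills can
    leave the robot in lies inside the next skill's precondition set.
    These pre/effect sets are exactly the symbols planning requires."""
    possible = set(start_states)
    for name in plan:
        skill = SKILLS[name]
        if not possible <= skill["pre"]:
            return False              # precondition symbol not guaranteed
        possible = set(skill["eff"])  # image of the skill = effect symbol
    return True

print(plan_is_executable(["walk_to_table", "grasp_cup", "hand_over"], {"at_door"}))  # True
print(plan_is_executable(["grasp_cup"], {"at_door"}))                                # False
```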

Accepted Papers

Oral Only

Poster Only

Organizing Committee

Please send your inquiries to prl.theworkshop@gmail.com




