PRL @ AAAI 2025
AAAI’25
Philadelphia, Pennsylvania, USA
Date: March 4, 2025, 8:30 AM–6:00 PM
Room: 120C
prl.theworkshop@gmail.com
❗Important Announcement
Note that PRL will be a full-day workshop, starting at 8:30 AM and ending at 6:00 PM.
Aim and Scope
While the AI Planning and Reinforcement Learning communities focus on similar sequential decision-making problems, they remain somewhat unaware of each other's specific problems, techniques, methodologies, and evaluations.
This workshop aims to encourage discussion and collaboration between researchers in the fields of AI planning and reinforcement learning: to bridge the gap between the two communities and to facilitate discussion of the differences and similarities among existing techniques. We welcome AI researchers who work at the intersection of planning and reinforcement learning, in particular those who focus on intelligent decision-making. This is the eighth edition of the PRL workshop series, which started at ICAPS 2020.
Program
The workshop location is Room 120C of the Pennsylvania Convention Center in Philadelphia, Pennsylvania.
Schedule
| Time (Philadelphia) | Title |
|---|---|
| 8:30–10:30 | Session I |
| | Opening Remarks |
| | AI Planning: A Primer and Survey (Preliminary Report). Dillon Ze Chen, Pulkit Verma, Siddharth Srivastava, Michael Katz, and Sylvie Thiébaux |
| | A Benchmark for Hierarchical Parameterized Action Markov Decision Process. Dengxian Yang, Neil Michael Dundon, Elizabeth J. Rizor, Scott T. Grafton, and Linda Ruth Petzold |
| | ⭐ Keynote, Marlos Machado: Representation-Driven Option Discovery in RL: Model-Free Success & Model-Based Challenges ⭐ |
| | Contextual Bandits for Maximizing Stimulated Word-of-Mouth Rewards. Ahmed Sayeed Faruk and Elena Zheleva |
| | Planning with Temporally-Extended Actions. Palash Chatterjee and Roni Khardon |
| 10:30–11:00 | Coffee break |
| 11:00–12:30 | Poster Session |
| 12:30–14:00 | Lunch |
| 14:00–15:30 | Session II |
| | ⭐ Keynote, Anders Jonsson: Exploiting Symbolic Structure and Hierarchy in Reinforcement Learning ⭐ |
| | Active Teacher Selection for Reinforcement Learning from Human Feedback. Rachel Freedman, Justin Svegliato, Kyle Hollins Wray, and Stuart Russell |
| | HDDLGym: A Tool for Studying Multi-Agent Hierarchical Problems Defined in HDDL with OpenAI Gym. Ngoc La and Ruaridh Mon-Williams |
| 15:30–16:00 | Coffee break |
| 16:00–18:00 | Session III |
| | ⭐ Keynote, George Konidaris: Signal to Symbol (via Skills) ⭐ |
| | Overcoming Slow Decision Frequencies in Continuous Control: Model-Based Sequence Reinforcement Learning for Model-Free Control. Devdhar Patel and Hava T. Siegelmann |
| | Controller Synthesis from Deep Reinforcement Learning Policies. Florent Delgrange, Guy Avni, Anna Lukina, Christian Schilling, Ann Nowé, and Guillermo Pérez |
| | Intrinsic Self-Correction Enhancement in Monte Carlo Tree Search Boosted Reasoning via Iterative Preference Learning. Huchen Jiang, Yangyang Ma, Chaofan Ding, Kexin Luan, and Xinhan Di |
| | Exploring Explainable Multi-player MCTS-minimax Hybrids in Board Game Using Process Mining. Yiyu Qian, Tim Miller, and Liyuan Zhao |
| | Closing Remarks |
Keynotes
Exploiting Symbolic Structure and Hierarchy in Reinforcement Learning
Anders Jonsson
Full Professor, Universitat Pompeu Fabra
Abstract
A well-known limitation of reinforcement learning is its high sample complexity, which causes learning to be slow in complex tasks. The situation is even worse in problems with non-Markovian dynamics, i.e., when the ability to predict the future depends on the entire interaction history. In this talk, I will describe two recent lines of work that improve the sample complexity of reinforcement learning by exploiting the symbolic structure of tasks. The first line of work trains a set of policies for solving subtasks in an existing hierarchy, then combines these policies for fast or zero-shot learning of more complex, global tasks. The second line of work learns finite-state automata for non-Markovian decision processes, providing symbolic information that compactly captures the interaction history.
Representation-Driven Option Discovery in RL: Model-Free Success & Model-Based Challenges
Marlos Machado
Assistant Professor, University of Alberta
Abstract
The ability to reason at multiple levels of temporal abstraction is a fundamental aspect of intelligence. In reinforcement learning, this attribute is often modelled through temporally extended courses of action called options. Despite their popularity as a research topic, options are rarely a core component of traditional RL solutions. In this talk, I will introduce a general framework for option discovery that leverages the agent's representation to identify useful options. By using these options to generate a rich stream of experience, the agent can improve its representations and learn more effectively in model-free settings across diverse environments with varying topologies and observation spaces. However, options are far less common, and often less effective, in planning. I will also present insights into making options more useful for planning, and the challenges posed by function approximation in this setting.
Signal to Symbol (via Skills)
George Konidaris
Associate Professor, Brown University
Abstract
I will address the question of how an RL agent with a rich sensorimotor space can learn abstract, task-specific representations of a particular task, and the conditions under which such representations match classical planning paradigms. I will take a constructivist approach, where the computation the representation is required to support (here, planning using a given set of motor skills) is precisely defined, and its properties are then used to build a representation that is capable of supporting that computation by construction. The result is a formal link between the skills available to a robot and the symbols it should use to plan with them. I will present an example of a robot autonomously learning a (sound and complete) abstract representation directly from sensorimotor data, and then using it to plan. I will also argue that this re-representation step is a critical component of solving the general AI problem.
Accepted Papers
Oral Only
- Exploring Explainable Multi-player MCTS-minimax Hybrids in Board Game Using Process Mining, Yiyu Qian, Tim Miller, Liyuan Zhao
- HDDLGym: A Tool for Studying Multi-Agent Hierarchical Problems Defined in HDDL with OpenAI Gym, Ngoc La, Ruaridh Mon-Williams
- Planning with Temporally-Extended Actions, Palash Chatterjee, Roni Khardon
- Contextual Bandits for Maximizing Stimulated Word-of-Mouth Rewards, Ahmed Sayeed Faruk, Elena Zheleva
- Active Teacher Selection for Reinforcement Learning from Human Feedback, Rachel Freedman, Justin Svegliato, Kyle Hollins Wray, Stuart Russell
- AI Planning: A Primer and Survey (Preliminary Report), Dillon Ze Chen, Pulkit Verma, Siddharth Srivastava, Michael Katz, Sylvie Thiébaux
- Intrinsic Self-Correction Enhancement in Monte Carlo Tree Search Boosted Reasoning via Iterative Preference Learning, Huchen Jiang, Yangyang Ma, Chaofan Ding, Kexin Luan, Xinhan Di
- A Benchmark for Hierarchical Parameterized Action Markov Decision Process, Dengxian Yang, Neil Michael Dundon, Elizabeth J. Rizor, Scott T. Grafton, Linda Ruth Petzold
- Overcoming Slow Decision Frequencies in Continuous Control: Model-Based Sequence Reinforcement Learning for Model-Free Control, Devdhar Patel, Hava T. Siegelmann
- Controller Synthesis from Deep Reinforcement Learning Policies, Florent Delgrange, Guy Avni, Anna Lukina, Christian Schilling, Ann Nowé, Guillermo Pérez
Poster Only
- Liner Shipping Network Design with Reinforcement Learning, Utsav Dutta, Yifan Lin, Zhaoyang Larry Jin
- ContextFormer: Stitching via Expert Calibration, Ziqi Zhang, Jingzehua Xu, Jinxin Liu, Zifeng Zhuang, Donglin Wang, Miao Liu, Shuai Zhang
- RELAX: Reinforcement Learning Enabled 2D-LiDAR based Autonomous System for Parsimonious UAVs, Guanlin Wu, Zhuokai Zhao, Huan Chen, Jinyi Zhao, Yangke Zhang, Yutao He
- Neurosymbolic Reinforcement Learning With Sequential Guarantees, Lennert De Smet, Gabriele Venturato, Luc De Raedt, Giuseppe Marra
- SPRIG: Stackelberg Perception-Reinforcement Learning with Internal Game Dynamics, Fernando Martinez, Juntao Chen, Yingdong Lu
- Networked Restless Multi-Arm Bandits with Reinforcement Learning, Hanmo Zhang, Kai Wang
- Average-Reward Reinforcement Learning with Entropy Regularization, Jacob Adamczyk, Volodymyr Makarenko, Stas Tiomkin, Rahul V. Kulkarni
Organizing Committee
- Zlatan Ajanović, RWTH Aachen, Aachen, Germany
- Timo P. Gros, German Research Center for Artificial Intelligence (DFKI), Saarbrücken, Germany
- Floris den Hengst, University of Amsterdam, Amsterdam, Netherlands
- Daniel Höller, Saarland University, Saarbrücken, Germany
- Harsha Kokel, IBM Research, San Jose, USA
- Ayal Taitler, Ben-Gurion University, Be’er Sheva, Israel
Please send your inquiries to prl.theworkshop@gmail.com