Fusion of high-level symbolic reasoning with lower level signal-based reasoning has become a paramount research question with the recent progress of deep learning. We recently developed a combination of a higher-order symbolic reasoner, a planner, with a signal-based RL system, called RePReL, to effectively construct abstractions to accelerate learning in structured domains (with several interacting objects that cannot be efficiently represented using a fixed-length vector). RL in structured domains is inherently a difficult task and only a small number of solutions exist1. The RePReL system takes a first step in the direction of combining (relational) planning and RL in solving structured problems by using the planner to define a smaller set of (abstract) state-action spaces to allow for efficient learning by the lower level RL agent. The key success of the RePReL method lies in its capability of generalization to a varying number of objects.

RePReL

The RePReL architecture, shown below, consists of three stacked modules: Symbolic Planner, Abstraction Reasoner, and RL agents.

RePReL Architecture


  1. Symbolic Planner: The symbolic planner uses the high-level planning domain description to decompose the goal into a sequence of temporally extended actions. Essentially, the planner decomposes the GRMDP into small sub-goal RMDPs.
  2. Abstraction Reasoner: The abstraction reasoner generates a task-specific abstract state representation using the dynamic first-order conditional influence statements provided by a domain expert.
  3. RL Agents: Finally, multiple reinforcement learners at the lowest level learn separate RL policies for each option in the abstract state space.

While RePReL was successful, it had an important assumption—the underlying features are discrete and homogeneous. The discrete assumption restricts the RePReL from exploiting the power of deep RL and the homogeneous assumption restricts the use of RePReL in scenarios where the data could arrive from multiple sources. In this work, we propose an extention to RePReL architecture that can easily adapt to hybrid data, i.e. a conglomeration of many varied types of data coming from different sources.

Hybrid Deep RePReL

In the Hybrid Deep RePReL architecture, shown below, we introduce two additional modules, an input preprocessing module and a merge module, to handle the combination of structured and unstructured information.

Hybrid Deep RePReL Architecture


The unstructured part of the state space is passed through the input pre-processing module., that generates a latent state embeddings for the unstructured data. The input pre-processing module can be a Convolutional Neural Network for image data, a transformer for text data, or a combination of both. The relevant state variables obtained from the abstraction reasoner and the latent predicates obtained from the input preprocessing modules are combined by the merge module and provided to the deep RL agents for learning.

Implementation of this work is available our lab github here.

Citation

If you build on this code or the ideas of this paper, please use the following citation.

@inproceedings{KokelPRBTN22,
  title={Hybrid Deep RePReL: Integrating Relational Planning and Reinforcement Learning for Information Fusion},
  author={Harsha Kokel and Nikhilesh Prabhakar and Balaraman Ravindran and Eric Blasch and Prasad Tadepalli and Sriraam Natarajan},
  booktitle={IEEE 25th International Conference on Information Fusion (FUSION)}, 
  year={2022}
}

Acknowledgements

HK, NP, and SN gratefully acknowledge the support of ARO award W911NF2010224 and AFOSR award FA9550-18-1-0462. PT acknowledges support of DARPA contract N66001-17-2-4030 and NSF & USDA-NIFA under grant 2021-67021-35344. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the AFOSR, ARO, NSF, DARPA or the U.S. government.