Electrical Engineering and Systems Science > Systems and Control

arXiv:2306.17639 (eess)

[Submitted on 30 Jun 2023 (v1), last revised 7 Aug 2024 (this version, v2)]

Title: Point-Based Value Iteration for POMDPs with Neural Perception Mechanisms

Authors: Rui Yan, Gabriel Santos, Gethin Norman, David Parker, Marta Kwiatkowska


Abstract: The increasing trend to integrate neural networks and conventional software components in safety-critical settings calls for methodologies for their formal modelling, verification and correct-by-construction policy synthesis. We introduce neuro-symbolic partially observable Markov decision processes (NS-POMDPs), a variant of continuous-state POMDPs with discrete observations and actions, in which the agent perceives a continuous-state environment using a neural perception mechanism and makes decisions symbolically. The perception mechanism classifies inputs such as images and sensor values into symbolic percepts, which are used in decision making.
We study the problem of optimising discounted cumulative rewards for NS-POMDPs. Working directly with the continuous state space, we exploit the underlying structure of the model and the neural perception mechanism to propose a novel piecewise linear and convex representation (P-PWLC) in terms of polyhedra covering the state space and value vectors, and extend Bellman backups to this representation. We prove the convexity and continuity of value functions and present two value iteration algorithms that ensure finite representability. The first is a classical (exact) value iteration algorithm extending the $\alpha$-functions of Porta et al. (2006) to the P-PWLC representation for continuous-state spaces. The second is a point-based (approximate) method called NS-HSVI, which uses the P-PWLC representation and belief-value induced functions to approximate value functions from below and above for two types of beliefs, particle-based and region-based. Using a prototype implementation, we show the practical applicability of our approach on two case studies that employ (trained) ReLU neural networks as perception functions, by synthesising (approximately) optimal strategies.
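
For readers unfamiliar with point-based backups, the following is a minimal sketch of the classical finite-state case that the abstract's method generalises. It is not NS-HSVI and does not use the P-PWLC representation: states are a small finite set, the value function is a maximum over plain alpha-vectors, and only a lower bound is maintained. The function names, array layouts (`T[a, s, s2]`, `O[a, s2, o]`, `R[a, s]`) and the toy model are assumptions made for illustration.

```python
import numpy as np


def value(belief, alphas):
    """Piecewise linear and convex lower bound: max_i <alpha_i, belief>."""
    return float(np.max(alphas @ belief))


def point_based_backup(belief, alphas, T, O, R, gamma):
    """One classical point-based Bellman backup at a single belief.

    Assumed (hypothetical) layouts:
      T[a, s, s2]  probability of moving from s to s2 under action a
      O[a, s2, o]  probability of observing o after reaching s2 via a
      R[a, s]      immediate reward for taking a in s
      alphas[i, s] current set of alpha-vectors
    Returns the new alpha-vector that is maximal at `belief`.
    """
    n_actions = T.shape[0]
    n_obs = O.shape[2]
    best_vec, best_val = None, -np.inf
    for a in range(n_actions):
        vec = R[a].astype(float).copy()
        for o in range(n_obs):
            # g[i, s] = sum_{s2} T[a, s, s2] * O[a, s2, o] * alphas[i, s2]
            g = (alphas * O[a, :, o]) @ T[a].T
            # keep the projected vector that is best at this belief
            vec += gamma * g[np.argmax(g @ belief)]
        val = float(vec @ belief)
        if val > best_val:
            best_vec, best_val = vec, val
    return best_vec


# Toy two-state, two-action, two-observation model (illustrative only).
T = np.stack([np.eye(2), np.eye(2)])        # states never change
O = np.full((2, 2, 2), 0.5)                 # observations carry no information
R = np.array([[1.0, 0.0], [0.0, 1.0]])      # action a pays off in state a
gamma = 0.5
alphas = np.array([[R.min() / (1 - gamma)] * 2])  # trivial lower bound
belief = np.array([0.8, 0.2])

new_alpha = point_based_backup(belief, alphas, T, O, R, gamma)
alphas = np.vstack([alphas, new_alpha])
print(value(belief, alphas))
```

Starting from the constant vector `R.min() / (1 - gamma)` keeps the bound valid, so each backup can only raise the value at the chosen belief. The continuous-state setting of the abstract replaces these finite vectors with functions defined over polyhedra, which this sketch does not attempt.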

Comments:	65 pages, 14 figures
Subjects:	Systems and Control (eess.SY); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2306.17639 [eess.SY]
	(or arXiv:2306.17639v2 [eess.SY] for this version)
	https://doi.org/10.48550/arXiv.2306.17639

Submission history

From: Rui Yan
[v1] Fri, 30 Jun 2023 13:26:08 UTC (5,210 KB)
[v2] Wed, 7 Aug 2024 08:10:23 UTC (12,611 KB)
