Computer Science > Machine Learning

arXiv:2405.05890 (cs)

[Submitted on 9 May 2024]

Title: Safe Exploration Using Bayesian World Models and Log-Barrier Optimization

Authors: Yarden As, Bhavya Sukhija, Andreas Krause

Abstract: A major challenge in deploying reinforcement learning in online tasks is ensuring that safety is maintained throughout the learning process. In this work, we propose CERL, a new method for solving constrained Markov decision processes (CMDPs) while keeping the policy safe during learning. Our method leverages Bayesian world models and suggests policies that are pessimistic with respect to the model's epistemic uncertainty. This makes CERL robust to model inaccuracies and leads to safe exploration during learning. In our experiments, we demonstrate that CERL outperforms the current state of the art in terms of safety and optimality in solving CMDPs from image observations.
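
Illustrative sketch (not taken from the paper): the abstract combines two ideas, pessimism with respect to model uncertainty and, per the title, log-barrier optimization. The Python snippet below shows one generic way these ideas can fit together on a toy problem. It is not CERL itself. The function names, the ensemble-style return estimates, the budget, and the barrier weight eta are all hypothetical and chosen only for illustration. Each candidate policy is scored by its worst-case reward over model samples plus a log-barrier term on the worst-case constraint slack, so a policy that any model sample deems unsafe is rejected outright.

```python
import math

def pessimistic_estimates(reward_samples, cost_samples):
    """Worst case over model samples: lowest reward, highest cost."""
    return min(reward_samples), max(cost_samples)

def log_barrier_objective(reward, cost, budget, eta=0.1):
    """Reward plus eta * log(slack); -inf once the budget is exhausted."""
    slack = budget - cost
    if slack <= 0.0:
        return -math.inf
    return reward + eta * math.log(slack)

# Hypothetical return estimates from three sampled models (assumed data).
candidates = {
    "cautious": {"reward": [4.0, 4.2, 3.9], "cost": [0.5, 0.6, 0.7]},
    "aggressive": {"reward": [6.0, 6.5, 5.8], "cost": [0.8, 0.9, 1.2]},
}
budget = 1.0  # assumed upper bound on the expected cost

def score(name):
    """Log-barrier objective evaluated on the pessimistic estimates."""
    reward, cost = pessimistic_estimates(
        candidates[name]["reward"], candidates[name]["cost"]
    )
    return log_barrier_objective(reward, cost, budget)

# Pick the candidate with the highest pessimistic barrier objective.
best = max(candidates, key=score)
print(best)
```

In this toy setting the "aggressive" candidate has the higher reward under every model sample, but one sample predicts a cost above the budget, so its barrier objective is negative infinity and the "cautious" candidate is selected.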

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2405.05890 [cs.LG]
	(or arXiv:2405.05890v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2405.05890

Submission history

From: Yarden As
[v1] Thu, 9 May 2024 16:42:39 UTC (411 KB)
