Lookup NU author(s): Congye Wang, Dr Heishiro Kanagawa, Professor Chris Oates
Full text for this publication is not currently held within this repository. Alternative links are provided below where available.
Copyright 2025 by the author(s).

Abstract: An informal observation, made by several authors, is that the adaptive design of a Markov transition kernel has the flavour of a reinforcement learning task. Yet, to date it has remained unclear how to exploit modern reinforcement learning technologies for adaptive MCMC. The aim of this paper is to set out a general framework, called Reinforcement Learning Metropolis-Hastings, that is theoretically supported and empirically validated. Our principal focus is on learning fast-mixing Metropolis-Hastings transition kernels, which we cast as deterministic policies and optimise via a policy gradient. Control of the learning rate provably ensures that conditions for ergodicity are satisfied. The methodology is used to construct a gradient-free sampler that outperforms a popular gradient-free adaptive Metropolis-Hastings algorithm on ≈ 90% of tasks in the PosteriorDB benchmark.
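For context, the abstract concerns adaptive MCMC, in which the parameters of a Metropolis-Hastings proposal are tuned from the chain's own history under a decaying learning rate (diminishing adaptation) so that ergodicity is preserved. The sketch below is a minimal, generic adaptive random-walk Metropolis-Hastings sampler in that spirit; it uses a simple acceptance-rate-based scale update rather than the paper's policy-gradient method, and all function and variable names are illustrative assumptions.

```python
import numpy as np

def adaptive_rwmh(log_target, x0, n_steps=5000, target_accept=0.44, seed=0):
    """Generic adaptive random-walk Metropolis-Hastings (illustration only).

    The proposal scale is adapted toward a target acceptance rate with a
    decaying learning rate, the standard "diminishing adaptation" device
    for keeping an adaptive chain ergodic.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    log_p = log_target(x)
    log_scale = 0.0                      # log of the proposal standard deviation
    samples = np.empty((n_steps, x.size))
    for t in range(n_steps):
        # Gaussian random-walk proposal with the current (adapted) scale.
        prop = x + np.exp(log_scale) * rng.standard_normal(x.size)
        log_p_prop = log_target(prop)
        accept_prob = min(1.0, np.exp(log_p_prop - log_p))
        if rng.random() < accept_prob:
            x, log_p = prop, log_p_prop
        # Diminishing learning rate: adaptation vanishes as t grows.
        lr = 1.0 / (t + 1) ** 0.6
        log_scale += lr * (accept_prob - target_accept)
        samples[t] = x
    return samples

# Example: sample from a standard 2-d Gaussian.
if __name__ == "__main__":
    draws = adaptive_rwmh(lambda x: -0.5 * np.dot(x, x), x0=np.zeros(2))
    print(draws.mean(axis=0), draws.std(axis=0))
```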
Author(s): Wang C, Chen W, Kanagawa H, Oates CJ
Publication type: Conference Proceedings (inc. Abstract)
Publication status: Published
Conference Name: 28th International Conference on Artificial Intelligence and Statistics
Year of Conference: 2025
Pages: 640-648
Online publication date: 03/05/2025
Acceptance date: 02/04/2018
ISSN: 2640-3498
Publisher: ML Research Press
URL: https://proceedings.mlr.press/v258/wang25b.html
Series Title: Proceedings of Machine Learning Research