RL in continuous-time diffusions: from the blessing of ellipticity to structure-driven algorithms
Speaker: Wenlong Mou (University of Toronto)
Time: 2025-05-28, 10:30–11:30
Abstract: Reinforcement learning (RL) offers a powerful framework for decision-making in dynamical environments, and recent years have seen a surge of interest in applying RL to continuous-time Markov diffusions. Yet existing contraction-based RL theory suffers from limitations such as diverging effective horizons, and fails to provide relevant guidance.
In this talk, I present some recent developments in the design of RL algorithms for controlled diffusion processes. I first discuss how uniform ellipticity -- a key property of Markov diffusions -- enables continuous-time RL with strong theoretical guarantees. Among other results, I highlight a self-mitigating statistical error bound that leads to non-standard bias-variance trade-offs. Drawing on this perspective, I extend the study to the fine-tuning of diffusion models, and introduce a novel PDE-based algorithm with sharp statistical guarantees.
Bio: Wenlong Mou is an Assistant Professor in the Department of Statistical Sciences at the University of Toronto. In 2023, he received his Ph.D. degree in Electrical Engineering and Computer Sciences (EECS) from UC Berkeley. Prior to Berkeley, he received his B.Sc. degree in Computer Science and B.A. degree in Economics, both from Peking University. Wenlong's research interests include machine learning theory, mathematical statistics, optimization, and applied probability. He is particularly interested in data-driven decision-making in modern AI paradigms. His work has been published in many leading journals in statistical machine learning, and his research has been recognized by the INFORMS Applied Probability Society as a Best Student Paper finalist.