ML Seminar: Stein Variational Gradient Descent: Algorithm, Theory, Applications
Abstract: Approximate probabilistic inference is a key computational task in modern machine learning. It allows us to reason with complex, structured, hierarchical (deep) probabilistic models to extract information and quantify uncertainty. Traditionally, approximate inference is performed by either Markov chain Monte Carlo (MCMC) or variational inference (VI), each of which has critical weaknesses: MCMC is accurate and asymptotically consistent but suffers from slow convergence; VI is typically faster because it casts inference as a gradient-based optimization problem, but it introduces deterministic errors and lacks theoretical guarantees. Stein variational gradient descent (SVGD) is a new tool for approximate inference that combines the accuracy and flexibility of MCMC with the practical speed of VI and gradient-based optimization. The key idea of SVGD is to directly optimize a non-parametric, particle-based representation to fit intractable distributions using fast, deterministic gradient-based updates. This is made possible by integrating and generalizing key mathematical tools from Stein's method, optimal transport, and interacting particle systems. SVGD has proven to be a powerful tool in various challenging settings, including Bayesian deep learning, deep generative models, reinforcement learning, and meta-learning. This talk will introduce the basic ideas and theory of SVGD and present some example applications.
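To make the particle-based update mentioned in the abstract concrete, here is a minimal NumPy sketch of one common form of the SVGD update. It is an illustration under stated assumptions, not the speaker's implementation: it assumes an RBF kernel with a fixed bandwidth h, and the names rbf_kernel, svgd_step and run_svgd, as well as the default step size and iteration count, are hypothetical choices made for this example. The target enters only through grad_log_p, the gradient of its log density, so no normalizing constant is needed.

```python
import numpy as np


def rbf_kernel(x, h):
    """Return the RBF kernel matrix and the summed kernel gradients.

    x has shape (n, d): n particles in d dimensions.
    The kernel is k(x_i, x_j) = exp(-||x_i - x_j||^2 / h) (assumed form).
    """
    diff = x[:, None, :] - x[None, :, :]      # diff[i, j] = x_i - x_j
    sq_dist = np.sum(diff ** 2, axis=-1)      # shape (n, n)
    k = np.exp(-sq_dist / h)
    # Repulsive term for particle i:
    #   sum_j grad_{x_j} k(x_j, x_i) = sum_j (2 / h) * (x_i - x_j) * k_ij
    grad_k = (2.0 / h) * np.sum(k[:, :, None] * diff, axis=1)  # shape (n, d)
    return k, grad_k


def svgd_step(x, grad_log_p, step_size=0.1, h=1.0):
    """Apply one deterministic SVGD update to all particles at once."""
    n = x.shape[0]
    k, grad_k = rbf_kernel(x, h)
    # Kernel-weighted average of the score (pulls particles toward high
    # density) plus the kernel gradient (pushes particles apart).
    phi = (k @ grad_log_p(x) + grad_k) / n
    return x + step_size * phi


def run_svgd(x0, grad_log_p, n_iter=1000, step_size=0.1, h=1.0):
    """Iterate svgd_step from initial particles x0 (hypothetical helper)."""
    x = np.array(x0, dtype=float)
    for _ in range(n_iter):
        x = svgd_step(x, grad_log_p, step_size, h)
    return x


# Toy usage: fit a 1-D Gaussian with mean 2 and unit variance, whose
# score function is -(x - 2). Particles start far from the target.
rng = np.random.default_rng(0)
initial_particles = rng.normal(-5.0, 1.0, size=(50, 1))
particles = run_svgd(initial_particles, lambda x: -(x - 2.0))
```

The fixed bandwidth keeps the sketch short; in practice the bandwidth is often set adaptively, for example with a median-distance heuristic, and the step size is usually tuned or adapted as well.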