
[Paper] Inference vs. Optimization: Approximate Bayesian Inference with Stochastic Gradient Descent

机器学习研究会 · WeChat official account · AI · 2017-04-18 20:27

Summary
 

Reposted from: 爱可可-爱生活

Abstract of the paper "Stochastic Gradient Descent as Approximate Bayesian Inference":

Stochastic Gradient Descent with a constant learning rate (constant SGD) simulates a Markov chain with a stationary distribution. With this perspective, we derive several new results. (1) We show that constant SGD can be used as an approximate Bayesian posterior inference algorithm. Specifically, we show how to adjust the tuning parameters of constant SGD to best match the stationary distribution to a posterior, minimizing the Kullback-Leibler divergence between these two distributions. (2) We demonstrate that constant SGD gives rise to a new variational EM algorithm that optimizes hyperparameters in complex probabilistic models. (3) We also propose SGD with momentum for sampling and show how to adjust the damping coefficient accordingly. (4) We analyze MCMC algorithms. For Langevin Dynamics and Stochastic Gradient Fisher Scoring, we quantify the approximation errors due to finite learning rates. Finally (5), we use the stochastic process perspective to give a short proof of why Polyak averaging is optimal. Based on this idea, we propose a scalable approximate MCMC algorithm, the Averaged Stochastic Gradient Sampler.
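Point (1) of the abstract can be made concrete with a small toy experiment. The sketch below is not the paper's code. It assumes a one-dimensional model with unit-variance Gaussian likelihood and a flat prior, so the exact posterior over the mean is Gaussian with mean equal to the data mean and variance 1/N. For this toy model only, the constant SGD iterates form an AR(1) chain, and matching its stationary variance to 1/N gives the learning rate used in the code. All function and variable names are illustrative.

```python
import random
import statistics


def constant_sgd_samples(data, batch_size, lr, n_steps, burn_in, seed=0):
    """Run SGD with a fixed learning rate and return the post-burn-in iterates.

    Loss (assumed for this toy): average over the data of 0.5 * (theta - x_i)^2,
    i.e. the per-example negative log-likelihood of a unit-variance Gaussian.
    """
    rng = random.Random(seed)
    theta = 0.0
    samples = []
    for step in range(n_steps):
        # Minibatch drawn with replacement, so gradient noise is i.i.d. across steps.
        batch = [rng.choice(data) for _ in range(batch_size)]
        grad = theta - sum(batch) / batch_size
        theta -= lr * grad  # constant step size, never decayed
        if step >= burn_in:
            samples.append(theta)
    return samples


# Synthetic data set (hypothetical values chosen for the illustration).
data_rng = random.Random(1)
N, S = 1000, 10
data = [data_rng.gauss(2.0, 1.0) for _ in range(N)]
data_mean = statistics.fmean(data)
data_var = statistics.pvariance(data)

# Exact posterior variance under the toy assumptions (flat prior, unit noise).
posterior_var = 1.0 / N

# The chain's stationary variance is lr * (data_var / S) / (2 - lr).
# Setting that equal to 1/N and solving for lr gives:
lr = 2 * S / (N * data_var + S)

samples = constant_sgd_samples(data, S, lr, n_steps=100_000, burn_in=2_000)
sample_mean = statistics.fmean(samples)
sample_var = statistics.pvariance(samples)

print("iterate mean:", sample_mean, "posterior mean:", data_mean)
print("iterate variance:", sample_var, "posterior variance:", posterior_var)
```

With N much larger than S and data variance near 1, the learning rate above is close to 2S/N. The iterates are strongly autocorrelated, so the printed variance is a noisy estimate and should only roughly agree with 1/N.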


Link:

https://arxiv.org/abs/1704.04289


Original post:

http://weibo.com/1402400261/EF1BD5Hs0?from=page_1005051402400261_profile&wvr=6&mod=weibotime&type=comment
