You might notice that I haven't emphasized the latest benchmark-beating paper. My reason is that a good theory ought to be scalable: it should be capable of explaining why deep models generalise, and we should be able to bootstrap those explanations to more complex models (e.g. sequences of deep models, i.e. RNNs). This is how all good science is done.
Dropout Rademacher Complexity of Deep Neural Networks (Wei Gao 2014)
Distribution-Specific Hardness of Learning Neural Networks (Shamir 2017)
Lessons from the Rademacher Complexity for Deep Learning (Sokolic 2016)
A Mathematical Theory of Deep Convolutional Neural Networks for Feature Extraction (Wiatowski 2016)
Spectral Representations for Convolutional Neural Networks (Rippel 2015)
Electron-Proton Dynamics in Deep Learning (Zhang 2017)
Principles of Risk Minimization for Learning Theory (Vapnik 1991)
The Loss Surfaces of Multilayer Networks (Choromanska, LeCun et al. 2015)
Understanding Synthetic Gradients and Decoupled Neural Interfaces (W. Czarnecki 2017)
Dataset Shift (Storkey 2013)
Risk vs Uncertainty (I. Osband 2016)
The Loss Surface of Deep and Wide Neural Networks (Q. Nguyen 2017)