From the original authors of Google's Batch Normalization comes the paper "Batch Renormalization: Towards Reducing Minibatch Dependence in Batch-Normalized Models". It introduces two new transformation parameters, r and d, during training, which reduce the dependence on the minibatch and address the problems BN has when the batch size is small or the samples are not i.i.d.
Abstract:
Batch Normalization is quite effective at accelerating and improving the training of deep models. However, its effectiveness diminishes when the training minibatches are small, or do not consist of independent samples. We hypothesize that this is due to the dependence of model layer inputs on all the examples in the minibatch, and different activations being produced between training and inference. We propose Batch Renormalization, a simple and effective extension to ensure that the training and inference models generate the same outputs that depend on individual examples rather than the entire minibatch. Models trained with Batch Renormalization perform substantially better than batchnorm when training with small or non-i.i.d. minibatches. At the same time, Batch Renormalization retains the benefits of batchnorm such as insensitivity to initialization and training efficiency.
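To illustrate how the two correction terms r and d enter the normalization, here is a minimal NumPy sketch of a training-time forward pass. The function name, the specific clipping bounds r_max and d_max, and the momentum value are illustrative assumptions rather than values taken from the paper (the paper gradually relaxes the clipping bounds during training).

```python
import numpy as np

def batch_renorm_forward(x, gamma, beta, moving_mean, moving_var,
                         r_max=3.0, d_max=5.0, momentum=0.99, eps=1e-5):
    """Sketch of a Batch Renormalization forward pass at training time.

    x: (N, D) minibatch of layer inputs; gamma, beta: learned scale and shift.
    moving_mean / moving_var are the running statistics also used at inference.
    r_max, d_max, momentum are illustrative hyperparameters.
    """
    batch_mean = x.mean(axis=0)
    batch_var = x.var(axis=0)
    batch_std = np.sqrt(batch_var + eps)
    moving_std = np.sqrt(moving_var + eps)

    # The two correction terms r and d: treated as constants (no gradient
    # flows through them) and clipped to keep training stable.
    r = np.clip(batch_std / moving_std, 1.0 / r_max, r_max)
    d = np.clip((batch_mean - moving_mean) / moving_std, -d_max, d_max)

    # Normalize with minibatch statistics, then correct towards the moving
    # statistics so training and inference compute (nearly) the same transform.
    x_hat = (x - batch_mean) / batch_std * r + d
    y = gamma * x_hat + beta

    # Update the running statistics as in standard batchnorm.
    moving_mean = momentum * moving_mean + (1 - momentum) * batch_mean
    moving_var = momentum * moving_var + (1 - momentum) * batch_var
    return y, moving_mean, moving_var
```

At inference the layer would simply use the moving statistics, y = gamma * (x - moving_mean) / sqrt(moving_var + eps) + beta; with r and d defined as above, the training-time transform rewrites into exactly this form, which is why the outputs depend on individual examples rather than on the whole minibatch.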
Link:
https://arxiv.org/abs/1702.03275
Original post:
http://weibo.com/1785748853/EvrkLss7n?from=page_1005051785748853_profile&wvr=6&mod=weibotime&type=comment