Column: 机器学习研究会 (Machine Learning Research Society)
The Machine Learning Research Society is a student organization under the Center for Big Data and Machine Learning Innovation at Peking University. It aims to build a platform where machine learning practitioners can exchange ideas. In addition to sharing timely news from the field, the society hosts talks by industry and academic leaders, salon-style sharing sessions with prominent researchers, real-data innovation competitions, and other activities.

[Papers] Three Best Papers from ICLR 2017, the Top Conference on Representation Learning

机器学习研究会 · WeChat Official Account · AI · 2017-02-25 20:02

Main text



Abstracts
 

Reposted from: 王威廉

Abstract of the paper "Understanding deep learning requires rethinking generalization":

Despite their massive size, successful deep artificial neural networks can exhibit a remarkably small difference between training and test performance. Conventional wisdom attributes small generalization error either to properties of the model family, or to the regularization techniques used during training. 

Through extensive systematic experiments, we show how these traditional approaches fail to explain why large neural networks generalize well in practice. Specifically, our experiments establish that state-of-the-art convolutional networks for image classification trained with stochastic gradient methods easily fit a random labeling of the training data. This phenomenon is qualitatively unaffected by explicit regularization, and occurs even if we replace the true images by completely unstructured random noise. We corroborate these experimental findings with a theoretical construction showing that simple depth two neural networks already have perfect finite sample expressivity as soon as the number of parameters exceeds the number of data points as it usually does in practice. 

We interpret our experimental findings by comparison with traditional models.
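
The central experiment is easy to reproduce in outline. Below is a minimal sketch, assuming PyTorch and torchvision (this is not the authors' code, and the small CNN here is a hypothetical stand-in for the architectures used in the paper): CIFAR-10 labels are replaced with uniformly random ones and the model is trained with SGD; the paper's observation is that training accuracy still approaches 100% even though the labels carry no signal.

```python
# Minimal sketch of the random-label experiment (assumes PyTorch + torchvision).
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader
import torchvision
import torchvision.transforms as T

torch.manual_seed(0)

# Load CIFAR-10 and overwrite every label with uniformly random noise.
train_set = torchvision.datasets.CIFAR10(
    root="./data", train=True, download=True, transform=T.ToTensor())
train_set.targets = torch.randint(0, 10, (len(train_set),)).tolist()
loader = DataLoader(train_set, batch_size=128, shuffle=True)

class SmallCNN(nn.Module):
    """A small over-parameterized CNN; any sufficiently large model behaves similarly."""
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 64, 3, padding=1)
        self.conv2 = nn.Conv2d(64, 128, 3, padding=1)
        self.fc = nn.Linear(128 * 8 * 8, 10)

    def forward(self, x):
        x = F.max_pool2d(F.relu(self.conv1(x)), 2)
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        return self.fc(x.flatten(1))

model = SmallCNN()
opt = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

for epoch in range(50):  # enough epochs for the memorization effect to show
    correct = total = 0
    for x, y in loader:
        opt.zero_grad()
        logits = model(x)
        loss = F.cross_entropy(logits, y)
        loss.backward()
        opt.step()
        correct += (logits.argmax(1) == y).sum().item()
        total += y.numel()
    print(f"epoch {epoch}: train accuracy on random labels = {correct / total:.3f}")
```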


Link:

https://openreview.net/forum?id=Sy8gdB9xx&noteId=Sy8gdB9xx


Abstract of the paper "Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data":

Some machine learning applications involve training data that is sensitive, such as the medical histories of patients in a clinical trial. A model may inadvertently and implicitly store some of its training data; careful analysis of the model may therefore reveal sensitive information. 

To address this problem, we demonstrate a generally applicable approach to providing strong privacy guarantees for training data. The approach combines, in a black-box fashion, multiple models trained with disjoint datasets, such as records from different subsets of users. Because they rely directly on sensitive data, these models are not published, but instead used as "teachers" for a "student" model. The student learns to predict an output chosen by noisy voting among all of the teachers, and cannot directly access an individual teacher or the underlying data or parameters. The student's privacy properties can be understood both intuitively (since no single teacher and thus no single dataset dictates the student's training) and formally, in terms of differential privacy. These properties hold even if an adversary can not only query the student but also inspect its internal workings.

Compared with previous work, the approach imposes only weak assumptions on how teachers are trained: it applies to any model, including non-convex models like DNNs. We achieve state-of-the-art privacy/utility trade-offs on MNIST and SVHN thanks to an improved privacy analysis and semi-supervised learning.
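
The noisy-voting aggregation described above is simple to sketch. Below is a minimal NumPy illustration, not the paper's implementation: each teacher's predicted label is tallied, Laplace noise is added to the vote histogram, and the argmax of the noisy counts becomes the label the student trains on. The noise parameter (called gamma here) and the toy numbers are assumptions of this sketch.

```python
# Minimal sketch of noisy-max aggregation over an ensemble of teachers (NumPy only).
import numpy as np

rng = np.random.default_rng(0)

def noisy_aggregate(teacher_predictions, num_classes, gamma=0.05):
    """Return a label chosen by Laplace-noised voting over the teachers.

    teacher_predictions: per-teacher predicted labels for one student query.
    gamma: inverse noise scale; smaller gamma means more noise and stronger privacy.
    """
    votes = np.bincount(teacher_predictions, minlength=num_classes)
    noisy_votes = votes + rng.laplace(loc=0.0, scale=1.0 / gamma, size=num_classes)
    return int(np.argmax(noisy_votes))

# Toy example: 250 teachers (each trained on a disjoint data partition) vote on
# a 10-class query; most agree on class 3, a minority votes randomly.
preds = np.concatenate([np.full(200, 3), rng.integers(0, 10, size=50)])
label_for_student = noisy_aggregate(preds, num_classes=10)
print(label_for_student)  # almost always 3; the noise masks any single teacher's influence
```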


Link:

https://openreview.net/forum?id=HkwoSDPgg&noteId=HkwoSDPgg


Abstract of the paper "Making Neural Programming Architectures Generalize via Recursion":

Empirically, neural networks that attempt to learn programs from data have exhibited poor generalizability. Moreover, it has traditionally been difficult to reason about the behavior of these models beyond a certain level of input complexity. In order to address these issues, we propose augmenting neural architectures with a key abstraction: recursion. As an application, we implement recursion in the Neural Programmer-Interpreter framework on four tasks: grade-school addition, bubble sort, topological sort, and quicksort. We demonstrate superior generalizability and interpretability with small amounts of training data. Recursion divides the problem into smaller pieces and drastically reduces the domain of each neural network component, making it tractable to prove guarantees about the overall system’s behavior. Our experience suggests that in order for neural architectures to robustly learn program semantics, it is necessary to incorporate a concept like recursion.
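
As a concrete (non-neural) illustration of the decomposition the paper relies on, the sketch below writes quicksort so that each invocation handles only one partition step plus two recursive sub-calls. In the paper's setting, each such call corresponds to a subprogram invocation of the learned controller, which is what keeps every component's input domain small and bounded; the plain-Python code here only makes that structural point visible and does not represent the Neural Programmer-Interpreter model itself.

```python
# Recursive quicksort as an illustration of bounded per-call scope.
def partition(a, lo, hi):
    """Lomuto partition: place a[hi] at its sorted position and return that index."""
    pivot = a[hi]
    i = lo
    for j in range(lo, hi):
        if a[j] <= pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]
    return i

def quicksort(a, lo, hi):
    # The base case keeps every invocation's work short and bounded,
    # which is what makes per-call behaviour easy to reason about.
    if lo >= hi:
        return
    p = partition(a, lo, hi)
    quicksort(a, lo, p - 1)   # recursive call on the left piece
    quicksort(a, p + 1, hi)   # recursive call on the right piece

data = [5, 2, 9, 1, 7, 3]
quicksort(data, 0, len(data) - 1)
print(data)  # [1, 2, 3, 5, 7, 9]
```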


Link:

https://openreview.net/forum?id=BkbY4psgg&noteId=BkbY4psgg


Original post:

http://weibo.com/1657470871/Ex8jvqidM?type=comment
