Column: VALSE
The main purpose of the annual VALSE (Vision and Learning Seminar) workshop is to provide a platform for in-depth academic exchange among young Chinese scholars working in computer vision, image processing, pattern recognition, and machine learning.

【VALSE Webinar Session 17-13】

VALSE · Official Account · 2017-06-30 17:36




 

Speaker 1: Yinpeng Dong (Tsinghua University)

Time: Wednesday, July 5, 2017, 20:00-20:20 (Beijing time)

Title: Improving Interpretability of Deep Neural Networks with Semantic Information

Host: Hang Su (Tsinghua University)


Abstract:

Interpretability plays a vital role in the development of deep neural networks: interpretable learning lets us understand a model's strengths and weaknesses, anticipate its future behavior, and repair its potential flaws. However, the black-box nature of deep learning makes it difficult to understand the inner workings of deep neural networks. To address this, we propose to exploit the rich semantic information in human descriptions to improve the interpretability of deep neural networks. In the video captioning task, we first extract topic representations from the descriptions as meaningful semantic information, and then incorporate them into network training through an interpretive loss. We further propose a prediction difference maximization method to interpret the features learned by individual neurons. Experimental results show that this approach yields interpretable features while also improving task performance. With a thorough understanding of the internal feature representations, users can easily fix the model's errors through a human-in-the-loop procedure.
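As a rough illustration of how a topic-based interpretive loss could enter training, here is a minimal NumPy sketch. The mean pooling, the squared-L2 form, and the trade-off weight `lam` are assumptions made for illustration, not the talk's exact formulation.

```python
import numpy as np

def interpretive_loss(hidden_features, topic_vector):
    """Minimal sketch: pull a layer's pooled features toward a topic
    representation extracted from the human description.  The pooling
    scheme and squared-L2 form are assumptions, not the paper's loss.

    hidden_features: (T, D) per-frame features from an intermediate layer
    topic_vector:    (D,)   topic representation of the ground-truth caption
    """
    pooled = hidden_features.mean(axis=0)        # aggregate over time steps
    return np.sum((pooled - topic_vector) ** 2)  # distance to the topic vector

# Hypothetical use during training (lam is an assumed trade-off weight):
# total_loss = captioning_loss + lam * interpretive_loss(h, topic)
```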


Speaker bio:

Yinpeng Dong is a first-year Ph.D. student in the TSAIL Group at the Institute for Artificial Intelligence, Department of Computer Science and Technology, Tsinghua University, advised by Associate Professor Jun Zhu. His research interests include machine learning, deep learning, and their applications in computer vision, with a current focus on the interpretability and robustness of deep neural networks. From June to September 2016 he was a visiting student at the Robotics Institute, Carnegie Mellon University, advised by Prof. Fernando De la Torre.


Homepage: http://ml.cs.tsinghua.edu.cn/~yinpeng


 

Speaker 2: Kui Jia (School of Electronic and Information Engineering, South China University of Technology)

Time: Wednesday, July 5, 2017, 20:20-20:40 (Beijing time)

Title: An Exploration of Optimizing Convolutional Kernels in Deep Neural Networks (Improving Training of Deep Neural Networks via Singular Value Bounding)

Host: Hang Su (Tsinghua University)


Abstract:

In this work, we investigate properties of network solutions that can potentially lead to good performance. Our research is inspired by theoretical and empirical results that use orthogonal matrices to initialize networks, but we are interested in how orthogonal weight matrices perform when network training converges. To this end, we propose to constrain the solutions of weight matrices to the orthogonal feasible set during the whole process of network training, and we achieve this by a simple yet effective method called Singular Value Bounding (SVB). In SVB, all singular values of each weight matrix are simply bounded in a narrow band around the value of 1. Based on the same motivation, we also propose Bounded Batch Normalization (BBN), which improves Batch Normalization by removing its potential risk of ill-conditioned layer transforms. We present both theoretical and empirical results to justify the proposed methods. Experiments on benchmark image classification datasets show the efficacy of SVB and BBN. In particular, we achieve state-of-the-art error rates of 3.06% on CIFAR10 and 16.90% on CIFAR100 using off-the-shelf network architectures (Wide ResNets). Our preliminary results on ImageNet also show promise for large-scale learning.
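To make the SVB step concrete, here is a minimal NumPy sketch of the projection the abstract describes: every few SGD iterations, each weight matrix's singular values are clipped into a narrow band around 1. The band parametrization `[1/(1+eps), 1+eps]` and the value of `eps` are assumptions for illustration; the paper's exact schedule and band may differ.

```python
import numpy as np

def singular_value_bounding(W, eps=0.05):
    """Project W so that all its singular values lie in a narrow band
    around 1 (the band [1/(1+eps), 1+eps] and eps are assumptions)."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    s = np.clip(s, 1.0 / (1.0 + eps), 1.0 + eps)  # bound the spectrum
    return (U * s) @ Vt                            # re-assemble the matrix

# A conv kernel of shape (C_out, C_in, k, k) would first be reshaped to
# a 2-D matrix (C_out, C_in * k * k), bounded, then reshaped back.
```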


Speaker bio:

Kui Jia, Ph.D., is a Professor and doctoral supervisor in the School of Electronic and Information Engineering, South China University of Technology, where he leads the Active Perception and Structured Intelligence group; he is a recipient of China's national Thousand Young Talents award. He received his bachelor's degree from Northwestern Polytechnical University in 2001, his master's degree from the National University of Singapore in 2004, and his Ph.D. in Computer Science from Queen Mary, University of London in 2007. After his Ph.D. he held teaching and research positions at the Shenzhen Institutes of Advanced Technology (Chinese Academy of Sciences), the Chinese University of Hong Kong, the Singapore-based research institute of the University of Illinois at Urbana-Champaign, and the University of Macau. His main research interests are computer vision, machine learning, image processing, and pattern recognition. He has published more than 50 papers in top journals and conferences in computer vision and pattern recognition, including TPAMI, IJCV, TSP, TIP, ICCV, and CVPR.


 

Speaker 3: Bolei Zhou (MIT)

Time: Wednesday, July 5, 2017, 20:40-21:00 (Beijing time)

Title: Network Dissection: Quantifying Interpretability of Deep Visual Representations

Host: Hang Su (Tsinghua University)


Abstract:

We propose a general framework called Network Dissection for quantifying the interpretability of latent representations of CNNs by evaluating the alignment between individual hidden units and a set of semantic concepts. Given any CNN model, the proposed method draws on a broad data set of visual concepts to score the semantics of hidden units at each intermediate convolutional layer. The units with semantics are given labels across a range of objects, parts, scenes, textures, materials, and colors. We use the proposed method to test the hypothesis that interpretability of units is equivalent to random linear combinations of units, then we apply our method to compare the latent representations of various networks when trained to solve different supervised and self-supervised training tasks. We further analyze the effect of training iterations, compare networks trained with different initializations, examine the impact of network depth and width, and measure the effect of dropout and batch normalization on the interpretability of deep visual representations. We demonstrate that the proposed method can shed light on characteristics of CNN models and training methods that go beyond measurements of their discriminative power.
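The unit-concept alignment at the heart of this framework can be sketched as an intersection-over-union score between a thresholded activation map and a concept's segmentation mask. The high-quantile threshold below is an assumption chosen only to illustrate the idea; the paper's exact scoring procedure may differ.

```python
import numpy as np

def unit_concept_iou(activations, concept_masks, quantile=0.995):
    """Score how well one hidden unit aligns with one visual concept.

    activations:   (N, H, W) the unit's activation maps, upsampled to
                   the resolution of the concept annotations
    concept_masks: (N, H, W) boolean ground-truth masks of the concept
    The quantile threshold is an illustrative assumption.
    """
    t = np.quantile(activations, quantile)   # per-unit activation threshold
    binary = activations > t                 # regions where the unit fires
    inter = np.logical_and(binary, concept_masks).sum()
    union = np.logical_or(binary, concept_masks).sum()
    return inter / max(union, 1)             # IoU; a unit would be labeled
                                             # with its highest-scoring concept
```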


Speaker bio:

Bolei Zhou is a fifth-year Ph.D. candidate in the Computer Science and Artificial Intelligence Laboratory at MIT, working with Prof. Antonio Torralba. His research is on computer vision and machine learning, with particular interest in visual scene understanding and network interpretability. He is a recipient of the Facebook Fellowship, the Microsoft Research Asia Fellowship, and the MIT Greater China Fellowship. More details about his research are available at his homepage: http://people.csail.mit.edu/bzhou/.



 

Speaker 4: Hongteng Xu (Ph.D. graduate, School of Electrical and Computer Engineering, Georgia Tech)

Time: Wednesday, July 5, 2017, 21:00-21:20 (Beijing time)

Title: Fractal Dimension Invariant Filtering and Its CNN-based Implementation

Host: Junchi Yan (IBM; East China Normal University)


Abstract:

Fractal image models and analysis techniques have been highly successful in computer vision, particularly for texture synthesis and analysis. Their core idea is to exploit the bi-Lipschitz invariance of the fractal dimension to build robust features for representing and describing images. However, the invariance of the fractal dimension is generally not preserved under convolution or filtering, which has limited the use of this technique in the era of deep learning.


To address this problem, we design a nonlinear filtering technique with fractal dimension invariance (FDIF). The filter first applies anisotropic, adaptive filtering to preserve the local fractal dimension in a statistical sense, and then adjusts the measure of the filtered image through a nonlinear transform, so that the local fractal dimension remains invariant across measures (i.e., before and after filtering). The technique can be approximately implemented with existing convolutional neural network architectures, establishing a connection between certain network structures and fractal image analysis and providing a fractal-based geometric interpretation of such networks. We apply the technique to the analysis of low-SNR material images and achieve robust extraction of complex material structures.
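For intuition about the invariant that FDIF preserves, the sketch below estimates the box-counting fractal dimension of a binary image by fitting log N(s) against log s. This is a textbook estimator shown for illustration, not the paper's exact procedure.

```python
import numpy as np

def box_counting_dimension(mask, sizes=(2, 4, 8, 16, 32)):
    """Estimate the fractal dimension of a binary image via box counting:
    count the occupied s-by-s boxes N(s) at several scales and fit the
    slope of log N(s) vs. log s, since N(s) ~ s^(-D)."""
    H, W = mask.shape
    counts = []
    for s in sizes:
        h, w = H // s, W // s                    # grid of s-by-s boxes
        boxes = mask[:h * s, :w * s].reshape(h, s, w, s)
        occupied = boxes.any(axis=(1, 3)).sum()  # boxes touching the set
        counts.append(max(occupied, 1))          # avoid log(0)
    slope, _ = np.polyfit(np.log(sizes), np.log(counts), 1)
    return -slope
```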


Speaker bio:

Hongteng Xu is a Ph.D. candidate in the School of Electrical and Computer Engineering, Georgia Tech, graduating in August 2017. His primary research interest is machine learning and its applications in signal processing, data mining, and computer vision, especially point process models, manifold learning, and learning algorithms for the analysis and prediction of synchronous/asynchronous event sequences. His research achievements include 6 patents and nearly 30 publications in top conferences (ICML, IJCAI, AAAI, CVPR, ICCV) and journals (TKDE, TIP, TMM, TPAMI); he received travel awards at ICCV 2013 and ICML 2016, and was a finalist for the 2016 Baidu Fellowship.


Homepage: https://sites.google.com/site/htxu313/


 

Speaker 5: Zhou Ren (Snap Research)

Time: Wednesday, July 5, 2017, 21:20-21:40 (Beijing time)

Title: Deep Reinforcement Learning-based Image Captioning with Embedding Reward

Host: Junchi Yan (IBM; East China Normal University)


Abstract:

Image captioning is a challenging problem owing to the complexity in understanding the image content and diverse ways of describing it in natural language. Recent advances in deep neural networks have substantially improved the performance of this task. Most state-of-the-art approaches follow an encoder-decoder framework, which generates captions using a sequential recurrent prediction model. However, in this paper, we introduce a novel decision-making framework for image captioning. We utilize a “policy network” and a “value network” to collaboratively generate captions. The policy network serves as a local guidance by providing the confidence of predicting the next word according to the current state. Additionally, the value network serves as a global and lookahead guidance by evaluating all possible extensions of the current state. In essence, it adjusts the goal of predicting the correct words towards the goal of generating captions similar to the ground truth captions. We train both networks using an actor-critic reinforcement learning model, with a novel reward defined by visual-semantic embedding. Extensive experiments and analyses on the Microsoft COCO dataset show that the proposed framework outperforms state-of-the-art approaches across different evaluation metrics.
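The way the two networks cooperate at decoding time can be sketched as a mixed score over all one-word extensions of a partial caption: the policy supplies local word probabilities and the value network supplies lookahead estimates. The linear combination and the weight `beta` are assumptions for illustration, not the paper's exact decoding rule.

```python
import numpy as np

def score_extensions(log_probs, values, beta=0.4):
    """Mix local and global guidance when extending a partial caption.

    log_probs: (V,) log pi(word | state) from the policy network
    values:    (V,) value-network estimates of each extended state
    The linear mix and beta are illustrative assumptions."""
    return beta * log_probs + (1.0 - beta) * values

# e.g., keep the top-k extensions in a beam search:
# best = np.argsort(-score_extensions(lp, v))[:k]
```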


Speaker bio:

Zhou Ren is a Research Scientist at Snap Research. He received his Ph.D. from UCLA in 2016, advised by Prof. Alan Yuille, and his M.Eng. (by research) from Nanyang Technological University in 2012, advised by Prof. Junsong Yuan. His research interests include joint image-text representation learning, multi-modal content understanding, image classification, hand gesture recognition, and shape analysis. He has published nearly 15 papers in international journals and conferences (including CVPR, ICCV, ACM Multimedia, BMVC, TPAMI, and TMM), holds 4 patents, and his work has been cited over 1,100 times. He received the 2016 IEEE Transactions on Multimedia Prize Paper Award (Best Paper Award) and was nominated for the CVPR 2017 Best Student Paper Award.


Homepage: http://web.cs.ucla.edu/~zhou.ren/



Special thanks to the main organizers of this Webinar:

VOOC committee members in charge: Hang Su (Tsinghua University), Chenqiang Gao (Chongqing University of Posts and Telecommunications)

VODB coordinating director: Xun Cao (Nanjing University)


How to participate:

1. All VALSE Webinar sessions are held online through the "group video" feature of the VALSE QQ groups. During a session the speaker uploads slides or shares the screen; attendees can see the slides, hear the speaker, and interact with the speaker by text or voice.

2. To participate, you need to join a VALSE QQ group. Groups A, B, C, D, and E are currently full, so apart from speakers and other invited guests, new members can only apply to join VALSE group F (group number: 594312623). Applications must include your name, affiliation, and status; all three are required. After joining, please set your nickname to your real name, status, and affiliation. Status codes: T for university and research-institute staff; I for industry R&D; D for Ph.D. students; M for Master's students.

3. Please download and install the latest Windows version of QQ. Group video is not supported on non-Windows systems such as Mac or Linux; mobile QQ can play the audio but cannot show the slides.

4. About 10 minutes before the session starts, the host will open the group video and send each group a link inviting members to join; simply click the link to enter.

5. During the session, please do not send flowers, lollipops, or other virtual gifts, and avoid off-topic messages, so that the session can proceed smoothly.

6. If you cannot hear the audio or see the video during the session, leaving and rejoining the group video usually solves the problem.

7. We recommend joining from a fast network, preferably over a wired connection.