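Column: VALSE

The main purpose of the annual VALSE (Vision and Learning Seminar) symposium is to provide young Chinese scholars in computer vision, image processing, pattern recognition and machine learning with a platform for in-depth academic exchange.

[VALSE Webinar 17-25]

VALSE · WeChat Official Account · 2017-10-15 20:59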


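VALSE ICCV 2017 special sessions are here: ICCV 2017, the biennial feast of computer vision, is about to begin. To better promote academic exchange, VALSE Webinar will hold three consecutive ICCV Pre-Conference sessions, presenting the latest ICCV 2017 papers and kicking off this year's ICCV excitement early.

The third session, on October 18, features 5 talks: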

Speaker 1: Sijia Cai (Hong Kong Polytechnic University)

Time: 20:00, Wednesday, October 18, 2017 (Beijing time)

Title: Higher-order Integration of Hierarchical Convolutional Activations for Fine-grained Visual Categorization

Host: Risheng Liu (Dalian University of Technology)

Abstract:

The success of fine-grained visual categorization (FGVC) relies heavily on modelling the appearance and interactions of various semantic parts. This makes FGVC very challenging because: (i) part annotation and detection require expert guidance and are very expensive; (ii) parts are of different sizes; and (iii) the part interactions are complex and of higher order. To address these issues, we propose an end-to-end framework based on higher-order integration of hierarchical convolutional activations for FGVC. By treating the convolutional activations as local descriptors, hierarchical convolutional activations can serve as a representation of local parts at different scales. A polynomial-kernel-based predictor is proposed to capture higher-order statistics of convolutional activations for modelling part interactions. To model inter-layer part interactions, we extend the polynomial predictor to integrate hierarchical activations via kernel fusion. Our work also provides a new perspective for combining convolutional activations from multiple layers. While hyper-columns simply concatenate maps from different layers, and holistically-nested networks use weighted fusion to combine side-outputs, our approach exploits higher-order intra-layer and inter-layer relations for better integration of hierarchical convolutional features. The proposed framework yields more discriminative representations and achieves competitive results on the widely used FGVC datasets.

References:

[1] Sijia Cai, Wangmeng Zuo and Lei Zhang, "Higher-order Integration of Hierarchical Convolutional Activations for Fine-grained Visual Categorization", in IEEE International Conference on Computer Vision, Venice, Italy, 2017.

Speaker bio:

Sijia Cai received his B.S. and M.S. degrees from Tianjin University in 2011 and 2014, respectively. He is currently a Ph.D. candidate in Prof. Lei Zhang’s group at the Hong Kong Polytechnic University. His research interests include optimization methods and machine learning algorithms for computer vision applications.

Speaker 2: Xudong Mao (City University of Hong Kong)

Time: 20:25, Wednesday, October 18, 2017 (Beijing time)

Title: Least Squares Generative Adversarial Networks

Host: Risheng Liu (Dalian University of Technology)

Abstract:

Unsupervised learning with generative adversarial networks (GANs) has proven hugely successful. Regular GANs hypothesize the discriminator as a classifier with the sigmoid cross entropy loss function. However, we found that this loss function may lead to the vanishing gradients problem during the learning process. To overcome this problem, we propose the Least Squares Generative Adversarial Networks (LSGANs), which adopt the least squares loss function for the discriminator. We show that minimizing the objective function of LSGAN amounts to minimizing the Pearson χ^2 divergence. LSGANs have two benefits over regular GANs. First, LSGANs are able to generate higher-quality images than regular GANs. Second, LSGANs are more stable during the learning process. We evaluate LSGANs on the LSUN and CIFAR-10 datasets, and the experimental results show that the images generated by LSGANs are of better quality than those generated by regular GANs. We also conduct two comparison experiments between LSGANs and regular GANs to illustrate the stability of LSGANs.
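As a rough illustration of the least squares objective mentioned in the abstract, the sketch below writes out the discriminator and generator losses on plain lists of discriminator outputs. It assumes the common 0/1 target coding (a = 0 for generated samples, b = 1 for real samples, c = 1 as the generator's target), which the abstract does not specify, and all function names are hypothetical. The last two helpers contrast the gradient of a sigmoid cross entropy generator loss with that of the least squares loss.

```python
import math


def lsgan_d_loss(d_real, d_fake, a=0.0, b=1.0):
    """Discriminator loss: push D(x) toward b and D(G(z)) toward a.

    The target values a and b are an assumed 0/1 coding.
    """
    real_term = sum((d - b) ** 2 for d in d_real) / len(d_real)
    fake_term = sum((d - a) ** 2 for d in d_fake) / len(d_fake)
    return 0.5 * real_term + 0.5 * fake_term


def lsgan_g_loss(d_fake, c=1.0):
    """Generator loss: push D(G(z)) toward c (assumed to equal b)."""
    return 0.5 * sum((d - c) ** 2 for d in d_fake) / len(d_fake)


def sigmoid_ce_g_grad(logit):
    """Derivative of -log(sigmoid(logit)) with respect to the logit."""
    return 1.0 / (1.0 + math.exp(-logit)) - 1.0


def least_squares_g_grad(output, c=1.0):
    """Derivative of 0.5 * (output - c)**2 with respect to the output."""
    return output - c
```

For a generated sample that the discriminator scores far on the "real" side (a large positive output), `sigmoid_ce_g_grad` shrinks toward zero, while `least_squares_g_grad` stays proportional to the distance from the target. This is one way to read the vanishing gradients remark above.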

References:

[1] Xudong Mao, Qing Li, Haoran Xie, Raymond Y.K. Lau, Zhen Wang, Stephen Paul Smolley, "Least Squares Generative Adversarial Networks", in IEEE International Conference on Computer Vision, Venice, Italy, 2017.

Speaker bio:

Xudong Mao received his BEng degree from Nankai University in 2011 and his MPhil degree from City University of Hong Kong in 2014. He is currently a PhD student at City University of Hong Kong, advised by Prof. Qing Li. From 2014 to 2016, he worked as a senior algorithm engineer at the Institute of Data Science and Technology (iDST) of Alibaba. His research interests are in computer vision and deep learning, especially generative adversarial networks and unsupervised learning.

Speaker 3: Wei Wei (Xi’an Jiaotong University)

Time: 20:50, Wednesday, October 18, 2017 (Beijing time)

Title: Should We Encode Rain Streaks in Video as Deterministic or Stochastic?

Host: Risheng Liu (Dalian University of Technology)

Abstract:

Videos taken in the wild sometimes contain unexpected rain streaks, which make subsequent video processing tasks more difficult. Rain streak removal in a video (RSRV) is thus an important issue and has been attracting much attention in computer vision. Unlike previous RSRV methods, which formulate rain streaks as deterministic information, this work first encodes the rain in a stochastic manner, i.e., as a patch-based mixture of Gaussians. This modification makes the proposed model capable of finely adapting to a wider range of rain variations, rather than only to the specific rain configurations handled by traditional methods. By integrating the spatiotemporal smoothness of moving objects and the low-rank structure of the background scene, we propose a concise model for RSRV, containing one likelihood term imposed on the rain streak layer and two prior terms on the moving object and background scene layers of the video. Experiments on videos with synthetic and real rain verify the superiority of the proposed method over state-of-the-art methods, both visually and quantitatively across various performance metrics.
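To make the idea of a stochastic, mixture-of-Gaussians encoding more concrete, here is a minimal sketch of the log-likelihood of one flattened patch under a Gaussian mixture. It assumes diagonal covariances and caller-supplied parameters purely for illustration; the abstract gives neither the paper's parameterization nor its inference procedure, and the function names are hypothetical.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log-density of x under a Gaussian with diagonal covariance."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def mog_log_likelihood(patch, weights, means, variances):
    """Log-likelihood of one flattened patch under a mixture of Gaussians.

    weights: mixing proportions (positive, summing to 1)
    means, variances: one vector per mixture component
    """
    log_terms = [
        math.log(w) + log_gaussian_diag(patch, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    # Log-sum-exp keeps the sum stable when the terms are very negative.
    top = max(log_terms)
    return top + math.log(sum(math.exp(t - top) for t in log_terms))
```

Summing such a term over all patches of a rain layer would give a likelihood of the kind the abstract refers to. The two prior terms on moving objects and background are outside the scope of this sketch.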

References:

[1] Wei Wei, Lixuan Yi, Qi Xie, Qian Zhao, Deyu Meng, Zongben Xu, "Should We Encode Rain Streaks in Video as Deterministic or Stochastic?", ICCV, 2017.

Speaker bio:

Wei Wei obtained his B.S. degree from the Mathematics Elite Class, School of Mathematics and Statistics, Xi’an Jiaotong University, in 2015. He is currently a master's student majoring in Statistics at the School of Mathematics and Statistics, Xi’an Jiaotong University, supervised by Professor Zongben Xu. His research interests include computer vision and machine learning. He works in the Machine Learning Group led by Professor Deyu Meng, focusing on noise modelling.

Speaker 4: Jiangtao Xie (Dalian University of Technology)

Time: 21:15, Wednesday, October 18, 2017 (Beijing time)

Title: Is Second-order Information Helpful for Large-scale Visual Recognition?

Host: Risheng Liu (Dalian University of Technology)

Abstract:

By stacking layers of convolution and nonlinearity, convolutional networks (ConvNets) effectively learn from low-level to high-level features and discriminative representations. Since the end goal of large-scale recognition is to delineate complex boundaries of thousands of classes, adequate exploration of feature distributions is important for realizing the full potential of ConvNets. However, state-of-the-art works concentrate only on deeper or wider architecture design, while rarely exploring feature statistics higher than first-order. We take a step towards addressing this problem. Our method consists of covariance pooling, instead of the most commonly used first-order pooling, of high-level convolutional features. The main challenges involved are robust covariance estimation given a small sample of high-dimensional features and use of the manifold structure of covariance matrices. To address these challenges, we present a Matrix Power Normalized Covariance (MPN-COV) method. We develop forward and backward propagation formulas for the nonlinear matrix functions so that MPN-COV can be trained end-to-end. In addition, we analyze both qualitatively and quantitatively its advantage over the well-known Log-Euclidean metric. On the ImageNet 2012 validation set, by incorporating MPN-COV we achieve gains of over 4%, 3% and 2.5% for AlexNet, VGG-M and VGG-16, respectively; integrating MPN-COV into a 50-layer ResNet outperforms ResNet-101 and is comparable to ResNet-152. The source code will be available on the project page: http://www.peihuali.org/MPN-COV.
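A minimal sketch of the two steps named in the abstract, covariance pooling followed by matrix power normalization, is given below for the exponent 1/2. To stay dependency-free it approximates the matrix square root with a coupled Newton-Schulz iteration instead of an eigendecomposition; that substitution, the 1/N covariance normalization, the toy features and all function names are assumptions of this sketch rather than details taken from the abstract.

```python
import math


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def covariance_pool(features):
    """Second-order pooling: N feature vectors of size d -> d x d covariance."""
    n, d = len(features), len(features[0])
    mean = [sum(f[j] for f in features) / n for j in range(d)]
    centered = [[f[j] - mean[j] for j in range(d)] for f in features]
    return [
        [sum(c[i] * c[j] for c in centered) / n for j in range(d)]
        for i in range(d)
    ]


def matrix_sqrt_newton_schulz(cov, iterations=20):
    """Approximate cov**0.5 for a symmetric positive definite matrix.

    Assumes a non-zero trace; the trace scaling keeps the iteration convergent.
    """
    d = len(cov)
    trace = sum(cov[i][i] for i in range(d))
    identity = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
    y = [[v / trace for v in row] for row in cov]
    z = identity
    for _ in range(iterations):
        zy = matmul(z, y)
        t = [
            [0.5 * (3.0 * identity[i][j] - zy[i][j]) for j in range(d)]
            for i in range(d)
        ]
        y = matmul(y, t)  # y converges to (cov / trace)**0.5
        z = matmul(t, z)  # z converges to (cov / trace)**-0.5
    scale = math.sqrt(trace)
    return [[v * scale for v in row] for row in y]


# Toy input: four 2-dimensional "convolutional features" (made-up values).
features = [[0.0, 0.0], [2.0, 2.0], [1.0, 0.0], [1.0, 2.0]]
cov = covariance_pool(features)
pooled = matrix_sqrt_newton_schulz(cov)
```

Multiplying `pooled` by itself should reproduce `cov` up to numerical error, which is a quick sanity check on the normalization step. In a network, the upper triangle of `pooled` would typically be flattened and passed to the classifier.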

References:

[1] Peihua Li, Jiangtao Xie, Qilong Wang and Wangmeng Zuo. Is Second-order Information Helpful for Large-scale Visual Recognition? IEEE Int. Conf. on Computer Vision (ICCV), pp. 2070-2078, 2017.

[2] Qilong Wang, Peihua Li, Wangmeng Zuo, Lei Zhang. RAID-G: Robust Estimation of Approximate Infinite Dimensional Gaussian with Application to Material Recognition. Int. Conf. on Computer Vision and Pattern Recognition (CVPR), pp. 4433-4441, 2016.

Speaker bio:

Jiangtao Xie is a fourth-year undergraduate in the Electronic Information Innovation Experimental Class at Dalian University of Technology. As a key member of the DLUT_VLG team, he achieved 5/50 in the iNaturalist Challenge at Fine-Grained Visual Categorization (FGVC) 2017, held in conjunction with CVPR 2017. His research interests include computer vision and deep learning.

Speaker 5: Pingping Zhang (Dalian University of Technology)
