
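[Notice] The code and data for 《深度学习之模型优化》 (Deep Learning Model Optimization) are now open source on GitHub; see this post for the book's references!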

有三AI · WeChat official account · 2024-07-10 18:04


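有三's new book, 《深度学习之模型优化:核心算法与案例实践》 (Deep Learning Model Optimization: Core Algorithms and Practical Cases), is now officially on sale. It is the seventh book I have written and published. You can buy it on our knowledge platform or from stores such as JD.com (京东) and Dangdang (当当). The accompanying code and data have also been uploaded to our official GitHub project.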


Open-source code project


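The 有三AI community maintains a consolidated GitHub project that collects our free practice cases, resources for our print books, and open-source electronic study handbooks. Its address is: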

https://github.com/longpeng2008/yousan.ai

The materials for the new book 《深度学习之模型优化:核心算法与案例实践》 have been uploaded there. If you have bought the book, please download them promptly.


About the book


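This is the second book in our series on using deep learning models, and it bridges the books before and after it. Building on the first book in the series, 《深度学习之模型设计》 (Deep Learning Model Design), it covers model design and compression methods in greater depth.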


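Chapter 4 covers lightweight model design and Chapter 8 covers automated model design; both can be seen as supplements to 《深度学习之模型设计》. The remaining topics, model pruning, model quantization, and model distillation, are the core techniques of model compression and optimization. We have also added material on model visualization to help readers understand models better.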


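Chapter 9 briefly introduces some commonly used open-source tools for model optimization and deployment, which also lays the groundwork for the next book, on model deployment.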


The main text runs to about 230 pages in 9 chapters. The table of contents is as follows:



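Chapter 1  Introduction

This chapter introduces the key ingredients behind the development of AI technology: data, models, frameworks, and hardware. Sufficient data paired with a good model is what makes it possible to learn complex knowledge, while frameworks and hardware are the software and hardware infrastructure that model training cannot do without. We hope that after reading this chapter, readers will fully appreciate that AI is, at its core, an integrated engineering discipline.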



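Chapter 2  Model Performance Evaluation

This chapter introduces common metrics for evaluating model performance, including parameter count, computational cost, memory access, and computation speed, and closes with an industry competition on model compression.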
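
As a rough illustration of two of these metrics, parameter count and computational cost, the sketch below applies the usual textbook formulas to a single 2D convolution layer. The function name `conv2d_cost` and the layer sizes are assumptions chosen for illustration; they are not taken from the book's code.

```python
def conv2d_cost(c_in, c_out, k, h_out, w_out, groups=1, bias=True):
    """Return (parameter count, multiply-accumulate count) for one conv layer.

    Hypothetical helper for illustration only.
    """
    # Each output channel owns a (c_in / groups) x k x k kernel.
    weights = (c_in // groups) * k * k * c_out
    params = weights + (c_out if bias else 0)
    # Every output position applies every weight once.
    macs = weights * h_out * w_out
    return params, macs


# 3x3 convolution, 64 -> 128 channels, 56x56 output feature map.
params, macs = conv2d_cost(64, 128, 3, 56, 56)
print(params, macs)
```

Memory access and measured speed depend on the hardware and the runtime, so they cannot be derived from a formula like this and have to be profiled on the target device.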



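Chapter 3  Model Visualization

This chapter gives a systematic introduction to model visualization, including visualization of model structure, of parameters and features, of input regions, and of activation patterns. By mastering the underlying principles and working through 3 representative practical cases, we can understand a model's performance and parameter details more deeply, which in turn provides guidance for designing and improving model structures.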



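Chapter 4  Lightweight Model Design

This chapter gives a systematic introduction to current methods for lightweight model design, including the use and design of convolution kernels, convolution factorization and grouped design, parameter and feature reuse, dynamic adaptive model design, optimization and design of the multiplication operations in convolution, re-parameterization, novel operator design, and low-rank sparsification. Starting from a lightweight base architecture greatly reduces the later work of compressing and optimizing the model further, which makes this one of the core parts of the book.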
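
To make the idea of convolution factorization concrete, here is a minimal sketch that compares the weight count of a standard convolution with that of a depthwise separable convolution (the construction used in Xception and MobileNets, both in the Chapter 4 references). The function names and layer sizes are illustrative assumptions, and biases are ignored.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k conv followed by a 1x1 pointwise conv."""
    depthwise = c_in * k * k   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1x1 convolution that mixes channels
    return depthwise + pointwise


std = standard_conv_params(64, 128, 3)
sep = depthwise_separable_params(64, 128, 3)
ratio = sep / std  # algebraically equal to 1/c_out + 1/k**2
print(std, sep, round(ratio, 4))
```

For a 3x3 kernel the ratio is dominated by the 1/k² term, so the factorized layer needs roughly one ninth of the weights plus a small pointwise overhead.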



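Chapter 5  Model Pruning

This chapter covers the main algorithms and practice of model pruning, chiefly sparsity regularization, unstructured pruning, and structured pruning. It ends with a practical case in which readers train the original model and then perform sparsity-based pruning after training, in the structured-pruning setting.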
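
One simple structured-pruning criterion is to rank the filters of a layer by the L1 norm of their weights and drop the weakest ones, as in "Pruning Filters for Efficient ConvNets" from the Chapter 5 references. The sketch below shows only that ranking step on toy data. It is not the book's case-study code, and the function name, the flat-list filter representation, and the keep ratio are assumptions for illustration.

```python
def prune_filters_by_l1(filters, keep_ratio):
    """Return the sorted indices of the filters to keep.

    filters: list of filters, each given as a flat list of weights.
    keep_ratio: fraction of filters to keep (at least one is always kept).
    """
    norms = [sum(abs(w) for w in f) for f in filters]
    n_keep = max(1, int(len(filters) * keep_ratio))
    # Rank filter indices from largest to smallest L1 norm.
    ranked = sorted(range(len(filters)), key=lambda i: norms[i], reverse=True)
    return sorted(ranked[:n_keep])


filters = [
    [0.9, -0.8, 0.7],     # L1 norm 2.4
    [0.01, 0.02, -0.01],  # L1 norm 0.04
    [0.5, 0.5, -0.4],     # L1 norm 1.4
    [0.0, 0.1, 0.0],      # L1 norm 0.1
]
kept = prune_filters_by_l1(filters, keep_ratio=0.5)
print(kept)
```

In a real network, removing a filter also removes the matching input channel of the next layer, and the pruned model is normally fine-tuned afterwards to recover accuracy.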



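Chapter 6  Model Quantization

This chapter covers the main algorithms and practice of model quantization, chiefly 1-bit quantization, symmetric and asymmetric 8-bit quantization, and mixed quantization. It ends with practical cases in which readers implement symmetric 8-bit quantization in code and work through model quantization and inference with the TensorRT framework.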
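
As a minimal sketch of symmetric 8-bit quantization, the code below maps a list of floats to integers in [-127, 127] using a single scale derived from the largest absolute value, then maps them back. This per-tensor max-abs scheme is one common variant and is assumed here for illustration. The book's own implementation may differ, and the function names are hypothetical.

```python
def quantize_symmetric_int8(values):
    """Quantize floats to integers in [-127, 127] with one shared scale."""
    max_abs = max(abs(v) for v in values)
    # Symmetric scheme: zero maps to zero, so no zero-point is needed.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Map quantized integers back to approximate float values."""
    return [x * scale for x in q]


weights = [-1.0, -0.4, 0.0, 0.25, 1.0]
q, scale = quantize_symmetric_int8(weights)
restored = dequantize(q, scale)
print(q)
print(restored)
```

The round-trip error of each value is at most half a quantization step (`scale / 2`). An asymmetric scheme would add a zero-point so that a range not centered on zero uses all 256 levels.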



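Chapter 7  Transfer Learning and Knowledge Distillation

This chapter covers the main algorithms and practice of model distillation, chiefly distillation algorithms based on optimization objectives and on structure matching. It ends with a practical case in which readers train a model with the classic knowledge distillation framework and compare the student model's performance before and after distillation.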
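
The classic knowledge distillation objective (Hinton et al., reference [1] for Chapter 7) compares temperature-softened teacher and student distributions. The sketch below computes that soft-target term for a single sample in plain Python. The temperature value and the logits are illustrative assumptions, and a full training loss would normally add a cross-entropy term on the ground-truth labels.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened outputs, scaled by T^2."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s) for t, s in zip(p_teacher, p_student))
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return kl * temperature ** 2


teacher = [5.0, 2.0, 0.5]
loss_same = distillation_loss(teacher, teacher)
loss_diff = distillation_loss([0.5, 2.0, 5.0], teacher)
print(loss_same, loss_diff)
```

The loss is zero when the student reproduces the teacher's logits and grows as the two softened distributions diverge.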



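Chapter 8  Automated Model Design

This chapter covers neural architecture search for automated model design, chiefly search methods based on grid search, reinforcement learning, and evolutionary algorithms, as well as differentiable architecture search. Automated model design is a demanding engineering technique, and it is also where model design and compression are ultimately headed.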



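Chapter 9  Model Optimization and Deployment Tools

This chapter introduces open-source model optimization and deployment tools commonly used in industry today, chiefly the optimization tools in the TensorFlow, PaddlePaddle, and PyTorch ecosystems, various general-purpose mobile inference frameworks, the ONNX standard, and TensorRT, NVIDIA's model optimization and deployment tool. It also includes a hands-on deployment on embedded hardware based on the NCNN framework. Proficiency with model optimization and deployment tools is a must for deep learning algorithm engineers. This chapter is meant as an entry-level reference; model deployment will be covered more systematically in the next book in the series.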


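For the details, please read the book itself.

The book progresses from basics to advanced material, is richly illustrated, follows recent developments in industry and academia, and ties theory closely to practice, with many figures, tables, and case analyses. It sets aside heavy mathematical theory and gives a complete walkthrough of the mainstream techniques for model compression and optimization. Rather than stopping at theory and simple result demos, it goes from solid theoretical grounding straight through to hands-on practice. We believe readers who follow along will gain a deeper understanding of model compression in deep learning.

This book is dedicated to the compression and optimization of deep learning models, deep convolutional neural networks in particular. The material is intermediate-to-advanced within deep learning and assumes some background, so we recommend learning the basics of CNN model design first.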


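[All hands-on algorithm code in this book uses the PyTorch framework; the TensorRT + Jetson development board inference code is written in Python; the NCNN + EAIDK-610 development board deployment code is written in C++.]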


References for the book


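Because of restrictions imposed by the publisher, all source citations were removed from the book, so we list all of the references here for readers to look up.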


Chapter 3, model visualization, references:

[1] Erhan D, Bengio Y, Courville A, et al. Visualizing higher-layer features of a deep network[J]. University of Montreal, 2009, 1341(3): 1.

[2] Simonyan K, Vedaldi A, Zisserman A. Deep inside convolutional networks: Visualising image classification models and saliency maps[J]. arXiv preprint arXiv:1312.6034, 2013.

[3] Yosinski J, Clune J, Nguyen A, et al. Understanding neural networks through deep visualization[J]. arXiv preprint arXiv:1506.06579, 2015.

[4] Mordvintsev A, Olah C, Tyka M. Inceptionism: Going deeper into neural networks[J]. 2015.

[5] Nguyen A, Dosovitskiy A, Yosinski J, et al. Synthesizing the preferred inputs for neurons in neural networks via deep generator networks[J]. Advances in neural information processing systems, 2016, 29.

[6] Zeiler M D, Fergus R. Visualizing and understanding convolutional networks[C]//European conference on computer vision. Springer, Cham, 2014: 818-833.

[7] Mahendran A, Vedaldi A. Understanding deep image representations by inverting them[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2015: 5188-5196.

[8] Mahendran A, Vedaldi A. Visualizing deep convolutional neural networks using natural pre-images[J]. International Journal of Computer Vision, 2016, 120(3): 233-255.

[9] Dosovitskiy A, Brox T. Inverting visual representations with convolutional networks[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2016: 4829-4837.

[10] Wei D, et al. Understanding Intra-Class Knowledge Inside CNN[J]. arXiv preprint arXiv:1507.02379, 2015.

[11] Zhou B, Khosla A, Lapedriza A, et al. Object detectors emerge in Deep Scene CNNs[J]. Computer Science, 2014.

[12] Bau D, Zhou B, Khosla A, et al. Network Dissection: Quantifying Interpretability of Deep Visual Representations[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2017.

[13] Smilkov D, Thorat N, Kim B, et al. Smoothgrad: removing noise by adding noise[J]. arXiv preprint arXiv:1706.03825, 2017.

[14] Sundararajan M, Taly A, Yan Q. Axiomatic attribution for deep networks[C]//International conference on machine learning. PMLR, 2017: 3319-3328.

[15] Springenberg J T, Dosovitskiy A, Brox T, et al. Striving for simplicity: The all convolutional net[J]. arXiv preprint arXiv:1412.6806, 2014.

[16] Zhou B, Khosla A, Lapedriza A, et al. Learning Deep Features for Discriminative Localization[C]//CVPR. IEEE Computer Society, 2016.

[17] Selvaraju R R, Cogswell M, Das A, et al. Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization[J]. International Journal of Computer Vision, 2020, 128(2): 336-359.

[18] Wang Z J, Turko R, Shaikh O, et al. CNN explainer: learning convolutional neural networks with interactive visualization[J]. IEEE Transactions on Visualization and Computer Graphics, 2020, 27(2): 1396-1406.

[19] Liu M, Shi J, Li Z, et al. Towards better analysis of deep convolutional neural networks[J]. IEEE transactions on visualization and computer graphics, 2016, 23(1): 91-100.

Chapter 4, lightweight model design, references:

[1] Lin M, Chen Q, Yan S. Network in network[J]. arXiv preprint arXiv:1312.4400, 2013.

[2] Zeiler M D, Fergus R. Visualizing and understanding convolutional networks[C]//European conference on computer vision. Springer, Cham, 2014: 818-833.

[3] Iandola F N, Han S, Moskewicz M W, et al. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size[J]. arXiv preprint arXiv:1602.07360, 2016.

[4] Jin J, Dundar A, Culurciello E. Flattened convolutional neural networks for feedforward acceleration[J]. arXiv preprint arXiv:1412.5474, 2014.

[5] Szegedy C, Vanhoucke V, Ioffe S, et al. Rethinking the inception architecture for computer vision[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 2818-2826.

[6] Sifre L, Mallat S. Rigid-Motion Scattering for Texture Classification[J]. Computer Science, 2014.

[7] Chollet F. Xception: Deep learning with depthwise separable convolutions[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2017: 1251-1258.

[8] Howard A G, Zhu M, Chen B, et al. Mobilenets: Efficient convolutional neural networks for mobile vision applications[J]. arXiv preprint arXiv:1704.04861, 2017.

[9] Sandler M, Howard A, Zhu M, et al. Mobilenetv2: Inverted residuals and linear bottlenecks[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2018: 4510-4520.

[10] Zhang X, Zhou X, Lin M, et al. Shufflenet: An extremely efficient convolutional neural network for mobile devices[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018: 6848-6856.

[11] Ma N, Zhang X, Zheng H T, et al. Shufflenet v2: Practical guidelines for efficient cnn architecture design[C]//Proceedings of the European conference on computer vision (ECCV). 2018: 116-131.

[12] Zhang T, Qi G J, Xiao B, et al. Interleaved group convolutions[C]//Proceedings of the IEEE International Conference on Computer Vision. 2017: 4373-4382.

[13] Xie G, Wang J, Zhang T, et al. Interleaved structured sparse convolutional neural networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018: 8847-8856.

[14] Sun K, Li M, Liu D, et al. Igcv3: Interleaved low-rank group convolutions for efficient deep neural networks[J]. arXiv preprint arXiv:1806.00178, 2018.

[15] Tan M, Le Q V. MixNet: Mixed Depthwise Convolutional Kernels[J]. arXiv preprint arXiv:1907.09595, 2019.

[16] Chen C F, Fan Q, Mallinar N, et al. Big-little net: An efficient multi-scale feature representation for visual and speech recognition[J]. arXiv preprint arXiv:1807.03848, 2018.

[17] Chen Y, Fang H, Xu B, et al. Drop an octave: Reducing spatial redundancy in convolutional neural networks with octave convolution[J]. arXiv preprint arXiv:1904.05049, 2019.

[18] Gennari M, Fawcett R, Prisacariu V A. DSConv: Efficient Convolution Operator[J]. arXiv preprint arXiv:1901.01928, 2019.

[19] Shang W, Sohn K, Almeida D, et al. Understanding and improving convolutional neural networks via concatenated rectified linear units[C]//International conference on machine learning. PMLR, 2016: 2217-2225.

[20] Han K, Wang Y, Tian Q, et al. GhostNet: More Features from Cheap Operations[J]. arXiv: Computer Vision and Pattern Recognition, 2019.

[21] Huang G, Liu Z, Van Der Maaten L, et al. Densely connected convolutional networks[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2017: 4700-4708.

[22] Jin X, Yang Y, Xu N, et al. Wsnet: Compact and efficient networks through weight sampling[C]//International Conference on Machine Learning. PMLR, 2018: 2352-2361.

[23] Zhou D, Jin X, Wang K, et al. Deep model compression via filter auto-sampling[J]. arXiv preprint, 2019.

[24] Huang G, Sun Y, Liu Z, et al. Deep networks with stochastic depth[C]//European conference on computer vision. Springer, Cham, 2016: 646-661.

[25] Veit A, Wilber M J, Belongie S. Residual networks behave like ensembles of relatively shallow networks[C]//Advances in neural information processing systems. 2016: 550-558.

[26] Teerapittayanon S, McDanel B, Kung H T. Branchynet: Fast inference via early exiting from deep neural networks[C]//2016 23rd International Conference on Pattern Recognition (ICPR). IEEE, 2016: 2464-2469.

[27] Graves A. Adaptive computation time for recurrent neural networks[J]. arXiv preprint arXiv:1603.08983, 2016.

[28] Figurnov M, Collins M D, Zhu Y, et al. Spatially adaptive computation time for residual networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017: 1039-1048.

[29] Wu Z, Nagarajan T, Kumar A, et al. Blockdrop: Dynamic inference paths in residual networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018: 8817-8826.

[30] Wang X, Yu F, Dou Z Y, et al. Skipnet: Learning dynamic routing in convolutional networks[C]//Proceedings of the European Conference on Computer Vision (ECCV). 2018: 409-424.

[31] Almahairi A, Ballas N, Cooijmans T, et al. Dynamic capacity networks[C]//International Conference on Machine Learning. PMLR, 2016: 2549-2558.

[32] Wu B, Wan A, Yue X, et al. Shift: A Zero FLOP, Zero Parameter Alternative to Spatial Convolutions[C]. computer vision and pattern recognition, 2018: 9127-9135.

[33] Chen W, Xie D, Zhang Y, et al. All you need is a few shifts: Designing efficient convolutional neural networks for image classification[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019: 7241-7250.

[34] He Y, Liu X, Zhong H, et al. Addressnet: Shift-based primitives for efficient convolutional neural networks[C]//2019 IEEE Winter conference on applications of computer vision (WACV). IEEE, 2019: 1213-1222.

[35] Chen H, Wang Y, Xu C, et al. AdderNet: Do we really need multiplications in deep learning?[C]//Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2020: 1468-1477.

[36] You H, Chen X, Zhang Y, et al. Shiftaddnet: A hardware-inspired deep network[J]. Advances in Neural Information Processing Systems, 2020, 33: 2771-2783.

[37] Li D, Wang X, Kong D. Deeprebirth: Accelerating deep neural network execution on mobile devices[C]//Proceedings of the AAAI Conference on Artificial Intelligence. 2018, 32(1).

[38] Ding X, Zhang X, Ma N, et al. Repvgg: Making vgg-style convnets great again[C]//Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2021: 13733-13742.

[39] Li D, Hu J, Wang C, et al. Involution: Inverting the inherence of convolution for visual recognition[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021: 12321-12330.

[40] Denton E L, Zaremba W, Bruna J, et al. Exploiting linear structure within convolutional networks for efficient evaluation[C]//Advances in Neural Information Processing Systems. 2014: 1269-1277.

Chapter 5, model pruning, references:

[1] Wen W, Wu C, Wang Y, et al. Learning Structured Sparsity in Deep Neural Networks[J]. 2016.

[2] Liu Z, Li J, Shen Z, et al. Learning efficient convolutional networks through network slimming[C]//Proceedings of the IEEE International Conference on Computer Vision. 2017: 2736-2744.

[3] Luo J, Wu J. AutoPruner: An End-to-End Trainable Filter Pruning Method for Efficient Deep Model Inference[J]. arXiv: Computer Vision and Pattern Recognition, 2018.

[4] Huang Z, Wang N. Data-driven sparse structure selection for deep neural networks[C]//Proceedings of the European Conference on Computer Vision (ECCV). 2018: 304-320.

[5] LeCun Y, Denker J S, Solla S A. Optimal brain damage[C]//Advances in neural information processing systems. 1990: 598-605.

[6] Lee N, Ajanthan T, Torr P H S. Snip: Single-shot network pruning based on connection sensitivity[J]. arXiv preprint arXiv:1810.02340, 2018.

[7] Han S, Pool J, Tran J, et al. Learning both weights and connections for efficient neural network[C]//Advances in neural information processing systems. 2015: 1135-1143.

[8] Guo Y, Yao A, Chen Y. Dynamic network surgery for efficient dnns[C]//Advances In Neural Information Processing Systems. 2016: 1379-1387.

[9] Anwar S, Hwang K, Sung W. Structured Pruning of Deep Convolutional Neural Networks[J]. ACM Journal on Emerging Technologies in Computing Systems, 2015.

[10] Li H, Kadav A, Durdanovic I, et al. Pruning Filters for Efficient ConvNets[J]. arXiv: Computer Vision and Pattern Recognition, 2016.

[11] Hu H, Peng R, Tai Y W, et al. Network trimming: A data-driven neuron pruning approach towards efficient deep architectures[J]. arXiv preprint arXiv:1607.03250, 2016.

[12] He Y, Liu P, Wang Z, et al. Filter pruning via geometric median for deep convolutional neural networks acceleration[C]//Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2019: 4340-4349.

[13] He Y, Zhang X, Sun J. Channel pruning for accelerating very deep neural networks[C]//Proceedings of the IEEE International Conference on Computer Vision. 2017: 1389-1397.

[14] Luo J H, Zhang H, Zhou H Y, et al. Thinet: pruning cnn filters for a thinner net[J]. IEEE transactions on pattern analysis and machine intelligence, 2018.

[15] Molchanov P, Tyree S, Karras T, et al. Pruning Convolutional Neural Networks for Resource Efficient Inference[C]. international conference on learning representations, 2017.

[16] Zhuang Z, Tan M, Zhuang B, et al. Discrimination-aware Channel Pruning for Deep Neural Networks[C]. neural information processing systems, 2018: 883-894.

[17] Liu Z, Sun M, Zhou T, et al. Rethinking the value of network pruning[J]. arXiv preprint arXiv:1810.05270, 2018.

[18] Zhu M, Gupta S. To prune, or not to prune: exploring the efficacy of pruning for model compression[J]. arXiv: Machine Learning, 2017.

[19] Yu R, Li A, Chen C F, et al. Nisp: Pruning networks using neuron importance score propagation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018: 9194-9203.

[20] Lin J, Rao Y, Lu J, et al. Runtime Neural Pruning[C]. neural information processing systems, 2017: 2181-2191.

[21] Ye J, Lu X, Lin Z, et al. Rethinking the smaller-norm-less-informative assumption in channel pruning of convolution layers[J]. arXiv preprint arXiv:1802.00124, 2018.

[22] Lee N, Ajanthan T, Torr P H, et al. SNIP: SINGLE-SHOT NETWORK PRUNING BASED ON CONNECTION SENSITIVITY[C]. international conference on learning representations, 2019.

Chapter 6, model quantization, references:

[1] Courbariaux M, Bengio Y, David J, et al. BinaryConnect: training deep neural networks with binary weights during propagations[C]. neural information processing systems, 2015: 3123-3131.

[2] Courbariaux M, Hubara I, Soudry D, et al. Binarized neural networks: Training deep neural networks with weights and activations constrained to +1 or -1[J]. arXiv preprint arXiv:1602.02830, 2016.

[3] Liu Z, Wu B, Luo W, et al. Bi-Real Net: Enhancing the Performance of 1-Bit CNNs with Improved Representational Capability and Advanced Training Algorithm[C]. european conference on computer vision, 2018: 747-763.

[4] Rastegari M, Ordonez V, Redmon J, et al. Xnor-net: Imagenet classification using binary convolutional neural networks[C]//European conference on computer vision. Springer, Cham, 2016: 525-542.

[5] Bulat A, Tzimiropoulos G. Xnor-net++: Improved binary neural networks[J]. arXiv preprint arXiv:1909.13863, 2019.

[6] Li F, Zhang B, Liu B. Ternary weight networks[J]. arXiv preprint arXiv:1605.04711, 2016.

[7] Zhu C, Han S, Mao H, et al. Trained ternary quantization[J]. arXiv preprint arXiv:1612.01064, 2016.

[8] Ding R, Chin T, Liu Z, et al. Regularizing Activation Distribution for Training Binarized Deep Networks[C]. computer vision and pattern recognition, 2019: 11408-11417.

[9] Darabi S, Belbahri M, Courbariaux M, et al. Regularized binary network training[J]. arXiv preprint arXiv:1812.11800, 2018.

[10] Bulat A, Tzimiropoulos G, Kossaifi J, et al. Improved training of binary networks for human pose estimation and image recognition[J]. arXiv preprint arXiv:1904.05868, 2019.

[11] Martinez B, Yang J, Bulat A, et al. Training binary neural networks with real-to-binary convolutions[J]. arXiv preprint arXiv:2003.11535, 2020.

[12] Liu Z, Shen Z, Savvides M, et al. Reactnet: Towards precise binary neural network with generalized activation functions[C]//Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XIV 16. Springer International Publishing, 2020: 143-159.

[13] Zhang Y, Pan J, Liu X, et al. FracBNN: Accurate and FPGA-efficient binary neural networks with fractional activations[C]//The 2021 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. 2021: 171-182.

[14] Zhang Y, Zhang Z, Lew L. Pokebnn: A binary pursuit of lightweight accuracy[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022: 12475-12485.

[15] Guo N, Bethge J, Meinel C, et al. Join the High Accuracy Club on ImageNet with A Binary Neural Network Ticket[J]. arXiv preprint arXiv:2211.12933, 2022.

[16] Jacob B, Kligys S, Chen B, et al. Quantization and training of neural networks for efficient integer-arithmetic-only inference[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018: 2704-2713.

[17] Hubara I, Courbariaux M, Soudry D, et al. Quantized neural networks: Training neural networks with low precision weights and activations[J]. The Journal of Machine Learning Research, 2017, 18(1): 6869-6898.

[18] Zhou S, Wu Y, Ni Z, et al. DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients[J]. arXiv: Neural and Evolutionary Computing, 2016.

[19] Wang K, Liu Z, Lin Y, et al. HAQ: Hardware-Aware Automated Quantization with Mixed Precision[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2019: 8612-8620.

[20] Micikevicius P, Narang S, Alben J, et al. Mixed precision training[J]. arXiv preprint arXiv:1710.03740, 2017.

[21] Han S, Mao H, Dally W J. Deep compression: Compressing deep neural networks with pruning, trained quantization and huffman coding[J]. arXiv preprint arXiv:1510.00149, 2015.

[22] Zhang D, Yang J, Ye D, et al. LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks[C]. european conference on computer vision, 2018: 373-390.

[23] Choi J, Wang Z, Venkataramani S, et al. Pact: Parameterized clipping activation for quantized neural networks[J]. arXiv preprint arXiv:1805.06085, 2018.

[24] Zhou A, Yao A, Guo Y, et al. Incremental network quantization: Towards lossless cnns with low-precision weights[J]. arXiv preprint arXiv:1702.03044, 2017.

[25] Zhu F, Gong R, Yu F, et al. Towards Unified INT8 Training for Convolutional Neural Network[J]. arXiv: Learning, 2019.

Chapter 7, model distillation, references:

[1] Hinton G, Vinyals O, Dean J. Distilling the knowledge in a neural network[J]. arXiv preprint arXiv:1503.02531, 2015, 2(7).

[2] Xu Z, Hsu Y, Huang J, et al. Training Shallow and Thin Networks for Acceleration via Knowledge Distillation with Conditional Adversarial Networks[J]. arXiv: Learning, 2017.

[3] Ravi S. Projectionnet: Learning efficient on-device deep networks using neural projections[J]. arXiv preprint arXiv:1708.00630, 2017.

[4] Romero A, Ballas N, Kahou S E, et al. Fitnets: Hints for thin deep nets[J]. arXiv preprint arXiv:1412.6550, 2014.

[5] Huang Z, Wang N. Like What You Like: Knowledge Distill via Neuron Selectivity Transfer[J]. arXiv: Computer Vision and Pattern Recognition, 2017.

[6] Zagoruyko S, Komodakis N. Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer[C]. international conference on learning representations, 2017.

[7] Yim J, Joo D, Bae J, et al. A Gift from Knowledge Distillation: Fast Optimization, Network Minimization and Transfer Learning[C]. computer vision and pattern recognition, 2017: 7130-7138.

[8] Zhang Y, Xiang T, Hospedales T M, et al. Deep mutual learning[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018: 4320-4328.

[9] Zhang L, Song J, Gao A, et al. Be your own teacher: Improve the performance of convolutional neural networks via self distillation[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. 2019: 3713-3722.

[10] Furlanello T, Lipton Z C, Tschannen M, et al. Born Again Neural Networks[C]. international conference on machine learning, 2018: 1602-1611.

[11] Cho J H, Hariharan B. On the efficacy of knowledge distillation[C]//Proceedings of the IEEE/CVF international conference on computer vision. 2019: 4794-4802.

[12] Yuan L, Tay F E, Li G, et al. Revisit Knowledge Distillation: a Teacher-free Framework[J]. arXiv: Computer Vision and Pattern Recognition, 2019.

Chapter 8, automated model design, references:

[1] Cubuk E D, Zoph B, Mane D, et al. AutoAugment: Learning Augmentation Policies from Data[J]. arXiv: Computer Vision and Pattern Recognition, 2018.

[2] Zoph B, Cubuk E D, Ghiasi G, et al. Learning Data Augmentation Strategies for Object Detection[J]. arXiv preprint arXiv:1906.11172, 2019.

[3] Ho D, Liang E, Stoica I, et al. Population Based Augmentation: Efficient Learning of Augmentation Policy Schedules[J]. arXiv preprint arXiv:1905.05393, 2019.

[4] Eger S, Youssef P, Gurevych I. Is it time to swish? comparing deep learning activation functions across NLP tasks[J]. arXiv preprint arXiv:1901.02671, 2019.

[5] Luo P, Ren J, Peng Z, et al. Differentiable learning-to-normalize via switchable normalization[J]. arXiv preprint arXiv:1806.10779, 2018.

[6] Bello I, Zoph B, Vasudevan V, et al. Neural optimizer search with reinforcement learning[C]//International Conference on Machine Learning. PMLR, 2017: 459-468.

[7] Tan M, Le Q V. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks[C]. international conference on machine learning, 2019: 6105-6114.

[8] Tan M, Le Q V. MixNet: Mixed Depthwise Convolutional Kernels[J]. arXiv preprint arXiv:1907.09595, 2019.

[9] Zoph B, Le Q V. Neural Architecture Search with Reinforcement Learning[J]. international conference on learning representations, 2017.

[10] Zoph B, Vasudevan V, Shlens J, et al. Learning Transferable Architectures for Scalable Image Recognition[J]. computer vision and pattern recognition, 2018: 8697-8710.

[11] Tan M, Chen B, Pang R, et al. MnasNet: Platform-Aware Neural Architecture Search for Mobile[J]. arXiv: Computer Vision and Pattern Recognition, 2018.

[12] Xie L, Yuille A. Genetic cnn[C]//Proceedings of the IEEE international conference on computer vision. 2017: 1379-1388.

[13] Real E, Aggarwal A, Huang Y, et al. Regularized Evolution for Image Classifier Architecture Search[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2018, 33.

[14] Liu H, Simonyan K, Yang Y, et al. DARTS: Differentiable Architecture Search[J]. arXiv: Learning, 2018.

[15] Cai H, Zhu L, Han S. Proxylessnas: Direct neural architecture search on target task and hardware[J]. arXiv preprint arXiv:1812.00332, 2018.

[16] He Y, Lin J, Liu Z, et al. Amc: Automl for model compression and acceleration on mobile devices[C]//Proceedings of the European Conference on Computer Vision (ECCV). 2018: 784-800.

[17] Wang K, Liu Z, Lin Y, et al. HAQ: Hardware-Aware Automated Quantization with Mixed Precision[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2019: 8612-8620.

[18] Ashok A, Rhinehart N, Beainy F, et al. N2N Learning: Network to Network Compression via Policy Gradient Reinforcement Learning[J]. arXiv: Learning, 2017.

[19] Pham H, Guan M, Zoph B, et al. Efficient neural architecture search via parameters sharing[C]//International conference on machine learning. PMLR, 2018: 4095-4104.

For further references, please look them up yourself as you study.


Limited signed edition


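If you would like a signed copy, you can order it directly on our course platform; look for the 【专属签名版书籍】 (exclusive signed edition) listing.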



After purchasing, contact 有三 directly on WeChat (ID: Longlongtogo) with your order record and send your shipping address. (Invoices can be issued.)



In addition, starting today, those who join the 有三AI-CV中阶-模型算法组 (有三AI CV intermediate course, model algorithms group) …






