The tutorial on optimization and large-scale machine learning that Tie-Yan Liu and colleagues from Microsoft Research Asia recently presented at AAAI 2017, "Recent Advances in Distributed Machine Learning", is well worth a look. It gives a fairly thorough summary of classical optimization algorithms, in particular their theoretical properties and the corresponding theory for their distributed counterparts, making it a good starting point for researchers and engineers who want a quick overview of these areas. The tutorial also covers DMTK, including its strengths and weaknesses as a distributed computing platform, and briefly compares it with popular frameworks such as Spark and TensorFlow.
Abstract:
In recent years, artificial intelligence has demonstrated its power in many important applications. Besides novel machine learning algorithms (for example, deep neural networks), their distributed implementations have played a critical role in these successes. In this tutorial, we will first review popular machine learning models and their corresponding optimization techniques. Second, we will introduce different ways of parallelizing machine learning algorithms, that is, data parallelism, model parallelism, synchronous parallelism, asynchronous parallelism, and so on, and discuss their theoretical properties, advantages, and limitations. Third, we will discuss recent research that tries to overcome the limitations of standard parallelization mechanisms, including advanced asynchronous parallelism and new communication and aggregation methods. Finally, we will introduce how to leverage popular distributed machine learning platforms, such as Spark MLlib, DMTK, and TensorFlow, to parallelize a given machine learning algorithm, in order to give the audience some practical guidelines on this topic.
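To make the abstract's vocabulary concrete, here is a minimal Python sketch of synchronous data parallelism, the first scheme the abstract lists: each worker holds a shard of the data and computes a local gradient, and a central aggregator averages the gradients before every model update. This is a toy single-machine simulation, not code from the tutorial or from DMTK, Spark MLlib, or TensorFlow; all names here (num_workers, worker_gradient, the toy regression task) are illustrative assumptions.

```python
# Toy sketch of synchronous data-parallel SGD (illustrative, not from
# the tutorial): workers compute gradients on their own data shards,
# and the gradients are averaged before each update.
import numpy as np

rng = np.random.default_rng(0)

# Toy linear-regression data: y = X @ w_true + noise.
X = rng.normal(size=(1024, 10))
w_true = rng.normal(size=10)
y = X @ w_true + 0.01 * rng.normal(size=1024)

num_workers = 4
# Data parallelism: split the example indices into one shard per worker.
shards = np.array_split(np.arange(len(X)), num_workers)

def worker_gradient(w, idx):
    """Gradient of the mean-squared error on one worker's shard."""
    Xi, yi = X[idx], y[idx]
    return 2.0 * Xi.T @ (Xi @ w - yi) / len(idx)

w = np.zeros(10)
lr = 0.1
for step in range(200):
    # Synchronous step: wait for all workers, then average the
    # gradients (the role an all-reduce or parameter server plays).
    grads = [worker_gradient(w, idx) for idx in shards]
    w -= lr * np.mean(grads, axis=0)

print("parameter error:", np.linalg.norm(w - w_true))
```

An asynchronous variant would let each worker apply its gradient as soon as it is ready instead of waiting at the averaging step, trading gradient staleness for throughput; that trade-off is exactly the kind of theoretical question the tutorial surveys.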