This semester, UW is offering a course on deep learning systems, covering the principles behind these systems. Interested students are encouraged to try the assignments.
Course description:
Over the past few years, deep learning has become an important technique for successfully solving problems in many different fields, such as vision, NLP, and robotics. An important ingredient driving this success is the development of deep learning systems that efficiently support learning and inference of complicated models across many devices, possibly using distributed resources. The study of how to build and optimize these deep learning systems is now an active area of research and commercialization, and yet there isn't a course that covers this topic.
This course is designed to fill this gap. We will be covering various aspects of deep learning systems, including: basics of deep learning, programming models for expressing machine learning models, automatic differentiation, memory optimization, scheduling, distributed learning, hardware acceleration, domain-specific languages, and model serving. Many of these topics intersect with existing research directions in databases, systems and networking, architecture, and programming languages. The goal is to offer a comprehensive picture of how deep learning systems work, discuss and execute on possible research opportunities, and build open-source software that will have broad appeal.
We will have two classes per week: each week, one class will be a lecture, and the other will be either a lab/discussion session or a guest lecture. Each lecture will study a specific aspect of deep learning systems. The lab/discussion sessions will contain tutorials for implementing that specific aspect and will include case studies of existing systems, such as TensorFlow, Caffe, MXNet, PyTorch, and others.
Assignment 1:
Assignment 1: Reverse-mode Automatic Differentiation
In this assignment, we will implement reverse-mode auto-diff.
Our code should be able to construct simple expressions, e.g. y = x1*x2 + x1, and evaluate their outputs as well as their gradients (or adjoints), e.g. y, dy/dx1, and dy/dx2.
There are many ways to implement auto-diff, as explained in the slides for Lecture 4. For this assignment, we use the approach of a computation graph and an explicit construction of gradient (adjoint) nodes, similar to what MXNet and TensorFlow do.
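As a quick worked example of what these adjoints look like (my addition, not part of the assignment text): for y = x1*x2 + x1, reverse mode seeds the output adjoint with dy/dy = 1. The add node passes this adjoint unchanged to both of its inputs, so the product node x1*x2 receives 1 and x1 receives a contribution of 1. The product node then multiplies its incoming adjoint by the other input, contributing x2 to x1's adjoint and x1 to x2's adjoint. Summing the contributions per node gives dy/dx1 = x2 + 1 and dy/dx2 = x1.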
Key concepts and data structures that we will need to implement are (a minimal sketch of how they fit together follows the list):
Computation graph and Node
Operator, e.g. Add, MatMul, Placeholder, Oneslike
Construction of gradient nodes given forward graph
Executor
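To make these pieces concrete, here is a compact, self-contained sketch of how they could fit together, written against NumPy. It is an illustrative design under assumed names (Node, Op, Executor, gradients, and so on), not the assignment's actual starter code; see the assignment link below for the real skeleton.

import numpy as np
from functools import reduce

class Node:
    """A node in the computation graph: an operator plus its input nodes."""
    def __init__(self, op=None, inputs=None, name=""):
        self.op, self.inputs, self.name = op, inputs or [], name
    def __add__(self, other): return add(self, other)
    def __mul__(self, other): return mul(self, other)

class Op:
    """Each operator knows how to compute its value and how to build
    gradient (adjoint) nodes for its inputs."""
    def __call__(self, *inputs):
        return Node(op=self, inputs=list(inputs))
    def compute(self, node, input_vals): raise NotImplementedError
    def gradient(self, node, output_grad): raise NotImplementedError

class PlaceholderOp(Op):
    """An input variable; its value is supplied via feed_dict at run time."""
    def __call__(self, name):
        return Node(op=self, name=name)

class AddOp(Op):
    def compute(self, node, input_vals):
        return input_vals[0] + input_vals[1]
    def gradient(self, node, output_grad):
        # d(a+b)/da = d(a+b)/db = 1: the adjoint passes through to both inputs.
        return [output_grad, output_grad]

class MulOp(Op):
    def compute(self, node, input_vals):
        return input_vals[0] * input_vals[1]
    def gradient(self, node, output_grad):
        # d(a*b)/da = b and d(a*b)/db = a, built as new graph nodes.
        a, b = node.inputs
        return [output_grad * b, output_grad * a]

class OnesLikeOp(Op):
    """Seeds the output adjoint: ones with the same shape as its input."""
    def compute(self, node, input_vals):
        return np.ones_like(input_vals[0])

add, mul, placeholder, oneslike = AddOp(), MulOp(), PlaceholderOp(), OnesLikeOp()

def topo_sort(node, visited, order):
    """Post-order DFS, so inputs appear before the nodes that consume them."""
    if node in visited:
        return
    visited.add(node)
    for inp in node.inputs:
        topo_sort(inp, visited, order)
    order.append(node)

def gradients(output_node, node_list):
    """Construct gradient nodes for node_list in one reverse pass over the graph."""
    contributions = {output_node: [oneslike(output_node)]}  # seed: dy/dy = 1
    node_to_grad, order = {}, []
    topo_sort(output_node, set(), order)
    for node in reversed(order):  # reverse topological order
        grad = reduce(lambda a, b: a + b, contributions[node])  # sum partial adjoints
        node_to_grad[node] = grad
        if not node.inputs:  # placeholders: nothing to propagate into
            continue
        for inp, g in zip(node.inputs, node.op.gradient(node, grad)):
            contributions.setdefault(inp, []).append(g)
    return [node_to_grad[n] for n in node_list]

class Executor:
    """Evaluates a list of nodes given concrete values for the placeholders."""
    def __init__(self, eval_node_list):
        self.eval_node_list = eval_node_list
    def run(self, feed_dict):
        node_vals = dict(feed_dict)
        order, visited = [], set()
        for node in self.eval_node_list:
            topo_sort(node, visited, order)
        for node in order:
            if node not in node_vals:  # placeholder values come from feed_dict
                vals = [node_vals[inp] for inp in node.inputs]
                node_vals[node] = node.op.compute(node, vals)
        return [node_vals[n] for n in self.eval_node_list]

# The running example: y = x1*x2 + x1, so dy/dx1 = x2 + 1 and dy/dx2 = x1.
x1, x2 = placeholder("x1"), placeholder("x2")
y = x1 * x2 + x1
dx1, dx2 = gradients(y, [x1, x2])
executor = Executor([y, dx1, dx2])
y_val, g1, g2 = executor.run({x1: np.array(3.0), x2: np.array(4.0)})
print(y_val, g1, g2)  # 15.0 5.0 3.0

The point to notice is that gradients() returns new graph nodes rather than numbers, so the same Executor evaluates the forward outputs and their adjoints in one run; this is the explicit gradient-node style described above.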
Course homepage:
http://dlsys.cs.washington.edu/
Assignment link:
https://github.com/dlsys-course/assignment1
Original post:
http://weibo.com/2397265244/EDlCWpCeQ?type=comment