Column: 机器学习研究会 (Machine Learning Research Society)

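机器学习研究会 (Machine Learning Research Society) is a student organization under Peking University's Big Data and Machine Learning Innovation Center. It aims to build a platform where machine learning practitioners can exchange ideas. Besides sharing timely news from the field, the society hosts talks by industry leaders and top academics, salon-style sharing sessions with leading researchers, real-data innovation competitions, and other events.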

[Recommended] Video Object Segmentation: The Basics

机器学习研究会 · WeChat official account · AI · 2017-09-19 21:43

Main text

Summary

Reposted from: 爱可可-爱生活

This is the first in a two-part series about state-of-the-art algorithms for Video Object Segmentation. The first part introduces the problem and its “classic” solutions. We will briefly cover:

The problem, the datasets, the challenge
A new dataset that we’re announcing today!
The two main approaches from 2016: MaskTrack and OSVOS. These are the algorithms upon which all other works are based.

In the second part, which is more advanced, I will present a comparison table of all the published approaches to the DAVIS-2017 Video Object Segmentation challenge, summarize and highlight selected works, and point to some emerging trends and directions.

The posts assume familiarity with some concepts in computer vision and deep learning, but are quite accessible. I hope they serve as a good introduction to this computer vision challenge and bring newcomers up to speed quickly.

Introduction

There are three classic tasks related to objects in computer vision: classification, detection and segmentation. While classification aims to answer the “what?”, the goal of the latter two is to also answer the “where?”, and segmentation specifically aims to do it at the pixel level.

Classical computer vision tasks (image from Stanford’s cs231n course slides)
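
To make the difference between the three tasks concrete, here is a minimal sketch of what each one outputs for the same toy image. The image size, the "dog" label, the object layout, and all function names are made up for illustration; real systems predict these outputs with a trained model rather than deriving them from a known mask.

```python
import numpy as np

# Toy 6x8 "image" with a single object; 1 marks the object's pixels.
# The layout and the label below are hypothetical, chosen only for illustration.
mask = np.zeros((6, 8), dtype=np.uint8)
mask[2:5, 3:7] = 1   # a 3x4 block of object pixels
mask[2, 3] = 0       # remove one corner so the object is not a perfect rectangle


def classification_output(label):
    """Classification answers "what?": one label for the whole image."""
    return label


def detection_output(label, object_mask):
    """Detection adds a coarse "where?": a label plus a bounding box.

    The box is (x_min, y_min, x_max, y_max) in pixel coordinates, inclusive.
    """
    ys, xs = np.nonzero(object_mask)
    box = (int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max()))
    return label, box


def segmentation_output(object_mask):
    """Segmentation answers "where?" per pixel: an array shaped like the image."""
    return object_mask


label = "dog"
cls = classification_output(label)
det = detection_output(label, mask)
seg = segmentation_output(mask)

# The bounding box covers more pixels than the object itself,
# which is exactly the detail that pixel-level segmentation recovers.
x_min, y_min, x_max, y_max = det[1]
box_area = (x_max - x_min + 1) * (y_max - y_min + 1)
object_area = int(seg.sum())
print(cls, det, seg.shape, box_area, object_area)
```

The box area is larger than the object area here because the box cannot follow the object's outline; the mask can.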

In 2016 we saw semantic segmentation mature and perhaps even begin to saturate existing datasets. Meanwhile, 2017 has been something of a breakout year for video-related tasks: action classification, action (temporal) segmentation, semantic segmentation, etc. In these posts we will focus on Video Object Segmentation.