专栏名称: 机器学习研究会

机器学习研究会是北京大学大数据与机器学习创新中心旗下的学生组织，旨在构建一个机器学习从事者交流的平台。除了及时分享领域资讯外，协会还会举办各种业界巨头/学术神牛讲座、学术大牛沙龙分享会、real data 创新竞赛等活动。

【推荐】用Python\/OpenCV实现增强现实

机器学习研究会 · 公众号 · AI · 2017-11-16 22:34

正文

点击上方 “机器学习研究会” 可以订阅

摘要

转自：爱可可-爱生活

You may (or may not) have heard of or seen the augmented reality Invizimals video game or theTopps 3D baseball cards. The main idea is to render in the screen of a tablet, PC or smartphone a 3D model of a specific figure on top of a card according to the position and orientation of the card.

Figure 1: Invizimal augmented reality cards. Source:

Well, this past semester I took a course in Computer Vision where we studied some aspects of projective geometry and thought it would be an entertaining project to develop my own implementation of a card based augmented reality application. I warn you that we will need a bit of algebra to make it work but I’ll try to keep it as light as possible. To make the most out of it you should be comfortable working with different coordinate systems and transformation matrices.

First, this post does not pretend to be a tutorial, a comprehensive guide or an explanation of the Computer Vision techniques involved and I will just mention the essentials required to follow the post. However, I encourage you to dig deeper in the concepts that will appear along the way.

Secondly, do not expect some professional looking results. I did this just for fun and there are plenty of decisions I made that could have been done better. The main idea is to develop a proof of concept application.

With that said, here it goes my take on it.

Where do we start?

Looking at the project as a whole may make it seem more difficult than it really is. Luckily for us, we will be able to divide it into smaller parts that, when combined one on top of another, will allow us to have our augmented reality application working. The question now is, which are these smaller chunks that we need?

Let’s take a closer look into what we want to achieve. As stated before, we want to project in a screen a 3D model of a figure whose position and orientation matches the position and orientation of some predefined flat surface. Furthermore, we want to do it in real time, so that if the surface changes its position or orientation the projected model does so accordingly.

To achieve this we first have to be able to identify the flat surface of reference in an image or video frame. Once identified, we can easily determine the transformation from the reference surface image (2D) to the target image (2D). This transformation is called homography. However, if what we want is to project a 3D model placed on top of the reference surface to the target image we need to extend the previous transformation to handle cases were the height of the point to project in the reference surface coordinate system is different than zero. This can be achieved with a bit of algebra. Finally, we should apply this transformation to our 3D model and draw it on the screen. Bearing the previous points in mind our project can be divided into:

【推荐】用Python\/OpenCV实现增强现实

正文

Where do we start?

请到「今天看啥」查看全文