[1]《3D Human Pose Estimation from Monocular Images with Deep Convolutional Neural Network》
(2014)
[2]《VNect: Real-time 3D Human Pose Estimation with a Single RGB Camera》
(ACM TOG-2017)
[3]《Coarse-to-Fine Volumetric Prediction for Single-Image 3D Human Pose》
(CVPR-2017)
[4]《Integral Human Pose Regression》
(CVPR-2018)
[5]《Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image》
(ICCV-2019)
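3D shape estimation is a newer task that aims to recover the 3D mesh of the human body, and it happens to be related to my own research topic. Anyone working in this direction will already be familiar with it, so here we only cover SMPL-related work. I again recommend the survey mentioned earlier, [5] Recovering 3D Human Mesh from Monocular Images: A Survey (2022.03), whose summary is very comprehensive. The papers published up to 2022 are shown in the figure below: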
Figure: Human Mesh
These papers fall into two categories:
Optimization-based Paradigm: Optimization-based approaches attempt to estimate a 3D body mesh that is consistent with 2D image observations (2D keypoints, silhouettes, segmentations). In other words, the 3D mesh is obtained by optimizing against 2D detection results. Representative work: SMPLify (ECCV'2016).
https://smplify.is.tue.mpg.de/
Regression-based Paradigm: Regression-based methods take advantage of the thriving deep learning techniques to directly process pixels. In other words, a deep network maps image pixels directly to a 3D mesh. Representative work: HMR (CVPR'2018).
https://akanazawa.github.io/hmr/
[1] SMPLify (ECCV'2016):《Keep it SMPL: Automatic Estimation of 3D Human Pose and Shape from a Single Image》
https://smplify.is.tue.mpg.de/
[2]SMPLify-X (CVPR'2019):《Expressive Body Capture: 3D Hands, Face, and Body from a Single Image》
https://smpl-x.is.tue.mpg.de/
[3]HMR(CVPR'2018) :《End-to-end Recovery of Human Shape and Pose》
https://akanazawa.github.io/hmr/
[4] SPIN (ICCV'2019):《Learning to Reconstruct 3D Human Pose and Shape via Model-fitting in the Loop》
https://www.seas.upenn.edu/~nkolot/projects/spin/
[5] VIBE (CVPR'2020):《Video Inference for Human Body Pose and Shape Estimation》
https://github.com/mkocabas/VIBE
[6] HybrIK (CVPR'2021):《HybrIK: A Hybrid Analytical-Neural Inverse Kinematics Solution for 3D Human Pose and Shape Estimation》
https://jeffli.site/HybrIK/
[7] PARE (ICCV'2021):《PARE: Part Attention Regressor for 3D Human Body Estimation》
https://pare.is.tue.mpg.de/
[8] HuMoR (2021):《3D Human Motion Model for Robust Pose Estimation》
[9] DeciWatch(ECCV'2022):《DeciWatch: A Simple Baseline for 10× Efficient 2D and 3D Pose Estimation》
https://ailingzeng.site/deciwatch
[10] SmoothNet (ECCV'2022):《SmoothNet: A Plug-and-Play Network for Refining Human Poses in Videos》
https://ailingzeng.site/smoothnet
[11] ExPose (ECCV'2020):《Monocular Expressive Body Regression through Body-Driven Attention》
https://expose.is.tue.mpg.de/
[12] BalancedMSE (CVPR'2022):《Balanced MSE for Imbalanced Visual Regression》
https://sites.google.com/view/balanced-mse/home
Brief introductions to some of these papers:
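SMPLify:
An optimization-based method. Given an image, a CNN-based method first predicts the 2D joint locations. A 3D body model is then fitted to these joints to estimate the 3D body shape and pose.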
https://smplify.is.tue.mpg.de/
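It proposes an objective function for SMPL fitting that consists of five parts: a joint-based data term and several regularization terms, including an interpenetration error term (SPIN drops this term because it slows down the fitting without improving performance much), two pose priors, and a shape prior. Most later methods use this objective function directly or improve on it.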
Figure: SMPLify
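The five-part objective described above can be sketched as a weighted sum. This is only an illustration under stated assumptions, not the SMPLify implementation: the function names, the weights, the `bend_idx` parameter, and the simple quadratic penalties are hypothetical stand-ins for the learned priors and the interpenetration term used by the actual method, and plain 3D joints replace the SMPL model.

```python
import numpy as np

def project(joints_3d, focal, center):
    """Pinhole projection of (N, 3) camera-space joints to (N, 2) pixels."""
    return focal * joints_3d[:, :2] / joints_3d[:, 2:3] + center

def data_term(joints_3d, joints_2d, conf, focal, center):
    """Joint-based data term: confidence-weighted squared reprojection error."""
    residual = project(joints_3d, focal, center) - joints_2d
    return float(np.sum(conf * np.sum(residual ** 2, axis=1)))

def objective(joints_3d, joints_2d, conf, pose, shape, focal, center,
              interpenetration=0.0, bend_idx=(0,),
              weights=(1.0, 1.0, 1.0, 1.0)):
    """One data term plus four regularizers (all regularizers are stand-ins)."""
    w_pose, w_bend, w_pen, w_shape = weights
    e_data = data_term(joints_3d, joints_2d, conf, focal, center)
    e_pose = float(np.sum(pose ** 2))                  # stand-in pose prior
    bend = pose[list(bend_idx)]                        # hypothetical joint subset
    e_bend = float(np.sum(np.maximum(0.0, bend) ** 2)) # stand-in second pose prior
    e_shape = float(np.sum(shape ** 2))                # stand-in shape prior
    return (e_data + w_pose * e_pose + w_bend * e_bend
            + w_pen * interpenetration + w_shape * e_shape)

# Toy usage: 2D detections that match the 3D joints exactly give zero cost.
focal, center = 1000.0, np.array([500.0, 500.0])
joints_3d = np.array([[0.0, 0.0, 2.0], [0.2, -0.1, 2.5]])
joints_2d = project(joints_3d, focal, center)
conf = np.ones(2)
pose, shape = np.zeros(6), np.zeros(10)
perfect = objective(joints_3d, joints_2d, conf, pose, shape, focal, center)
shifted = objective(joints_3d, joints_2d + 1.0, conf, pose, shape, focal, center)
```

In an optimization-based pipeline, a cost of this shape would be minimized over the pose and shape parameters; setting the interpenetration weight to zero corresponds to dropping that term, as SPIN does.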
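HMR:
A regression-based method. The image I is passed through a convolutional encoder. The resulting features are fed into an iterative 3D regression module that infers the latent 3D representation of the person so as to minimize the reprojection error of the joints. The 3D parameters are also sent to a discriminator D, whose job is to judge whether they come from real human shapes and poses. This is where a GAN (generative adversarial network) comes in.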
https://akanazawa.github.io/hmr/
Figure: HMR
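The iterative regression step can be sketched as a loop that repeatedly predicts a residual update from the image features and the current parameter estimate. This is a minimal sketch under assumptions: `iterative_regression`, `reprojection_error`, `toy_regress`, the three iterations, the weak-perspective camera, and the L1 penalty are illustrative choices, and the encoder and the discriminator are left out entirely.

```python
import numpy as np

def iterative_regression(features, regress, theta_init, num_iters=3):
    """Refine parameters by adding a predicted residual at each iteration."""
    theta = theta_init.copy()
    for _ in range(num_iters):
        # The regressor sees both the image features and the current estimate.
        theta = theta + regress(np.concatenate([features, theta]))
    return theta

def reprojection_error(joints_3d, joints_2d, visible, scale, trans):
    """L1 error between weak-perspective projections and 2D joints,
    counted only for joints marked visible (assumed form of the loss)."""
    projected = scale * joints_3d[:, :2] + trans
    return float(np.sum(visible[:, None] * np.abs(projected - joints_2d)))

# Toy regressor (hypothetical): moves halfway toward a fixed target each call.
target = np.array([1.0, -2.0, 0.5])

def toy_regress(x):
    theta = x[-target.size:]          # current estimate is the tail of the input
    return 0.5 * (target - theta)

features = np.zeros(4)                # placeholder for encoder output
theta = iterative_regression(features, toy_regress, np.zeros(3))
```

In a real network the regressor would be learned, and the reprojection error (together with the adversarial loss) would be the training signal rather than something evaluated on a fixed toy target.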
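SPIN:
A combined optimization + regression method. The paper combines the iterative optimization approach (SMPLify) with the network regression approach (HMR). The network's prediction serves as the initial value for the optimization, which makes the iterative fitting faster and more accurate. The optimization result in turn acts as a strong prior (a supervision signal) for the network. The two approaches help each other, giving the whole method a self-improving ability, which the authors call SPIN (SMPL oPtimization IN the loop).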
https://www.seas.upenn.edu/~nkolot/projects/spin/
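The optimization-in-the-loop idea can be illustrated with a toy problem. This is a sketch under heavy assumptions, not the SPIN training code: the linear "network" `W`, the matrix `A` standing in for projection to 2D evidence, and the names `fit` and `train` are all hypothetical, and plain gradient descent replaces the SMPLify fitting routine.

```python
import numpy as np

# Toy stand-in for the mapping from model parameters to 2D evidence.
A = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])

def fit(params_init, evidence, steps=10, step_size=0.05):
    """Optimization routine: gradient descent on ||A p - evidence||^2,
    started from the network's prediction."""
    p = params_init.copy()
    for _ in range(steps):
        p = p - step_size * 2.0 * A.T @ (A @ p - evidence)
    return p

def train(features, evidence, rounds=50, lr=0.2):
    """Alternate between fitting (initialized by the network) and
    regressing the network toward the fitted parameters."""
    W = np.zeros((A.shape[1], features.size))   # linear "network"
    for _ in range(rounds):
        pred = W @ features                     # network prediction
        fitted = fit(pred, evidence)            # refine it by optimization
        # Gradient step on ||W x - fitted||^2: the fit supervises the network.
        W -= lr * 2.0 * np.outer(pred - fitted, features)
    return W

features = np.array([1.0, 0.5])                 # placeholder image features
true_params = np.array([0.3, -0.2])             # never shown to the network
evidence = A @ true_params                      # only 2D evidence is observed
W = train(features, evidence)
error = np.linalg.norm(W @ features - true_params)
```

The network never sees the true parameters. It only regresses toward the optimizer's output, and the optimizer starts from a better initialization each round, which is the self-improving behavior the paragraph above describes.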