3D Human Shape and Pose from a Single Depth Image with Deep Dense Correspondence Enabled Model Fitting

Abstract
We propose a two-stage hybrid method that requires no initialization for estimating 3D human shape and pose from a single depth image, combining the benefits of deep learning and optimization. First, a convolutional neural network predicts pixel-wise dense semantic correspondences to a template geometry, in the form of body part segmentation labels and normalized canonical geometry vertex coordinates. Using these two outputs, pixel-to-vertex correspondences are computed by nearest-neighbor search in a six-dimensional embedding of the template geometry. Second, a parametric shape model (SMPL) is fitted to the depth data by minimizing vertex distances to the input. Extensive evaluation on both real and synthetic human-shape-in-motion datasets shows that our method yields quantitatively and qualitatively satisfactory results and state-of-the-art reconstruction errors.
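The correspondence step described in the abstract can be illustrated with a small sketch. The abstract does not say how the six dimensions are composed, so this example assumes, purely for illustration, that each body part label maps to a 3-D code that is concatenated with the 3-D normalized canonical coordinates. The names `embed`, `nearest_vertex`, `PART_CODES`, and the toy template are hypothetical and are not taken from the paper; a brute-force search stands in for whatever nearest-neighbor structure the authors use.

```python
from math import dist

def embed(part_code, canonical_xyz):
    """Build a 6-D point: 3-D part code followed by 3-D canonical coordinates.

    Assumption: this concatenation is one plausible way to obtain a
    six-dimensional embedding; the paper may construct it differently.
    """
    return tuple(part_code) + tuple(canonical_xyz)

def nearest_vertex(pixel_embedding, vertex_embeddings):
    """Return the index of the template vertex closest to the pixel in 6-D."""
    return min(
        range(len(vertex_embeddings)),
        key=lambda i: dist(pixel_embedding, vertex_embeddings[i]),
    )

# Hypothetical part codes (one 3-D code per body part label).
PART_CODES = {0: (0.0, 0.0, 1.0), 1: (1.0, 0.0, 0.0)}

# Hypothetical toy template with four vertices spread over two parts.
template = [
    embed(PART_CODES[0], (0.1, 0.2, 0.3)),
    embed(PART_CODES[0], (0.8, 0.9, 0.7)),
    embed(PART_CODES[1], (0.1, 0.2, 0.3)),
    embed(PART_CODES[1], (0.5, 0.5, 0.5)),
]

# A pixel for which the network predicted part 1 and these canonical coordinates.
pixel = embed(PART_CODES[1], (0.45, 0.55, 0.5))
match = nearest_vertex(pixel, template)
```

In this toy setup the part code keeps a pixel from matching a vertex on a different body part even when the canonical coordinates are similar. On a full template mesh, a spatial index such as a k-d tree would typically replace the linear scan.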
Description

CCS Concepts: Computing methodologies → Motion capture; Motion processing

        
@inproceedings{10.2312:egp.20221008,
  booktitle = {Eurographics 2022 - Posters},
  editor = {Sauvage, Basile and Hasic-Telalovic, Jasminka},
  title = {{3D Human Shape and Pose from a Single Depth Image with Deep Dense Correspondence Enabled Model Fitting}},
  author = {Wang, Xiaofang and Boukhayma, Adnane and Prévost, Stéphanie and Desjardin, Eric and Loscos, Celine and Multon, Franck},
  year = {2022},
  publisher = {The Eurographics Association},
  ISSN = {1017-4656},
  ISBN = {978-3-03868-171-7},
  DOI = {10.2312/egp.20221008}
}
Citation