Text-Guided Diffusion with Spectral Convolution for 3D Human Pose Estimation

Date
2025
Publisher
The Eurographics Association and John Wiley & Sons Ltd.
Abstract
Although monocular video-based 3D human pose estimation has progressed significantly, existing methods lack guidance from fine-grained, high-level prior knowledge such as action semantics and camera viewpoints. As a result, pose reconstruction accuracy degrades in scenarios where visual features are severely missing, i.e., under complex occlusion. We observe that 3D human pose estimation is fundamentally a canonical inverse problem, and propose a motion-semantics-based diffusion (MS-Diff) framework that addresses this issue by combining high-level motion semantics with spectral feature regularization to suppress interference noise in complex scenes and improve estimation accuracy. Specifically, we design a Multimodal Diffusion Interaction (MDI) module that incorporates motion semantics, including action categories and camera viewpoints, into the diffusion process, establishing semantic-visual feature alignment through a cross-modal mechanism to resolve pose ambiguities and handle occlusions effectively. Additionally, we employ a Spectral Convolutional Regularization (SCR) module that performs adaptive filtering in the frequency domain to selectively suppress noise components. Extensive experiments on the large-scale public datasets Human3.6M and MPI-INF-3DHP demonstrate that our method achieves state-of-the-art performance.
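The frequency-domain filtering that the abstract attributes to the SCR module can be illustrated with a minimal sketch. The abstract does not describe the module's architecture, so this is not the authors' implementation: a fixed low-pass mask stands in for the adaptive filter, which is presumably learned, and the names `spectral_filter`, `low_pass_gate`, and `keep_ratio` are hypothetical. The sketch transforms a pose sequence along the time axis, scales each frequency bin by a gate value, and transforms back.

```python
import numpy as np


def low_pass_gate(num_frames, keep_ratio=0.25):
    """Binary gate over rfft bins that keeps only the lowest frequencies.

    Assumption: a fixed mask stands in for the adaptive (presumably learned)
    filter; `keep_ratio` is a hypothetical parameter.
    """
    n_bins = num_frames // 2 + 1
    cutoff = max(1, int(round(keep_ratio * n_bins)))
    gate = np.zeros(n_bins)
    gate[:cutoff] = 1.0
    return gate


def spectral_filter(poses, gate):
    """Filter a pose sequence of shape (T, J, 3) along the time axis.

    `gate` has shape (T // 2 + 1,) with one weight per temporal frequency bin.
    """
    spectrum = np.fft.rfft(poses, axis=0)        # to the frequency domain
    filtered = spectrum * gate[:, None, None]    # per-bin scaling
    return np.fft.irfft(filtered, n=poses.shape[0], axis=0)


# Toy sequence: slow sinusoidal motion plus frame-to-frame jitter.
T, J = 32, 17
t = np.arange(T)
clean = np.broadcast_to(np.sin(2 * np.pi * t / T)[:, None, None], (T, J, 3)).copy()
jitter = 0.5 * np.where(t % 2 == 0, 1.0, -1.0)[:, None, None]
noisy = clean + jitter

denoised = spectral_filter(noisy, low_pass_gate(T))
```

In this toy case the jitter sits entirely at the highest temporal frequency, so the low-pass gate removes it and leaves the slow motion intact. A learned gate would instead assign continuous per-frequency weights.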
Description

CCS Concepts: Computing methodologies → Activity recognition and understanding

        
@article{10.1111:cgf.70263,
  journal = {Computer Graphics Forum},
  title = {{Text-Guided Diffusion with Spectral Convolution for 3D Human Pose Estimation}},
  author = {Shi, Liyuan and Wu, Suping and Yang, Sheng and Qiu, Weibin and Qiang, Dong and Zhao, Jiarui},
  year = {2025},
  publisher = {The Eurographics Association and John Wiley & Sons Ltd.},
  ISSN = {1467-8659},
  DOI = {10.1111/cgf.70263}
}