PG2024 Conference Papers and Posters

Pacific Graphics 2024 - Conference Papers and Posters
Huangshan (Yellow Mountain), China || October 13 – 16, 2024

(for Full Papers (CGF) see PG 2024 - CGF 43-7)
3D Reconstruction and Novel View Synthesis I
MGS-SLAM: Dense RGB-D SLAM via Multi-level Gaussian Splatting
Xu Wang, Ying Liu, Xiaojun Chen, Jialin Wu, Xiaohao Zhang, and Ruihui Li
Point Cloud Processing and Analysis I
Semantics-Augmented Quantization-Aware Training for Point Cloud Classification
Liming Huang, Yunchuan Qin, Ruihui Li, Fan Wu, and Kenli Li
PointJEM: Self-supervised Point Cloud Understanding for Reducing Feature Redundancy via Joint Entropy Maximization
Xin Cao, Huan Xia, Haoyu Wang, Linzhi Su, Ping Zhou, and Kang Li
Label Name is Mantra: Unifying Point Cloud Segmentation across Heterogeneous Datasets
Yixun Liang, Hao He, Shishi Xiao, Hao Lu, and Yingcong Chen
Deep-PE: A Learning-Based Pose Evaluator for Point Cloud Registration
Junjie Gao, Chongjian Wang, Zhongjun Ding, Shuangmin Chen, Shiqing Xin, Changhe Tu, and Wenping Wang
PVP-SSD: Point-Voxel Fusion with Partitioned Point Cloud Sampling for Anchor-Free Single-Stage Small 3D Object Detection
Xinlin Wu, Yibin Tian, Yin Pan, Zhiyuan Zhang, Xuesong Wu, Ruisheng Wang, and Zhi Zeng
Image and Video Enhancement I
StegaVideo: Robust High-Resolution Video Steganography with Temporal and Edge Guidance
Kun Hu, Zixuan Hu, Qianhui Zhu, Xiaochao Wang, and Xingjun Wang
3D Reconstruction and Novel View Synthesis II
Single Image 3D Reconstruction of Creased Documents Using Shape-from-Shading with Template-Based Error Correction
Linqin Wang and Pengbo Bo
Point Cloud Processing and Analysis II
GCANet: A Geometric Consistency-driven Aggregation Network for Robust Primitive Segmentation on Point Clouds
Anyi Huang, Zikuan Li, Zhoutao Wang, Xiang Wu, and Jun Wang
Geometric Processing I
Mesh Slicing Along Isolines of Surface-Based Functions
Lei Wang, Xudong Wang, Wensong Wang, Shuangmin Chen, Shiqing Xin, Changhe Tu, and Wenping Wang
Geodesic Distance Propagation Across Open Boundaries
Shuangmin Chen, Zijia Yue, Wensong Wang, Shiqing Xin, and Changhe Tu
TPAM: Transferable Perceptual-constrained Adversarial Meshes
Tengjia Kang, Yuezun Li, Jiaran Zhou, Shiqing Xin, Junyu Dong, and Changhe Tu
Rendering and Lighting I
Biophysically-based Simulation of Sun-induced Skin Appearance Changes
Xueyan He, Minghao Huang, Ruoyu Fu, Jie Guo, Junping Yuan, Yanghai Wang, and Yanwen Guo
Data Parallel Ray Tracing of Massive Scenes based on Neural Proxy
Shunkang Xu, Xiang Xu, Yanning Xu, and Lu Wang
Human and Character Animation
Learning-based Self-Collision Avoidance in Retargeting using Body Part-specific Signed Distance Fields
Junwoo Lee, Hoimin Kim, and Taesoo Kwon
PhysHand: A Hand Simulation Model with Physiological Geometry, Physical Deformation, and Accurate Contact Handling
Mingyang Sun, Dongliang Kou, Ruisheng Yuan, Dingkang Yang, Peng Zhai, Xiao Zhao, Yang Jiang, Xiong Li, Jingchen Li, and Lihua Zhang
Audio-Driven Speech Animation with Text-Guided Expression
Sunjin Jung, Sewhan Chun, and Junyong Noh
Geometric Processing II
Continuous Representation based Internal Self-supporting Structure via Ellipsoid Hollowing for 3D Printing
Shengfa Wang, Jun Yang, Jiangbei Hu, Na Lei, Zhongxuan Luo, and Ligang Liu
Convex Hull Computation in a Grid Space: A GPU Accelerated Parallel Filtering Approach
Joms Antony, Manoj Kumar Mukundan, Mathew Thomas, and Ramanathan Muthuganapathy
Rendering and Lighting II
Real-Time Rendering of Glints in the Presence of Area Lights
Tom Kneiphof and Reinhard Klein
Inverse Rendering of Translucent Objects with Shape-Adaptive Importance Sampling
Jooeun Son, Yucheol Jung, Gyeongmin Lee, Soongjin Kim, Joo Ho Lee, and Seungyong Lee
Crowd and Scene Analysis
Dense Crowd Motion Prediction through Density and Trend Maps
Tingting Wang, Qiang Fu, Minggang Wang, Huikun Bi, Qixin Deng, and Zhigang Deng
Free-form Floor Plan Design using Differentiable Voronoi Diagram
Xuanyu Wu, Kenji Tojo, and Nobuyuki Umetani
Curve and Surface Modeling
Trigonometric Tangent Interpolating Curves
Andriamahenina Ramanantoanina and Kai Hormann
Simulation
Physics-Informed Neural Fields with Neural Implicit Surface for Fluid Reconstruction
Zheng Duan and Zhong Ren
Fast Wavelet-domain Smoke Guiding
Luan Lyu, Xiaohua Ren, Enhua Wu, and Zhi-Xin Yang
Image Processing and Filtering I
TSDN: Transport-based Stylization for Dynamic NeRF
Yuning Gong, Mingqing Song, Xiaohua Ren, Yuanjun Liao, and Yanci Zhang
LO-Gaussian: Gaussian Splatting for Low-light and Overexposure Scenes through Simulated Filter
Jingjiao You, Yuanyang Zhang, Tianchen Zhou, Yecheng Zhao, and Li Yao
Fast Approximation to Large-Kernel Edge-Preserving Filters by Recursive Reconstruction from Image Pyramids
Tianchen Xu, Jiale Yang, Yiming Qin, Bin Sheng, and Enhua Wu
P-NLOS: A Prompt-Based Method for Robust NLOS Imaging
Xiongfei Su, Tianyi Zhu, Lina Liu, and Yuanlong Zhang
Image Synthesis
A Contrastive Unified Encoding Framework for Sticker Style Editing
Zhihong Ni, Chengze Li, Hanyuan Liu, Xueting Liu, Tien-Tsin Wong, Zhenkun Wen, and Huisi Wu
DViTGAN: Training ViTGANs with Diffusion
Mengjun Tong, Hong Rao, Wenji Yang, Shengbo Chen, and Fang Zuo
Garment Modeling and Simulation
Computational Mis-Drape Detection and Rectification
Hyeon-Seung Shin and Hyeong-Seok Ko
Self-Supervised Multi-Layer Garment Animation Generation Network
Guoqing Han, Min Shi, Tianlu Mao, Xinran Wang, Dengming Zhu, and Lin Gao
Image Processing and Filtering II
SLGDiffuser: Stroke-level Guidance Diffusion Model for Complex Scene Text Editing
Xiao Le Liu, Lei Wu, Chang Shuo Wang, Pei Dong, and Xiang Xu Meng
Modeling Sketches both Semantically and Structurally for Zero-Shot Sketch-Based Image Retrieval is Better
Jiansen Jing, Yujie Liu, Mingyue Li, Qian Xiao, and Shijie Chai
3D Modeling and Editing
Editing Compact Voxel Representations on the GPU
Mathijs Molenaar and Elmar Eisemann
DreamMapping: High-Fidelity Text-to-3D Generation via Variational Distribution Mapping
Zeyu Cai, Duotun Wang, Yixun Liang, Zhijing Shao, Ying-Cong Chen, Xiaohang Zhan, and Zeyu Wang
High-Quality Cage Generation Based on SDF
Hao Qiu, Wentao Liao, and Renjie Chen
Human I
GGAvatar: Dynamic Facial Geometric Adjustment for Gaussian Head Avatar
Xinyang Li, Jiaxin Wang, Yixin Xuan, Gongxin Yao, and Yu Pan
Enhancing Human Optical Flow via 3D Spectral Prior
Shiwei Mao, Mingze Sun, and Ruqi Huang
GazeMoDiff: Gaze-guided Diffusion Model for Stochastic Human Motion Prediction
Haodong Yan, Zhiming Hu, Syn Schmitt, and Andreas Bulling
GamePose: Self-Supervised 3D Human Pose Estimation from Multi-View Game Videos
Yang Zhou, Tianze Guo, Hao Xu, Xilei Wei, Lang Xu, Xiangjun Tang, Sipeng Yang, Qilong Kou, and Xiaogang Jin
Feature Separation Graph Convolutional Networks for Skeleton-Based Action Recognition
Lingyan Zhang, Wanyu Ling, Shuwen Daizhou, and Li Kuang
Neural Radiance Fields and Gaussian Splatting
High-Quality Geometry and Texture Editing of Neural Radiance Field
Soongjin Kim, Jooeun Son, Gwangjin Ju, Joo Ho Lee, and Seungyong Lee
Advanced 3D Synthesis and Stylization
3D-SSGAN: Lifting 2D Semantics for 3D-Aware Compositional Portrait Synthesis
Ruiqi Liu, Peng Zheng, Ye Wang, and Rui Ma
3DStyleGLIP: Part-Tailored Text-Guided 3D Neural Stylization
SeungJeh Chung, JooHyun Park, and HyeongYeop Kang
Human II
CKD-LQPOSE: Towards a Real-World Low-quality Cross-Task Distilled Pose Estimation Architecture
Tao Liu, Beiji Yao, Jun Huang, and Ya Wang
Colorectal Protrusions Detection based on Conformal Colon Flattening
Yuxue Ren, Wei Hu, Zhengbin Li, Wei Chen, and Na Lei
Posters
A Fiber Image Classification Strategy Based on Key Module Localization
Ya Tu Ji, Xiang Xue, Yang Liu, H. T. Xu, Q. D. E. J. Ren, B. Shi, N. E. Wu, M. Lu, X. X. Xu, L. Wang, L. J. Dai, M. M. Yao, and X. M. Li
Img2PatchSeqAD: Industrial Image Anomaly Detection Based on Image Patch Sequence
Yang Liu, Ya Tu Ji, Xiang Xue, H. T. Xu, Qing Dao Er Ji Ren, Bao Shi, N. E. Wu, M. Lu, Xuan Xuan Xu, H. X. Guo, L. Wang, L. J. Dai, Miao Miao Yao, and Xiao Mei Li
"Yunluo Journey": A VR Cultural experience for the Chinese Musical Instrument
Yuqiu Wang, Wenchen Guo, Zhiting He, and Min Fan
Simulating Viscous Fluid Using Free Surface Lattice Boltzmann Method
Dakun Sun, Yang Gao, and Xueguang Xie
SPDD-YOLO for Small Object Detection in UAV Images
Xiang Xue, Ya Tu Ji, Yang Liu, H. T. Xu, Q. D. E. J. Ren, B. Shi, N. E. Wu, M. Lu, and X. F. Zhuang
Self-Supervised Multi-Layer Garment Animation Generation Network
Guoqing Han, Min Shi, Tianlu Mao, Xinran Wang, Dengming Zhu, and Lin Gao
CNCUR: A simple 2D Curve Reconstruction Algorithm based on constrained neighbours
Joms Antony, Minu Reghunath, and Ramanathan Muthuganapathy

BibTeX (PG2024 Conference Papers and Posters)
@inproceedings{10.2312:pg.20242025,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{Pacific Graphics 2024 - Conference Papers and Posters: Frontmatter}},
  author = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20242025}
}
@inproceedings{10.2312:pg.20241274,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{MGS-SLAM: Dense RGB-D SLAM via Multi-level Gaussian Splatting}},
  author = {Wang, Xu and Liu, Ying and Chen, Xiaojun and Wu, Jialin and Zhang, Xiaohao and Li, Ruihui},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241274}
}
@inproceedings{10.2312:pg.20241275,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{Semantics-Augmented Quantization-Aware Training for Point Cloud Classification}},
  author = {Huang, Liming and Qin, Yunchuan and Li, Ruihui and Wu, Fan and Li, Kenli},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241275}
}
@inproceedings{10.2312:pg.20241276,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{PointJEM: Self-supervised Point Cloud Understanding for Reducing Feature Redundancy via Joint Entropy Maximization}},
  author = {Cao, Xin and Xia, Huan and Wang, Haoyu and Su, Linzhi and Zhou, Ping and Li, Kang},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241276}
}
@inproceedings{10.2312:pg.20241277,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{Label Name is Mantra: Unifying Point Cloud Segmentation across Heterogeneous Datasets}},
  author = {Liang, Yixun and He, Hao and Xiao, Shishi and Lu, Hao and Chen, Yingcong},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241277}
}
@inproceedings{10.2312:pg.20241278,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{Deep-PE: A Learning-Based Pose Evaluator for Point Cloud Registration}},
  author = {Gao, Junjie and Wang, Chongjian and Ding, Zhongjun and Chen, Shuangmin and Xin, Shiqing and Tu, Changhe and Wang, Wenping},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241278}
}
@inproceedings{10.2312:pg.20241279,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{PVP-SSD: Point-Voxel Fusion with Partitioned Point Cloud Sampling for Anchor-Free Single-Stage Small 3D Object Detection}},
  author = {Wu, Xinlin and Tian, Yibin and Pan, Yin and Zhang, Zhiyuan and Wu, Xuesong and Wang, Ruisheng and Zeng, Zhi},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241279}
}
@inproceedings{10.2312:pg.20241280,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{StegaVideo: Robust High-Resolution Video Steganography with Temporal and Edge Guidance}},
  author = {Hu, Kun and Hu, Zixuan and Zhu, Qianhui and Wang, Xiaochao and Wang, Xingjun},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241280}
}
@inproceedings{10.2312:pg.20241281,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{Single Image 3D Reconstruction of Creased Documents Using Shape-from-Shading with Template-Based Error Correction}},
  author = {Wang, Linqin and Bo, Pengbo},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241281}
}
@inproceedings{10.2312:pg.20241282,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{GCANet: A Geometric Consistency-driven Aggregation Network for Robust Primitive Segmentation on Point Clouds}},
  author = {Huang, Anyi and Li, Zikuan and Wang, Zhoutao and Wu, Xiang and Wang, Jun},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241282}
}
@inproceedings{10.2312:pg.20241283,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{Mesh Slicing Along Isolines of Surface-Based Functions}},
  author = {Wang, Lei and Wang, Xudong and Wang, Wensong and Chen, Shuangmin and Xin, Shiqing and Tu, Changhe and Wang, Wenping},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241283}
}
@inproceedings{10.2312:pg.20241284,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{Geodesic Distance Propagation Across Open Boundaries}},
  author = {Chen, Shuangmin and Yue, Zijia and Wang, Wensong and Xin, Shiqing and Tu, Changhe},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241284}
}
@inproceedings{10.2312:pg.20241285,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{TPAM: Transferable Perceptual-constrained Adversarial Meshes}},
  author = {Kang, Tengjia and Li, Yuezun and Zhou, Jiaran and Xin, Shiqing and Dong, Junyu and Tu, Changhe},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241285}
}
@inproceedings{10.2312:pg.20241286,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{Biophysically-based Simulation of Sun-induced Skin Appearance Changes}},
  author = {He, Xueyan and Huang, Minghao and Fu, Ruoyu and Guo, Jie and Yuan, Junping and Wang, Yanghai and Guo, Yanwen},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241286}
}
@inproceedings{10.2312:pg.20241287,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{Data Parallel Ray Tracing of Massive Scenes based on Neural Proxy}},
  author = {Xu, Shunkang and Xu, Xiang and Xu, Yanning and Wang, Lu},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241287}
}
@inproceedings{10.2312:pg.20241288,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{Learning-based Self-Collision Avoidance in Retargeting using Body Part-specific Signed Distance Fields}},
  author = {Lee, Junwoo and Kim, Hoimin and Kwon, Taesoo},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241288}
}
@inproceedings{10.2312:pg.20241289,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{PhysHand: A Hand Simulation Model with Physiological Geometry, Physical Deformation, and Accurate Contact Handling}},
  author = {Sun, Mingyang and Kou, Dongliang and Yuan, Ruisheng and Yang, Dingkang and Zhai, Peng and Zhao, Xiao and Jiang, Yang and Li, Xiong and Li, Jingchen and Zhang, Lihua},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241289}
}
@inproceedings{10.2312:pg.20241290,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{Audio-Driven Speech Animation with Text-Guided Expression}},
  author = {Jung, Sunjin and Chun, Sewhan and Noh, Junyong},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241290}
}
@inproceedings{10.2312:pg.20241291,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{Continuous Representation based Internal Self-supporting Structure via Ellipsoid Hollowing for 3D Printing}},
  author = {Wang, Shengfa and Yang, Jun and Hu, Jiangbei and Lei, Na and Luo, Zhongxuan and Liu, Ligang},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241291}
}
@inproceedings{10.2312:pg.20241292,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{Convex Hull Computation in a Grid Space: A GPU Accelerated Parallel Filtering Approach}},
  author = {Antony, Joms and Mukundan, Manoj Kumar and Thomas, Mathew and Muthuganapathy, Ramanathan},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241292}
}
@inproceedings{10.2312:pg.20241293,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{Real-Time Rendering of Glints in the Presence of Area Lights}},
  author = {Kneiphof, Tom and Klein, Reinhard},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241293}
}
@inproceedings{10.2312:pg.20241294,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{Inverse Rendering of Translucent Objects with Shape-Adaptive Importance Sampling}},
  author = {Son, Jooeun and Jung, Yucheol and Lee, Gyeongmin and Kim, Soongjin and Lee, Joo Ho and Lee, Seungyong},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241294}
}
@inproceedings{10.2312:pg.20241295,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{Dense Crowd Motion Prediction through Density and Trend Maps}},
  author = {Wang, Tingting and Fu, Qiang and Wang, Minggang and Bi, Huikun and Deng, Qixin and Deng, Zhigang},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241295}
}
@inproceedings{10.2312:pg.20241296,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{Free-form Floor Plan Design using Differentiable Voronoi Diagram}},
  author = {Wu, Xuanyu and Tojo, Kenji and Umetani, Nobuyuki},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241296}
}
@inproceedings{10.2312:pg.20241297,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{Trigonometric Tangent Interpolating Curves}},
  author = {Ramanantoanina, Andriamahenina and Hormann, Kai},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241297}
}
@inproceedings{10.2312:pg.20241298,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{Physics-Informed Neural Fields with Neural Implicit Surface for Fluid Reconstruction}},
  author = {Duan, Zheng and Ren, Zhong},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241298}
}
@inproceedings{10.2312:pg.20241299,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{Fast Wavelet-domain Smoke Guiding}},
  author = {Lyu, Luan and Ren, Xiaohua and Wu, Enhua and Yang, Zhi-Xin},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241299}
}
@inproceedings{10.2312:pg.20241300,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{TSDN: Transport-based Stylization for Dynamic NeRF}},
  author = {Gong, Yuning and Song, Mingqing and Ren, Xiaohua and Liao, Yuanjun and Zhang, Yanci},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241300}
}
@inproceedings{10.2312:pg.20241301,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{LO-Gaussian: Gaussian Splatting for Low-light and Overexposure Scenes through Simulated Filter}},
  author = {You, Jingjiao and Zhang, Yuanyang and Zhou, Tianchen and Zhao, Yecheng and Yao, Li},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241301}
}
@inproceedings{10.2312:pg.20241302,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{Fast Approximation to Large-Kernel Edge-Preserving Filters by Recursive Reconstruction from Image Pyramids}},
  author = {Xu, Tianchen and Yang, Jiale and Qin, Yiming and Sheng, Bin and Wu, Enhua},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241302}
}
@inproceedings{10.2312:pg.20241303,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{P-NLOS: A Prompt-Based Method for Robust NLOS Imaging}},
  author = {Su, Xiongfei and Zhu, Tianyi and Liu, Lina and Zhang, Yuanlong},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241303}
}
@inproceedings{10.2312:pg.20241304,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{A Contrastive Unified Encoding Framework for Sticker Style Editing}},
  author = {Ni, Zhihong and Li, Chengze and Liu, Hanyuan and Liu, Xueting and Wong, Tien-Tsin and Wen, Zhenkun and Wu, Huisi},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241304}
}
@inproceedings{10.2312:pg.20241305,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{DViTGAN: Training ViTGANs with Diffusion}},
  author = {Tong, Mengjun and Rao, Hong and Yang, Wenji and Chen, Shengbo and Zuo, Fang},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241305}
}
@inproceedings{10.2312:pg.20241306,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{Computational Mis-Drape Detection and Rectification}},
  author = {Shin, Hyeon-Seung and Ko, Hyeong-Seok},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241306}
}
@inproceedings{10.2312:pg.20241307,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{Self-Supervised Multi-Layer Garment Animation Generation Network}},
  author = {Han, Guoqing and Shi, Min and Mao, Tianlu and Wang, Xinran and Zhu, Dengming and Gao, Lin},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241307}
}
@inproceedings{10.2312:pg.20241308,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{SLGDiffuser : Stroke-level Guidance Diffusion Model for Complex Scene Text Editing}},
  author = {Liu, Xiao Le and Wu, Lei and Wang, Chang Shuo and Dong, Pei and Meng, Xiang Xu},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241308}
}
@inproceedings{10.2312:pg.20241309,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{Modeling Sketches both Semantically and Structurally for Zero-Shot Sketch-Based Image Retrieval is Better}},
  author = {Jing, Jiansen and Liu, Yujie and Li, Mingyue and Xiao, Qian and Chai, Shijie},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241309}
}
@inproceedings{10.2312:pg.20241310,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{Editing Compact Voxel Representations on the GPU}},
  author = {Molenaar, Mathijs and Eisemann, Elmar},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241310}
}
@inproceedings{10.2312:pg.20241311,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{DreamMapping: High-Fidelity Text-to-3D Generation via Variational Distribution Mapping}},
  author = {Cai, Zeyu and Wang, Duotun and Liang, Yixun and Shao, Zhijing and Chen, Ying-Cong and Zhan, Xiaohang and Wang, Zeyu},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241311}
}
@inproceedings{10.2312:pg.20241312,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{High-Quality Cage Generation Based on SDF}},
  author = {Qiu, Hao and Liao, Wentao and Chen, Renjie},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241312}
}
@inproceedings{10.2312:pg.20241313,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{GGAvatar: Dynamic Facial Geometric Adjustment for Gaussian Head Avatar}},
  author = {Li, Xinyang and Wang, Jiaxin and Xuan, Yixin and Yao, Gongxin and Pan, Yu},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241313}
}
@inproceedings{10.2312:pg.20241314,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{Enhancing Human Optical Flow via 3D Spectral Prior}},
  author = {Mao, Shiwei and Sun, Mingze and Huang, Ruqi},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241314}
}
@inproceedings{10.2312:pg.20241315,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{GazeMoDiff: Gaze-guided Diffusion Model for Stochastic Human Motion Prediction}},
  author = {Yan, Haodong and Hu, Zhiming and Schmitt, Syn and Bulling, Andreas},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241315}
}
@inproceedings{10.2312:pg.20241316,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{GamePose: Self-Supervised 3D Human Pose Estimation from Multi-View Game Videos}},
  author = {Zhou, Yang and Guo, Tianze and Xu, Hao and Wei, Xilei and Xu, Lang and Tang, Xiangjun and Yang, Sipeng and Kou, Qilong and Jin, Xiaogang},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241316}
}
@inproceedings{10.2312:pg.20241317,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{Feature Separation Graph Convolutional Networks for Skeleton-Based Action Recognition}},
  author = {Zhang, Lingyan and Ling, Wanyu and Daizhou, Shuwen and Kuang, Li},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241317}
}
@inproceedings{10.2312:pg.20241318,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{High-Quality Geometry and Texture Editing of Neural Radiance Field}},
  author = {Kim, Soongjin and Son, Jooeun and Ju, Gwangjin and Lee, Joo Ho and Lee, Seungyong},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241318}
}
@inproceedings{10.2312:pg.20241319,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{3D-SSGAN: Lifting 2D Semantics for 3D-Aware Compositional Portrait Synthesis}},
  author = {Liu, Ruiqi and Zheng, Peng and Wang, Ye and Ma, Rui},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241319}
}
@inproceedings{10.2312:pg.20241320,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{3DStyleGLIP: Part-Tailored Text-Guided 3D Neural Stylization}},
  author = {Chung, SeungJeh and Park, JooHyun and Kang, HyeongYeop},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241320}
}
@inproceedings{10.2312:pg.20241321,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{CKD-LQPOSE: Towards a Real-World Low-quality Cross-Task Distilled Pose Estimation Architecture}},
  author = {Liu, Tao and Yao, Beiji and Huang, Jun and Wang, Ya},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241321}
}
@inproceedings{10.2312:pg.20241322,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{Colorectal Protrusions Detection based on Conformal Colon Flattening}},
  author = {Ren, Yuxue and Hu, Wei and Li, Zhengbin and Chen, Wei and Lei, Na},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241322}
}
@inproceedings{10.2312:pg.20241323,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{A Fiber Image Classification Strategy Based on Key Module Localization}},
  author = {Ji, Ya Tu and Xue, Xiang and Dai, L. J. and Yao, M. M. and Li, X. M. and Liu, Yang and Xu, H. T. and Ren, Q. D. E. J. and Shi, B. and Wu, N. E. and Lu, M. and Xu, X. X. and Wang, L.},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241323}
}
@inproceedings{10.2312:pg.20241324,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{Img2PatchSeqAD: Industrial Image Anomaly Detection Based on Image Patch Sequence}},
  author = {Liu, Yang and Ji, Ya Tu and Wang, L. and Dai, L. J. and Yao, Miao Miao and Li, Xiao Mei and Xue, Xiang and Xu, H. T. and Ren, Qing Dao Er Ji and Shi, Bao and Wu, N. E. and Lu, M. and Xu, Xuan Xuan and Guo, H. X.},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241324}
}
@inproceedings{10.2312:pg.20241325,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{``Yunluo Journey'': A VR Cultural experience for the Chinese Musical Instrument}},
  author = {Wang, Yuqiu and Guo, Wenchen and He, Zhiting and Fan, Min},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241325}
}
@inproceedings{10.2312:pg.20241326,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{Simulating Viscous Fluid Using Free Surface Lattice Boltzmann Method}},
  author = {Sun, Dakun and Gao, Yang and Xie, Xueguang},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241326}
}
@inproceedings{10.2312:pg.20241327,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{SPDD-YOLO for Small Object Detection in UAV Images}},
  author = {Xue, Xiang and Ji, Ya Tu and Liu, Yang and Xu, H. T. and Ren, Q. D. E. J. and Shi, B. and Wu, N. E. and Lu, M. and Zhuang, X. F.},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241327}
}
@inproceedings{10.2312:pg.20241328,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{Self-Supervised Multi-Layer Garment Animation Generation Network}},
  author = {Han, Guoqing and Shi, Min and Mao, Tianlu and Wang, Xinran and Zhu, Dengming and Gao, Lin},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241328}
}
@inproceedings{10.2312:pg.20241329,
  booktitle = {Pacific Graphics Conference Papers and Posters},
  editor = {Chen, Renjie and Ritschel, Tobias and Whiting, Emily},
  title = {{CNCUR : A simple 2D Curve Reconstruction Algorithm based on constrained neighbours}},
  author = {Antony, Joms and Reghunath, Minu and Muthuganapathy, Ramanathan},
  year = {2024},
  publisher = {The Eurographics Association},
  ISBN = {978-3-03868-250-9},
  DOI = {10.2312/pg.20241329}
}

Recent Submissions

  • Item
    Pacific Graphics 2024 - Conference Papers and Posters: Frontmatter
    (The Eurographics Association, 2024) Chen, Renjie; Ritschel, Tobias; Whiting, Emily; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
  • Item
    MGS-SLAM: Dense RGB-D SLAM via Multi-level Gaussian Splatting
    (The Eurographics Association, 2024) Wang, Xu; Liu, Ying; Chen, Xiaojun; Wu, Jialin; Zhang, Xiaohao; Li, Ruihui; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    Simultaneous localization and mapping (SLAM) is a key technology for scene perception, localization, and map construction. 3D Gaussian Splatting (3DGS), as a powerful method for geometric and appearance representation, has brought higher performance to SLAM systems. However, existing methods based on 3D Gaussian representation use a single level of 3D Gaussians for the entire scene, so they cannot effectively capture the geometric shapes and texture details of all objects in the scene. In this work, we propose a monocular dense RGB-D SLAM system that integrates multi-level features, achieved by using different levels of Gaussians to separately reconstruct geometric shapes and texture details. Specifically, through the Fourier transform, we capture the geometric shapes (low frequency) and texture details (high frequency) of the scene in the frequency domain, serving as the initial conditions for the Gaussian distribution. Additionally, to address the issue of different rendering outcomes (such as specular reflections) for the same 3D Gaussian under different viewpoints, we integrate local adaptation Gaussians and local optimization techniques to compensate for the discrepancies introduced by the 3D Gaussians across different viewpoints. Extensive quantitative and qualitative results demonstrate that our method outperforms the state-of-the-art methods.
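
    The frequency-domain split mentioned above can be illustrated with a short sketch (our own illustration, not the authors' code; the cutoff radius is an assumed parameter): an FFT low-pass mask separates an image into low-frequency structure and a high-frequency detail residual.

    import numpy as np

    def frequency_split(img, cutoff=0.1):
        """Split a 2D array into (low, high) frequency components via an FFT mask."""
        F = np.fft.fftshift(np.fft.fft2(img))
        h, w = img.shape
        yy, xx = np.mgrid[-h // 2:(h + 1) // 2, -w // 2:(w + 1) // 2]
        radius = np.sqrt((yy / h) ** 2 + (xx / w) ** 2)
        low_pass = radius <= cutoff                   # keep only low frequencies
        low = np.fft.ifft2(np.fft.ifftshift(F * low_pass)).real
        return low, img - low                         # high = residual detail

    low, high = frequency_split(np.random.rand(64, 64))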
  • Item
    Semantics-Augmented Quantization-Aware Training for Point Cloud Classification
    (The Eurographics Association, 2024) Huang, Liming; Qin, Yunchuan; Li, Ruihui; Wu, Fan; Li, Kenli; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    Point cloud classification is a pivotal procedure in 3D computer vision, and its deployment in practical applications is often constrained by limited computational and memory resources. To address these issues, we introduce a Semantics-Augmented Quantization-Aware Training (SAQAT) framework designed for efficient and precise classification of point cloud data. The SAQAT framework incorporates a point importance prediction semantic module as a side output, which assists in identifying crucial points, along with a point importance evaluation algorithm (PIEA). The semantics module leverages point importance prediction to skillfully select quantization levels based on local geometric properties and semantic context. This approach reduces errors by retaining essential information. In synergy, the PIEA acts as the cornerstone, providing an additional layer of refinement to the SAQAT framework. Furthermore, we integrate a loss function that combines classification loss, quantization error, and point importance prediction loss, thereby fostering a reliable representation of the quantized data. The SAQAT framework is designed for seamless integration with existing point cloud models, enhancing their efficiency while maintaining high levels of accuracy. Testing on benchmark datasets demonstrates that our SAQAT framework surpasses contemporary quantization methods in classification accuracy while simultaneously economizing on memory and computational resources. Given these advantages, our SAQAT framework holds enormous potential for a wide spectrum of applications within the rapidly evolving domain of 3D computer vision. Our code is released: https://github.com/h-liming/SAQAT.
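
    The level-selection idea can be caricatured in a few lines (a hedged sketch; the bit widths, threshold, and min-max quantizer below are our assumptions, not the SAQAT rule): points predicted as important receive finer quantization than the rest.

    import numpy as np

    def quantize(x, bits):
        """Uniform min-max quantization of an array to 2**bits levels."""
        levels = 2 ** bits - 1
        lo, hi = x.min(), x.max()
        q = np.round((x - lo) / (hi - lo + 1e-8) * levels)
        return q / levels * (hi - lo) + lo

    def importance_quantize(points, importance, thresh=0.5):
        out = points.copy()
        hot = importance > thresh
        if hot.any():
            out[hot] = quantize(points[hot], bits=8)     # crucial points: fine
        if (~hot).any():
            out[~hot] = quantize(points[~hot], bits=4)   # remaining points: coarse
        return out

    pts = np.random.randn(1024, 3).astype(np.float32)
    quantized = importance_quantize(pts, np.random.rand(1024))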
  • Item
    PointJEM: Self-supervised Point Cloud Understanding for Reducing Feature Redundancy via Joint Entropy Maximization
    (The Eurographics Association, 2024) Cao, Xin; Xia, Huan; Wang, Haoyu; Su, Linzhi; Zhou, Ping; Li, Kang; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    Most deep learning methods for point cloud processing are supervised and require extensive labeled data. However, labeling point cloud data is a tedious and time-consuming task. Self-supervised representation learning can solve this problem by extracting robust and generalized features from unlabeled data. Yet, the features from representation learning are often redundant. Current methods typically reduce redundancy by imposing linear correlation constraints. In this paper, we introduce PointJEM, a self-supervised representation learning method for point clouds. It includes an embedding scheme that divides the embedding vector into parts, each learning a unique feature. To minimize redundancy, PointJEM maximizes the joint entropy between the parts, making their features pairwise independent. We tested PointJEM on various datasets and found that it significantly reduces redundancy beyond linear correlation. Additionally, PointJEM performs well in downstream tasks such as classification and segmentation.
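
    A toy version of the objective (for intuition only, not the paper's estimator): estimate the joint entropy of two embedding parts from a histogram over a batch; the training objective would maximize this quantity. The hard binning here is not differentiable, so it illustrates the measure rather than a usable loss.

    import torch

    def joint_entropy(a, b, bins=16):
        """Histogram estimate of the joint entropy H(a, b) over a batch."""
        a = torch.sigmoid(a); b = torch.sigmoid(b)       # squash to (0, 1)
        ia = (a * bins).long().clamp(0, bins - 1)
        ib = (b * bins).long().clamp(0, bins - 1)
        hist = torch.zeros(bins, bins)
        for i, j in zip(ia.tolist(), ib.tolist()):
            hist[i, j] += 1.0
        p = hist / hist.sum()
        nz = p[p > 0]
        return -(nz * nz.log()).sum()

    z = torch.randn(512, 8)                  # batch of 8-D embeddings
    part_a, part_b = z.chunk(2, dim=1)       # two 4-D parts
    h = joint_entropy(part_a[:, 0], part_b[:, 0])   # to be *maximized*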
  • Item
    Label Name is Mantra: Unifying Point Cloud Segmentation across Heterogeneous Datasets
    (The Eurographics Association, 2024) Liang, Yixun; He, Hao; Xiao, Shishi; Lu, Hao; Chen, Yingcong; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    Point cloud segmentation is a fundamental task in 3D vision that serves a wide range of applications. Despite recent advancements, its practical usability is still limited by the availability of training data. The prevalent methodologies cannot optimally exploit multiple datasets due to the inconsistency of labels across datasets. In this work, we introduce a robust method that accommodates learning from diverse datasets with variant label sets. We leverage a pre-trained language model to map discrete labels into a continuous latent space using their semantic names. This harmonizes labels across datasets, facilitating concurrent training. Moreover, when classifying points within the continuous 3D space via their linguistic tokens, our model exhibits superior generalizability compared to existing methods with fixed decoder structures. Further, our approach incorporates prompt learning to alleviate data shifts across sources. Comprehensive evaluations attest that our model markedly surpasses current benchmarks.
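
    The label-name mapping can be sketched as follows (illustrative only; text_encoder stands in for whatever pre-trained language model the paper uses, and the cosine-similarity classifier is our assumption):

    import torch
    import torch.nn.functional as F

    def classify_points(point_feats, label_names, text_encoder):
        """point_feats: (N, D) per-point features projected into the text space."""
        with torch.no_grad():
            label_emb = text_encoder(label_names)            # (C, D) label embeddings
        sim = F.normalize(point_feats, dim=-1) @ F.normalize(label_emb, dim=-1).T
        return sim.argmax(dim=-1)                            # (N,) predicted label ids

    # Dummy wiring: a random "encoder" with the right output shape.
    enc = lambda names: torch.randn(len(names), 32)
    labels = classify_points(torch.randn(2048, 32), ["chair", "table", "wall"], enc)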
  • Item
    Deep-PE: A Learning-Based Pose Evaluator for Point Cloud Registration
    (The Eurographics Association, 2024) Gao, Junjie; Wang, Chongjian; Ding, Zhongjun; Chen, Shuangmin; Xin, Shiqing; Tu, Changhe; Wang, Wenping; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    In the realm of point cloud registration, the most prevalent pose evaluation approaches are statistics-based, identifying the optimal transformation by maximizing the number of consistent correspondences. However, registration recall decreases significantly when point clouds exhibit a low overlap ratio, despite efforts in designing feature descriptors and establishing correspondences. In this paper, we introduce Deep-PE, a lightweight, learning-based pose evaluator designed to enhance the accuracy of pose selection, especially in challenging point cloud scenarios with low overlap. Our network incorporates a Pose-Aware Attention (PAA) module to simulate and learn the alignment status of point clouds under various candidate poses, alongside a Pose Confidence Prediction (PCP) module that predicts the likelihood of successful registration. These two modules facilitate the learning of both local and global alignment priors. Extensive tests across multiple benchmarks confirm the effectiveness of Deep-PE. Notably, on 3DLoMatch with a low overlap ratio, Deep-PE significantly outperforms state-of-the-art methods by at least 8% and 11% in registration recall under handcrafted FPFH and learning-based FCGF descriptors, respectively. To the best of our knowledge, this is the first study to utilize deep learning to select the optimal pose without the explicit need for input correspondences.
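
    The selection step implied above amounts to scoring every candidate rigid transform with the learned evaluator and keeping the most confident one (a minimal sketch with hypothetical interfaces; Deep-PE's actual PAA and PCP modules are not shown):

    import torch

    def select_pose(evaluator, src, tgt, candidate_poses):
        """src: (N, 3), tgt: (M, 3) point clouds; candidate_poses: list of 4x4 tensors."""
        best_pose, best_conf = None, -1.0
        ones = torch.ones(len(src), 1)
        for T in candidate_poses:
            aligned = (torch.cat([src, ones], dim=1) @ T.T)[:, :3]  # apply rigid pose
            conf = float(evaluator(aligned, tgt))   # predicted P(registration succeeds)
            if conf > best_conf:
                best_pose, best_conf = T, conf
        return best_pose, best_conf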
  • Item
    PVP-SSD: Point-Voxel Fusion with Partitioned Point Cloud Sampling for Anchor-Free Single-Stage Small 3D Object Detection
    (The Eurographics Association, 2024) Wu, Xinlin; Tian, Yibin; Pan, Yin; Zhang, Zhiyuan; Wu, Xuesong; Wang, Ruisheng; Zeng, Zhi; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    Single-stage object detection from 3D point clouds in autonomous driving faces significant challenges, particularly in accurately detecting small objects. To address this issue, we propose a novel method called Point-Voxel dual-branch feature extraction with Partitioned point cloud sampling for anchor-free Single-Stage Detection of 3D objects (PVP-SSD). The network comprises two branches: a point branch and a voxel branch. In the point branch, a partitioned point cloud sampling strategy leverages axial features to divide the point cloud and then assigns different sampling weights to the segments to improve sampling accuracy. Additionally, a local feature enhancement module explicitly calculates the correlation between key points and query points, improving the extraction of local features. In the voxel branch, we use 3D sparse convolution to extract instance structural features efficiently. The point-voxel dual-branch fusion dynamically integrates the instance features extracted from both branches using a self-attention mechanism; the fused features carry not only the category of the detected object but also its spatial dimensions and heading angle. Consequently, PVP-SSD achieves a balance between preserving detailed information and maintaining structural integrity. Experimental results on the KITTI and ONCE datasets demonstrate that PVP-SSD excels in multi-category small 3D object detection.
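
    Partitioned sampling can be illustrated in a few lines (our assumptions: partitions along one axis, quantile boundaries, and user-supplied weights; in the paper the weighting is driven by axial features rather than hand-set):

    import numpy as np

    def partitioned_sample(points, n_samples, n_parts=4, weights=(1, 1, 1, 1)):
        """points: (N, 3); split along x into n_parts segments, sample each by weight."""
        edges = np.quantile(points[:, 0], np.linspace(0.0, 1.0, n_parts + 1))
        picked = []
        for i in range(n_parts):
            mask = (points[:, 0] >= edges[i]) & (points[:, 0] <= edges[i + 1])
            seg = points[mask]
            k = min(len(seg), round(n_samples * weights[i] / sum(weights)))
            if k > 0:
                picked.append(seg[np.random.choice(len(seg), size=k, replace=False)])
        return np.concatenate(picked) if picked else points[:0]

    sampled = partitioned_sample(np.random.randn(10000, 3), n_samples=2048)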
  • Item
    StegaVideo: Robust High-Resolution Video Steganography with Temporal and Edge Guidance
    (The Eurographics Association, 2024) Hu, Kun; Hu, Zixuan; Zhu, Qianhui; Wang, Xiaochao; Wang, Xingjun; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    Current video steganography frameworks have difficulty balancing robustness and imperceptibility at high resolution. To achieve better video coherence, robustness, and invisibility, we propose an efficient high-resolution video steganography method, named StegaVideo, that utilizes temporal guidance and edge guidance techniques. StegaVideo concentrates the embedded message in edge regions to enhance invisibility, achieving a Peak Signal to Noise Ratio (PSNR) above 38 dB. We simulate various attacks to enhance robustness, with an average bit accuracy above 99.5%. We use a faster embedding and extracting network, resulting in a 10× improvement in inference speed. Our method outperforms current leading video steganography systems in terms of efficiency, robustness, resolution, and inference speed, as demonstrated by our experiments. Our code will be publicly available at https://github.com/LittleFocus2201/StegaVideo.
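
    Edge guidance, reduced to its simplest form (an illustration, not StegaVideo's learned network; the Sobel mask and strength factor are our stand-ins): weight the message residual by an edge map so changes concentrate where they are least visible.

    import numpy as np
    from scipy import ndimage

    def embed_at_edges(frame, residual, strength=0.05):
        """frame: (H, W) grayscale in [0, 1]; residual: (H, W) message signal."""
        gx = ndimage.sobel(frame, axis=0)
        gy = ndimage.sobel(frame, axis=1)
        edges = np.hypot(gx, gy)
        edges /= edges.max() + 1e-8            # normalize the edge mask to [0, 1]
        return np.clip(frame + strength * edges * residual, 0.0, 1.0)

    stego = embed_at_edges(np.random.rand(256, 256), np.sign(np.random.randn(256, 256)))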
  • Item
    Single Image 3D Reconstruction of Creased Documents Using Shape-from-Shading with Template-Based Error Correction
    (The Eurographics Association, 2024) Wang, Linqin; Bo, Pengbo; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    We present a method for reconstructing 3D models from single images of creased documents by enhancing the linear shape-from-shading (SFS) technique with a template-based error correction mechanism. This mechanism is based on a mapping function established using precise data from a spherical surface modeled with linearized Lambertian shading. The error correction mapping is integrated into an algorithm that refines reconstructed depth values during the image scanning process. To resolve the inherent concave/convex ambiguities in SFS, we identify specific conditions based on assumed lighting and the geometric characteristics of creased documents, effectively improving reconstruction even in less controlled lighting environments. Our approach captures intricate geometric details on non-smooth surfaces. Comparative results demonstrate that our method provides superior accuracy and efficiency in reconstructing complex features such as creases and wrinkles.
  • Item
    GCANet: A Geometric Consistency-driven Aggregation Network for Robust Primitive Segmentation on Point Clouds
    (The Eurographics Association, 2024) Huang, Anyi; Li, Zikuan; Wang, Zhoutao; Wu, Xiang; Wang, Jun; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    Primitive segmentation aims to decompose a 3D point cloud into parametric surface patches, which is a common task in 3D measurement. Existing methods primarily learn point cloud feature embedding through neural networks and then perform feature clustering to generate segmentation results. Since spatial relationships are not considered, these methods often exhibit poor generalization to noisy real-scan point clouds. To address this problem, this paper proposes a geometric consistency-driven aggregation network (GCANet) that performs spatial aggregation of primitive points driven by a designed geometric consistency feature (GCF). We also design a direction-aware offset prediction module to improve centroid offset prediction accuracy. More specifically, we leverage the GCF to search for geometric consistency points and then construct the direction-aware feature to guide centroid offset prediction. Experimental results on the ABCParts dataset show that our method achieves competitive performance compared to state-of-the-art (SOTA) methods. Moreover, the SOTA results on the noisy ABCParts dataset validate the strong generalization ability of our GCANet. Our code is publicly available at https://github.com/hay-001/GCANet.
  • Item
    Mesh Slicing Along Isolines of Surface-Based Functions
    (The Eurographics Association, 2024) Wang, Lei; Wang, Xudong; Wang, Wensong; Chen, Shuangmin; Xin, Shiqing; Tu, Changhe; Wang, Wenping; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    There are numerous practical scenarios where the surface of a 3D object is equipped with varying properties. Slicing the surface along an isoline of the property field is a widely utilized operation. While the geometry of the 3D object can typically be approximated with a piecewise linear triangle mesh, the property field f might be too intricate to be linearly approximated at the same resolution. Reducing the isoline within a triangle to a straight-line segment can therefore result in noticeable artifacts. In this paper, we delve into the precise extraction of the isoline of a surface-based function f for slicing the surface apart, allowing the extracted isoline to be curved within a triangle. Our approach begins by adequately sampling Steiner points on mesh edges. Subsequently, for each triangle, we categorize the Steiner points into two groups based on the signs of their function values. We then trace the bisector between these two groups of Steiner points by simply computing a 2D power diagram of all Steiner points. It is worth noting that the weight setting of the power diagram is derived from the first-order approximation of f. Finally, we refine the polygonal bisector by adjusting each vertex to the closest point on the actual isoline. Each step of our algorithm is fully parallelizable at the triangle level, making it highly efficient. Additionally, we provide numerous examples to illustrate its practical applications.
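
    The per-edge step can be sketched directly (our illustration of the sampling and sign classification only; the power-diagram tracing and refinement stages are not shown):

    import numpy as np

    def edge_steiner_points(p0, p1, f, n=8):
        """Sample n Steiner points on edge (p0, p1) and split them by the sign of f."""
        t = np.linspace(0.0, 1.0, n)
        pts = (1.0 - t)[:, None] * p0 + t[:, None] * p1   # points along the edge
        vals = np.array([f(p) for p in pts])
        return pts[vals < 0], pts[vals >= 0]              # (negative side, positive side)

    neg, pos = edge_steiner_points(np.zeros(3), np.ones(3),
                                   f=lambda p: np.sin(4.0 * p[0]) - 0.3)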
  • Item
    Geodesic Distance Propagation Across Open Boundaries
    (The Eurographics Association, 2024) Chen, Shuangmin; Yue, Zijia; Wang, Wensong; Xin, Shiqing; Tu, Changhe; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    The computation of geodesic distances on curved surfaces stands as a fundamental operation in digital geometry processing. Throughout distance propagation, each surface point assumes the dual role of a receiver and transmitter. Despite substantial research on watertight triangle meshes, algorithms designed for broken surfaces, i.e., those afflicted with open-boundary defects, remain scarce. Current algorithms primarily focus on bridging holes and gaps in the embedding space to facilitate distance propagation across boundaries, but fall short in addressing large open-boundary defects in highly curved regions. In this paper, we delve into the prospect of inferring defect-tolerant geodesics exclusively within the intrinsic space. Our observation reveals that open-boundary defects can give rise to a "shadow" region, where the shortest path touches open boundaries. Based on this observation, we have made three key adaptations to the fast marching method (FMM). Firstly, boundary points now exclusively function as distance receivers, impeding any further distance propagation. Secondly, bidirectional distance propagation is permitted, allowing the prediction of geodesic distances in the shadow region based on those in the visible region (even if the visible region is a little more distant from the source). Lastly, we have redefined priorities to harmonize distance propagation between the shadow and visible regions. Being fully intrinsic, our algorithm distinguishes itself from existing counterparts. Experimental results showcase its exceptional performance and accuracy, even in the presence of large and irregular open boundaries.
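
    The first adaptation is easy to picture on a graph approximation of the surface (a toy stand-in: true FMM solves a per-triangle Eikonal update, and Dijkstra is used here only to show the receiver-only rule):

    import heapq

    def restricted_dijkstra(adj, source, boundary):
        """adj: {v: [(u, w), ...]}; boundary: set of open-boundary vertices."""
        dist = {source: 0.0}
        pq = [(0.0, source)]
        while pq:
            d, v = heapq.heappop(pq)
            if d > dist.get(v, float("inf")):
                continue                    # stale queue entry
            if v in boundary:
                continue                    # receiver only: never propagates further
            for u, w in adj[v]:
                nd = d + w
                if nd < dist.get(u, float("inf")):
                    dist[u] = nd
                    heapq.heappush(pq, (nd, u))
        return dist

    graph = {0: [(1, 1.0)], 1: [(0, 1.0), (2, 1.0)], 2: [(1, 1.0)]}
    print(restricted_dijkstra(graph, source=0, boundary={1}))  # vertex 2 stays unreached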
  • Item
    TPAM: Transferable Perceptual-constrained Adversarial Meshes
    (The Eurographics Association, 2024) Kang, Tengjia; Li, Yuezun; Zhou, Jiaran; Xin, Shiqing; Dong, Junyu; Tu, Changhe; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    Triangle meshes are widely used in 3D data representation due to their efficacy in capturing complex surfaces. Mesh classification, crucial in various applications, has typically been tackled by Deep Neural Networks (DNNs) with advancements in deep learning. However, these mesh networks have been proven vulnerable to adversarial attacks, where slight distortions to meshes can cause large prediction errors, posing significant security risks. Although several mesh attack methods have been proposed recently, two key aspects of Stealthiness and Transferability remain underexplored. This paper introduces a new method called Transferable Perceptual-constrained Adversarial Meshes (TPAM) to investigate these aspects in adversarial attacks further. Specifically, we present a Perceptual-constrained objective term to restrict the distortions and introduce an Adaptive Geometry-aware Attack Optimization strategy to adjust attacking strength iteratively based on local geometric frequencies, striking a good balance between stealthiness and attacking accuracy. Moreover, we propose a Bayesian Surrogate Network to enhance transferability and introduce a new metric, the Area Under Accuracy (AUACC), for comprehensive performance evaluation. Experiments on various mesh classifiers demonstrate the effectiveness of our method in both white-box and black-box settings, enhancing the attack stealthiness and transferability across multiple networks. Our research can enhance the understanding of DNNs, thus improving the robustness of mesh classifiers. The code is available at https://github.com/Tengjia-Kang/TPAM.
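
    A generic perturbation loop conveys the setting (a hedged, PGD-style sketch, not the TPAM algorithm: the quadratic penalty below merely stands in for the Perceptual-constrained term, and no geometry-aware adaptation or Bayesian surrogate network is shown):

    import torch

    def attack_mesh(verts, faces, model, label, steps=50, lr=1e-3, lam=10.0):
        """Perturb vertices to fool `model` while penalizing large distortions."""
        delta = torch.zeros_like(verts, requires_grad=True)
        opt = torch.optim.Adam([delta], lr=lr)
        for _ in range(steps):
            logits = model(verts + delta, faces)
            loss = (-torch.nn.functional.cross_entropy(logits, label)
                    + lam * delta.pow(2).sum())     # stand-in distortion penalty
            opt.zero_grad()
            loss.backward()
            opt.step()
        return (verts + delta).detach()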
  • Item
    Biophysically-based Simulation of Sun-induced Skin Appearance Changes
    (The Eurographics Association, 2024) He, Xueyan; Huang, Minghao; Fu, Ruoyu; Guo, Jie; Yuan, Junping; Wang, Yanghai; Guo, Yanwen; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    Skin appearance modeling plays a crucial role in various fields such as healthcare, cosmetics, and entertainment. However, the structure of the skin and its interaction with environmental factors like ultraviolet radiation are very complex and require further detailed modeling. In this paper, we propose a biophysically-based model to illustrate the changes in skin appearance under ultraviolet radiation exposure. It takes ultraviolet doses and specific biophysical parameters as inputs, leading to variations in melanin and blood concentrations, as well as the growth rate of skin cells. These changes alter light scattering, which is simulated by a random walk method, and result in observable erythema and tanning. We showcase effects on various skin tones, comparisons across different body parts, and images illustrating the impact of occlusion. Our model demonstrates superior quality to the commonly used method, with more convincing skin details, and bridges biological insights with visual simulations.
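
    The random-walk ingredient can be demonstrated with a one-dimensional slab toy (coefficients and the isotropic phase function are made up; the paper's layered, spectrally varying skin model is far more detailed):

    import math, random

    def random_walk_reflectance(mu_s=30.0, mu_a=2.0, depth=0.2, n=20000):
        """Fraction of photons scattered back out of a homogeneous slab."""
        mu_t = mu_s + mu_a
        reflected = 0
        for _ in range(n):
            z, wz = 0.0, 1.0                     # photon enters heading down
            while True:
                z += wz * (-math.log(1.0 - random.random()) / mu_t)  # free flight
                if z < 0.0:
                    reflected += 1               # escaped back through the surface
                    break
                if z > depth:
                    break                        # transmitted through the slab
                if random.random() < mu_a / mu_t:
                    break                        # absorbed
                wz = 2.0 * random.random() - 1.0 # isotropic scattering (new cosine)
        return reflected / n

    print(random_walk_reflectance())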
  • Item
    Data Parallel Ray Tracing of Massive Scenes based on Neural Proxy
    (The Eurographics Association, 2024) Xu, Shunkang; Xu, Xiang; Xu, Yanning; Wang, Lu; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    Data-parallel ray tracing is an important method for rendering massive scenes that exceed local memory. Nevertheless, its efficacy is markedly contingent upon bandwidth, owing to the substantial ray data transfer during the rendering process. In this paper, we advance the utilization of neural representation geometries in data-parallel rendering to reduce ray forwarding and intersection overheads. To this end, we introduce a lightweight geometric neural representation, denoted as a "neural proxy." Utilizing our neural proxies, we propose an efficient data-parallel ray tracing framework that significantly minimizes ray transmission and intersection overheads. Compared to state-of-the-art approaches, our method achieved a 2.29∼3.36× speedup with an almost imperceptible image quality loss.
  • Item
    Learning-based Self-Collision Avoidance in Retargeting using Body Part-specific Signed Distance Fields
    (The Eurographics Association, 2024) Lee, Junwoo; Kim, Hoimin; Kwon, Taesoo; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    Motion retargeting is a technique for applying the motion of one character to a new character. Differences in shape and proportion between characters can cause self-collisions during the retargeting process. To address this issue, we propose a new collision resolution strategy comprising three key components: a collision detection module, a self-collision resolution model, and a training strategy for the collision resolution model. The collision detection module generates collision information based on changes in posture. The self-collision resolution model, which is based on a neural network, uses this collision information to resolve self-collisions. The proposed training strategy enhances the performance of the self-collision resolution model. Compared to previous studies, our self-collision resolution process demonstrates superior accuracy and generalization. Our model reduces the average penetration depth across the entire body by 56%, which is 28% better than previous work. Additionally, the minimum distance from the end-effectors to the skin averaged 2.65 cm, more than 0.8 cm smaller than in previous work. Furthermore, solving one frame takes an average of 7.9 ms, enabling online real-time self-collision resolution.
  • Item
    PhysHand: A Hand Simulation Model with Physiological Geometry, Physical Deformation, and Accurate Contact Handling
    (The Eurographics Association, 2024) Sun, Mingyang; Kou, Dongliang; Yuan, Ruisheng; Yang, Dingkang; Zhai, Peng; Zhao, Xiao; Jiang, Yang; Li, Xiong; Li, Jingchen; Zhang, Lihua; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    In virtual Hand-Object Interaction (HOI) scenarios, the authenticity of the hand's deformation is important to the immersive experience, for example in natural manipulation or tactile feedback. Unrealistic deformation arises from simplified hand geometry, neglect of the distinct physical attributes of the hand, and penetration due to imprecise contact handling. To address these problems, we propose PhysHand, a novel hand simulation model that enhances the realism of deformation in HOI. First, we construct a physiologically plausible geometry: a layered mesh with a ''skin-flesh-skeleton'' structure. Second, to capture the distinct physical behavior of different soft tissues, a constraint-based dynamics framework is adopted with carefully designed layer-corresponding constraints that keep the flesh attached and the skin smooth. Finally, we employ an SDF-based method to eliminate the penetration caused by contacts and enhance its accuracy by introducing a novel multi-resolution querying strategy. Extensive experiments demonstrate the outstanding performance of PhysHand in computing deformations and handling contacts. Compared to existing methods, PhysHand: 1) computes both physiologically and physically plausible deformation; 2) significantly reduces the depth and count of penetrations in HOI.
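    For context, the core of SDF-based penetration handling can be sketched in a few lines: a vertex with negative signed distance lies inside the object and is pushed back to the surface along the SDF gradient. This is a minimal generic sketch, with hypothetical sdf and sdf_grad callables standing in for the paper's multi-resolution queries:

    ```python
    import numpy as np

    def resolve_penetrations(points, sdf, sdf_grad, eps=1e-8):
        """Project penetrating vertices back onto the object surface."""
        out = points.copy()
        for i, p in enumerate(points):
            d = sdf(p)
            if d < 0.0:                      # negative distance: penetration
                g = sdf_grad(p)
                n = g / (np.linalg.norm(g) + eps)
                out[i] = p - d * n           # d < 0, so this moves outward
        return out

    # toy usage: a unit sphere centered at the origin
    sphere_sdf = lambda p: np.linalg.norm(p) - 1.0
    sphere_grad = lambda p: p / (np.linalg.norm(p) + 1e-8)
    verts = np.array([[0.5, 0.0, 0.0], [1.5, 0.0, 0.0]])
    print(resolve_penetrations(verts, sphere_sdf, sphere_grad))  # first vertex projected out
    ```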
  • Item
    Audio-Driven Speech Animation with Text-Guided Expression
    (The Eurographics Association, 2024) Jung, Sunjin; Chun, Sewhan; Noh, Junyong; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    We introduce a novel method for generating expressive speech animations of a 3D face, driven by both audio and text descriptions. Many previous approaches focused on generating facial expressions using pre-defined emotion categories. In contrast, our method is capable of generating facial expressions from text descriptions unseen during training, without limitations to specific emotion classes. Our system employs a two-stage approach. In the first stage, an auto-encoder is trained to disentangle content and expression features from facial animations. In the second stage, two transformer-based networks predict the content and expression features from audio and text inputs, respectively. These features are then passed to the decoder of the pre-trained auto-encoder, yielding the final expressive speech animation. By accommodating diverse forms of natural language, such as emotion words or detailed facial expression descriptions, our method offers an intuitive and versatile way to generate expressive speech animations. Extensive quantitative and qualitative evaluations, including a user study, demonstrate that our method can produce natural expressive speech animations that correspond to the input audio and text descriptions.
  • Item
    Continuous Representation based Internal Self-supporting Structure via Ellipsoid Hollowing for 3D Printing
    (The Eurographics Association, 2024) Wang, Shengfa; Yang, Jun; Hu, Jiangbei; Lei, Na; Luo, Zhongxuan; Liu, Ligang; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    Hollowing is an effective way to achieve lightweight objectives by removing material from the interior volume while maintaining feasible mechanical properties. However, hollowed models often necessitate additional support material to prevent collapse during printing, which can substantially negate the benefits of weight reduction. We introduce a framework for designing and optimizing self-supporting infill cavities, which are represented and optimized directly as continuous functions based on ellipsoids. Ellipsoids are favored as filling structures due to their advantageous properties, including their self-supporting nature, precise mathematical definability, variable controllability, and stress-concentration mitigation. Thanks to this explicit definability, we formulate the creation of self-supporting infill cavities as a structural stiffness optimization problem over function representations. The function representation eliminates the need for remeshing to depict structures and shapes, thereby enabling the direct computation of integrals and gradients on the functions. Based on these representations, we propose an efficient optimization strategy to determine the shapes, positions, and topology of the infill cavities, with the goal of achieving multiple objectives, including minimizing material cost, maximizing structural stiffness, and ensuring self-support. We perform various experiments to validate the effectiveness and convergence of our approach. Moreover, we demonstrate the self-supporting property and stability of the optimized structures through actual 3D printing trials and real mechanical testing.
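    To make the function representation concrete, a union of ellipsoidal cavities can be evaluated as a single continuous function; points where it is negative are hollowed. The sketch below covers axis-aligned ellipsoids only and omits the paper's rotations and optimization variables:

    ```python
    import numpy as np

    def ellipsoid(p, center, radii):
        """Implicit ellipsoid: negative inside, zero on the surface."""
        return np.sum(((p - center) / radii) ** 2, axis=-1) - 1.0

    def cavity_field(p, ellipsoids):
        """Union of cavities as a min over closed-form implicit functions."""
        return np.minimum.reduce([ellipsoid(p, c, r) for c, r in ellipsoids])

    pts = np.random.default_rng(2).random((5, 3))        # query points in the unit cube
    cavities = [(np.array([0.3, 0.3, 0.3]), np.array([0.20, 0.10, 0.15])),
                (np.array([0.7, 0.6, 0.5]), np.array([0.25, 0.20, 0.10]))]
    print(cavity_field(pts, cavities) < 0.0)             # which samples get hollowed
    ```

    Because the field is closed-form, integrals (e.g. removed volume) and gradients with respect to centers and radii can be evaluated directly, which is what removes the need for remeshing.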
  • Item
    Convex Hull Computation in a Grid Space: A GPU Accelerated Parallel Filtering Approach
    (The Eurographics Association, 2024) Antony, Joms; Mukundan, Manoj Kumar; Thomas, Mathew; Muthuganapathy, Ramanathan; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    Many real-world applications demand the computation of a convex hull (CH) when the input points originate from structured configurations such as two-dimensional (2D) or three-dimensional (3D) grids. CH computation in grid space has found applications in geographic information systems, medical data analysis, path planning for robots/autonomous vehicles, etc. Conventional as well as existing GPU-accelerated algorithms for CH computation cannot operate directly on 2D or 3D grids represented in matrix format and do not exploit the inherent sequential ordering of such rasterized representations. This work introduces novel filtering algorithms, initially developed for a 2D grid space and subsequently extended to 3D, to speed up the hull computation. They are further extended into GPU-CPU hybrid algorithms, implemented and evaluated on a commercial NVIDIA GPU. For a 2D grid, the number of contributing pixels is always restricted to ≤ 2n for an (n×n) grid. Moreover, they are extracted in lexicographic order, ensuring an efficient O(n) computation of the CH. Similarly, in 3D, the number of contributing voxels is always limited to ≤ 2n² for an (n×n×n) voxel matrix. Additionally, 2D CH filtering is performed across all slices of the 3D grid in parallel, further reducing the number of contributing voxels fed to the 3D CH computation procedure. Comparison with the state of the art indicates that our method is superior, especially for large and sparse point clouds.
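    The 2D filtering idea can be sketched on the CPU as follows (our reconstruction from the description above, not the authors' GPU code): only the leftmost and rightmost occupied pixels of each row can be hull vertices, giving at most 2n candidates that are already lexicographically ordered, so Andrew's monotone chain runs without a sorting pass:

    ```python
    import numpy as np

    def _cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def grid_hull_2d(grid):
        """Convex hull of the occupied pixels of a boolean (n x n) grid."""
        pts = []
        for y in range(grid.shape[0]):          # rows in increasing order
            xs = np.flatnonzero(grid[y])
            if xs.size:                         # at most two candidates per row
                pts.append((y, xs[0]))
                if xs[-1] != xs[0]:
                    pts.append((y, xs[-1]))
        # pts is already lexicographically sorted: no O(n log n) sort needed
        def chain(points):
            out = []
            for p in points:
                while len(out) >= 2 and _cross(out[-2], out[-1], p) <= 0:
                    out.pop()
                out.append(p)
            return out
        lower, upper = chain(pts), chain(pts[::-1])
        return lower[:-1] + upper[:-1]          # hull vertices in CCW order

    grid = np.zeros((8, 8), dtype=bool)
    grid[2:6, 1:7] = True                       # a filled rectangle
    print(grid_hull_2d(grid))                   # its four corner pixels
    ```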
  • Item
    Real-Time Rendering of Glints in the Presence of Area Lights
    (The Eurographics Association, 2024) Kneiphof, Tom; Klein, Reinhard; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    Many real-world materials are characterized by a glittery appearance. Reproducing this effect in physically based renderings is a challenging problem due to its discrete nature, especially in real-time applications, which require a consistently low runtime. Recent work focuses on glittery appearance illuminated by infinitesimally small light sources only. For light sources like the sun, this approximation is a reasonable choice. In the real world, however, all light sources are fundamentally area light sources. In this paper, we derive an efficient method for rendering glints illuminated by spatially constant diffuse area lights in real time. To this end, we require an adequate estimate of the probability that a single microfacet is correctly oriented for reflection from the source to the observer. A good estimate is achieved either using linearly transformed cosines (LTCs) for large light sources, or a locally constant approximation of the normal distribution for small spherical caps of light directions. To compute the resulting number of reflecting microfacets, we employ a counting model based on the binomial distribution. In the evaluation, we demonstrate the visual accuracy of our approach, which is easily integrated into existing real-time rendering frameworks, especially if they already implement shading for area lights using LTCs and a counting model for glint shading under point and directional illumination. Beyond these preexisting constituents, our method adds little to no overhead.
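    The counting model itself is compact: if a pixel footprint covers N microfacets and each is correctly oriented with probability p (estimated via LTCs or the locally constant NDF approximation, per the abstract), the number of visible glints is a binomial sample. A minimal sketch:

    ```python
    import numpy as np

    rng = np.random.default_rng(7)

    def glint_count(n_facets, p_reflect):
        """Number of microfacets reflecting the light toward the observer."""
        return rng.binomial(n_facets, p_reflect)   # Binomial(N, p) counting model

    # one pixel covering a million facets, each aligned with probability 1e-5:
    print([glint_count(1_000_000, 1e-5) for _ in range(4)])  # small, discrete counts
    ```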
  • Item
    Inverse Rendering of Translucent Objects with Shape-Adaptive Importance Sampling
    (The Eurographics Association, 2024) Son, Jooeun; Jung, Yucheol; Lee, Gyeongmin; Kim, Soongjin; Lee, Joo Ho; Lee, Seungyong; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    Subsurface scattering is ubiquitous in organic materials and has been widely researched in computer graphics. Inverse rendering of subsurface scattering, however, is often constrained by the planar-geometry assumption of traditional analytic Bidirectional Surface Scattering Reflectance Distribution Functions (BSSRDFs). To address this issue, a shape-adaptive BSSRDF model has been proposed to render translucent objects on curved geometry with high accuracy. In this paper, we leverage this model to estimate subsurface scattering parameters for inverse rendering. We compute finite differences of the rendering equation for subsurface scattering and iteratively update the material parameters. We demonstrate the performance of our shape-adaptive inverse rendering model by analyzing the estimation accuracy and comparing it to inverse rendering with plane-based BSSRDF models and volumetric methods.
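    The finite-difference update described above follows the usual central-difference pattern; the sketch below uses a hypothetical loss(theta) that would render the object under the current scattering parameters and compare against the captured images (a toy quadratic stands in for it here):

    ```python
    import numpy as np

    def fd_gradient(loss, theta, h=1e-3):
        """Central finite difference of a scalar loss w.r.t. each parameter."""
        grad = np.zeros_like(theta)
        for i in range(theta.size):
            e = np.zeros_like(theta)
            e[i] = h
            grad[i] = (loss(theta + e) - loss(theta - e)) / (2.0 * h)
        return grad

    target = np.array([0.4, 1.2])                    # stand-in for captured data
    loss = lambda th: float(np.sum((th - target) ** 2))
    theta = np.array([1.0, 0.0])                     # initial material parameters
    for _ in range(100):                             # plain gradient descent
        theta -= 0.1 * fd_gradient(loss, theta)
    print(theta)                                     # converges toward `target`
    ```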
  • Item
    Dense Crowd Motion Prediction through Density and Trend Maps
    (The Eurographics Association, 2024) Wang, Tingting; Fu, Qiang; Wang, Minggang; Bi, Huikun; Deng, Qixin; Deng, Zhigang; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    In this paper, we propose a novel density/trend-map-based method to predict both group behavior and individual pedestrian motion from video input. Existing motion prediction methods represent pedestrian motion as a set of spatial-temporal trajectories; however, besides such a per-pedestrian representation, a high-level representation of crowd motion is often needed in many crowd applications. Our method leverages density maps and trend maps to represent the spatial-temporal states of dense crowds. Based on these representations, we propose a crowd density map net that extracts a density map from a video clip, and a crowd prediction net that utilizes the historical states of a video clip to predict the density maps and trend maps of future frames. Moreover, since crowd motion consists of the motion of the individual pedestrians in a group, we also leverage the predicted crowd motion as a clue to improve the accuracy of traditional trajectory-based motion prediction methods. Through a series of experiments and comparisons with state-of-the-art motion prediction methods, we demonstrate the effectiveness and robustness of our method.
  • Item
    Free-form Floor Plan Design using Differentiable Voronoi Diagram
    (The Eurographics Association, 2024) Wu, Xuanyu; Tojo, Kenji; Umetani, Nobuyuki; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    Designing floor plans is difficult because various constraints must be satisfied by the layout of the internal walls. This paper presents a novel shape representation and optimization method for designing floor plans based on Voronoi diagrams. Our Voronoi diagram implicitly specifies the shape of each room through the distance to the Voronoi sites, thus facilitating topological changes in the wall layout by moving these sites. Since the representation is readily differentiable, our method can incorporate various constraints, such as room areas and room connectivity, into the optimization. We demonstrate that our method can generate various floor plans while allowing users to interactively change the constraints.
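    The key property exploited above is that room shapes, and hence quantities like room areas, are differentiable in the site positions. As a rough illustration of that property only (not the paper's formulation), softening the nearest-site assignment makes per-room areas smooth functions of the sites:

    ```python
    import numpy as np

    def soft_room_areas(sites, samples, beta=50.0):
        """Differentiable per-room areas from a softened Voronoi diagram."""
        d2 = ((samples[:, None, :] - sites[None, :, :]) ** 2).sum(-1)
        w = np.exp(-beta * (d2 - d2.min(axis=1, keepdims=True)))
        w /= w.sum(axis=1, keepdims=True)        # soft cell membership per sample
        return w.sum(axis=0) / len(samples)      # area fractions of a unit-square plan

    sites = np.array([[0.25, 0.5], [0.75, 0.5]])
    samples = np.random.default_rng(0).random((4096, 2))
    print(soft_room_areas(sites, samples))       # ~[0.5, 0.5]
    ```

    Any autodiff framework can then push these soft areas toward target areas by moving the sites, with the cell topology changing freely along the way.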
  • Item
    Trigonometric Tangent Interpolating Curves
    (The Eurographics Association, 2024) Ramanantoanina, Andriamahenina; Hormann, Kai; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    Due to their favourable properties, cubic B-spline curves are the de facto standard for modelling closed curves in computer graphics and computer-aided design. Their shapes can be modified intuitively by moving the vertices of a control polygon, but they are only twice differentiable at the knots. Even though this is sufficient for most applications, curves with higher smoothness are still of interest. For example, periodic Bézier curves provide an alternative for designing closed curves as C∞-smooth trigonometric polynomials, but their shapes are not as intuitive to control because of the global influence of each control point. The same space of curves can also be described in vertex-interpolating form, but this may result in other shape artefacts. In this paper we introduce two new representations of trigonometric polynomial curves that are inspired by the idea behind polynomial Gauss-Legendre curves and likewise use the control polygon to control the tangents of the curves. The first variant gives curves that closely follow the control polygon, and the curves generated with the second variant are less tied to the control polygon and instead very similar to uniform cubic B-spline curves.
  • Item
    Physics-Informed Neural Fields with Neural Implicit Surface for Fluid Reconstruction
    (The Eurographics Association, 2024) Duan, Zheng; Ren, Zhong; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    Recovering fluid density and velocity from multi-view RGB videos poses a formidable challenge. Existing solutions typically assume knowledge of obstacles and lighting, or are designed for simple fluid scenes without obstacles or complex lighting. Addressing these challenges, our study presents a novel hybrid model named PINFS, which fuses the capabilities of Physics-Informed Neural Fields (PINF) and Neural Implicit Surfaces (NeuS) to accurately reconstruct scenes containing smoke. By combining the realistic smoke representation of SIREN-NeRFt in PINF with the accuracy of NeuS in depicting solid obstacles, PINFS provides detailed reconstructions of smoke scenes with improved visual authenticity and physical precision. PINFS distinguishes itself by incorporating the solid's view-independent opaque density and by addressing Neumann boundary conditions through signed distances from NeuS. This results in a more realistic and physically plausible depiction of smoke behavior in dynamic scenarios. Comprehensive evaluations on synthetic and real-world datasets confirm the model's superior performance in complex scenes with obstacles. PINFS introduces a novel framework for realistic and physically consistent rendering of complex fluid dynamics scenarios, pushing the boundaries of mixed physical and neural-based approaches. The code is available at https://github.com/zduan3/pinfs_code.
  • Item
    Fast Wavelet-domain Smoke Guiding
    (The Eurographics Association, 2024) Lyu, Luan; Ren, Xiaohua; Wu, Enhua; Yang, Zhi-Xin; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    We propose a simple and efficient wavelet-based method to guide smoke simulation with specific velocity fields. The method primarily uses wavelets to combine low-resolution velocities with high-resolution details for smoke guiding. Because wavelets naturally divide data into different frequency bands, we can merge low- and high-resolution velocities by replacing wavelet coefficients. Compared to Fourier methods, the wavelet transform can use wavelets with short, compact supports, making the transformation faster and more adaptable to various boundary conditions. The method has a time complexity of O(n) and a memory complexity of O(n). Additionally, the compact support of wavelets allows us to locally filter out or retain details by editing the wavelet coefficients, enabling local smoke editing. Moreover, to accelerate wavelet transforms on GPUs, we propose a CUDA technique called in-kernel warp-level wavelet transform computation, which uses warp-level CUDA intrinsic functions to reduce data reads during computation and thus improves the efficiency of the transform. Experiments demonstrate that our wavelet-based method achieves an approximate 5× speedup in 3D on GPUs compared to Fourier methods, resulting in an overall improvement of around 40% in the smoke-guided simulation.
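    The coefficient-replacement step can be sketched with an off-the-shelf wavelet library. The sketch below assumes the guiding velocity has already been upsampled to the simulation resolution; PyWavelets is used for brevity, while the paper's contribution lies in the CUDA warp-level transform:

    ```python
    import numpy as np
    import pywt

    def guide_velocity(sim_u, guide_u, wavelet="haar", level=3):
        """Keep the guide's coarse band and the simulation's detail bands."""
        sim = pywt.wavedecn(sim_u, wavelet, level=level)
        gui = pywt.wavedecn(guide_u, wavelet, level=level)
        sim[0] = gui[0]                  # replace only the low-frequency band
        return pywt.waverecn(sim, wavelet)

    # toy velocity component: large-scale guide plus simulated turbulence
    guide = np.fromfunction(lambda z, y, x: np.sin(x / 8.0), (64, 64, 64))
    sim = guide + 0.1 * np.random.default_rng(1).standard_normal((64, 64, 64))
    print(guide_velocity(sim, guide).shape)      # (64, 64, 64)
    ```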
  • Item
    TSDN: Transport-based Stylization for Dynamic NeRF
    (The Eurographics Association, 2024) Gong, Yuning; Song, Mingqing; Ren, Xiaohua; Liao, Yuanjun; Zhang, Yanci; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    While previous Neural Radiance Fields (NeRF) stylization methods achieve visually appealing results in transferring color style to static NeRF scenes, they lack the ability to stylize dynamic NeRF scenes with geometrically stylized features (like brushstrokes or feature elements from artists' works), which are also important for style transfer. However, directly stylizing each frame of a dynamic NeRF independently with geometrically stylized features leads to flickering results due to bad feature alignment. To overcome these problems, in this paper we propose Transport-based Stylization for Dynamic NeRF (TSDN), a new dynamic NeRF stylization method that is able to stylize geometric features and align them with the motion in the scene. TSDN utilizes stylization-guiding velocity fields to advect the dynamic NeRF toward stylized results and then transfers these velocity fields between frames to maintain feature alignment. Also, to deal with noisy stylized results due to the ambiguity of the deformation field, we propose a feature advection scheme and a novel regularization function specific to dynamic NeRF. The experimental results show that our method can stylize dynamic scenes with detailed geometrically stylized features from videos or multi-view image inputs, while preserving the original color style if desired, a capability not present in previous video stylization methods.
  • Item
    LO-Gaussian: Gaussian Splatting for Low-light and Overexposure Scenes through Simulated Filter
    (The Eurographics Association, 2024) You, Jingjiao; Zhang, Yuanyang; Zhou, Tianchen; Zhao, Yecheng; Yao, Li; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    Recent advancements in 3D Gaussian-based scene reconstruction and novel view synthesis have achieved impressive results. However, real-world images often suffer from adverse lighting conditions, which can hinder the performance of these techniques. Although progress has been made in addressing poor illumination, existing methods still struggle to accurately recover complex details in low-light and overexposed images. To address this challenge, we propose a method called LO-Gaussian, designed to recover illumination effectively in both low-light and overexposed scenes. Our approach simulates adverse lighting conditions through a filter that is jointly optimized with the original 3D Gaussian rendering during training. During inference, the simulated filter is removed, allowing the model to render the scene under normal lighting conditions. We validate the effectiveness of our method through experiments on two publicly available datasets that include both poorly illuminated scenes and their corresponding normal-illumination images. Experimental results demonstrate that LO-Gaussian consistently achieves optimal or near-optimal performance across these datasets, confirming the efficacy of our approach for illumination restoration.
  • Item
    Fast Approximation to Large-Kernel Edge-Preserving Filters by Recursive Reconstruction from Image Pyramids
    (The Eurographics Association, 2024) Xu, Tianchen; Yang, Jiale; Qin, Yiming; Sheng, Bin; Wu, Enhua; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    Edge-preserving filters, also known as bilateral filters, are fundamental to graphics rendering techniques, providing greater generality and capability of edge preservation than pure convolution filters. However, sampling with a large kernel per pixel for these filters can be computationally intensive in real-time rendering. Existing acceleration methods for approximating edge-preserving filters still struggle to balance blur controllability, edge clarity, and runtime efficiency. In this paper, we propose a novel scheme for approximating edge-preserving filters with large anisotropic kernels by recursively reconstructing them from multi-image pyramid (MIP) layers that are weightedly filtered in a dual 3×3 kernel space. Our approach introduces a concise unified processing pipeline, independent of kernel size, which includes upsampling and downsampling on MIP layers and enables the integration of custom edge-stopping functions. We also derive the implicit relations of the sampling weights and formulate a weight template model for inference. Furthermore, we convert the pipeline into a lightweight neural network for numerical solutions through data training. Consequently, our image post-processors achieve high-quality, high-performance edge-preserving filtering in real time, using the same control parameters as the original bilateral filters. These filters are applicable to depth of field, global illumination denoising, and screen-space particle rendering. The simplicity of the reconstruction process in our pipeline makes it user-friendly and cost-effective, saving both runtime and implementation costs.
  • Item
    P-NLOS: A Prompt-Based Method for Robust NLOS Imaging
    (The Eurographics Association, 2024) Su, Xiongfei; Zhu, Tianyi; Liu, Lina; Zhang, Yuanlong; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    The field of non-line-of-sight (NLOS) imaging is experiencing rapid advancement, offering the potential to reveal hidden scenes that are otherwise obscured from direct view. Despite this promise, NLOS systems face obstacles in managing a variety of sampling noise, as well as spatial and temporal variations, which limit their practical deployment. This paper introduces a novel strategy to overcome these challenges. It employs prompts to encode latent information, which is then leveraged to dynamically guide the NLOS reconstruction network. The proposed method, P-NLOS, consists of two branches: a reconstruction branch that handles the restoration of sampled information, and a prompting branch that captures the original information. The prompting branch supplies reliable content to the reconstruction branch, improving the guidance of the reconstruction process and the quality of the recovered images. Overall, P-NLOS demonstrates robustness in real-world applications by effectively handling a wide range of corruption types in NLOS reconstruction tasks, including varying noise levels, diverse blur kernels, and temporal resolution variations.
  • Item
    A Contrastive Unified Encoding Framework for Sticker Style Editing
    (The Eurographics Association, 2024) Ni, Zhihong; Li, Chengze; Liu, Hanyuan; Liu, Xueting; Wong, Tien-Tsin; Wen, Zhenkun; Wu, Huisi; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    Stickers are widely used in digital communication to enhance emotional and visual expressions. The conventional process of creating new sticker pack images involves time-consuming manual drawing, including meticulous color coordination and shading techniques for visual harmony. Learning the visual styles of distinct sticker packs would be critical to the overall process; however, existing solutions usually learn this style information within a limited number of style ''domains'', or per image. In this paper, we propose a contrastive learning framework that allows the style editing of an arbitrary sticker based on one or a number of style references with a continuous manifold to encapsulate all styles across sticker packs. The key to our approach is the encoding of styles into a unified latent space so that each sticker pack correlates with a unique style latent encoding. The contrastive loss ensures identical style latents within the same sticker pack, while distinct styles diverge. Through exposure to diverse sticker sets during training, our model crafts a consolidated continuous latent style space with strong expressive power, fostering seamless style transfer, interpolation, and mixing across sticker sets. Experiments show compelling style transfer results, with both qualitative and quantitative evaluations confirming the superiority of our method over existing approaches.
  • Item
    DViTGAN: Training ViTGANs with Diffusion
    (The Eurographics Association, 2024) Tong, Mengjun; Rao, Hong; Yang, Wenji; Chen, Shengbo; Zuo, Fang; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    Recent research findings indicate that injecting noise using diffusion can effectively improve the stability of GANs for image generation tasks. Although ViTGAN, which is based on the Vision Transformer, has certain performance advantages over traditional GANs, issues remain, such as unstable training and insufficiently rich detail in the generated images. Therefore, in this paper, we propose a novel model, DViTGAN, which leverages the diffusion model to generate instance noise that facilitates ViTGAN training. Specifically, we employ forward diffusion to progressively generate noise that follows a Gaussian mixture distribution, and then introduce the generated noise into the input image of the discriminator. The generator incorporates the discriminator's feedback by backpropagating through the forward diffusion process to improve its performance. In addition, we observe that the ViTGAN generator lacks positional information, leading to decreased context-modeling ability and slower convergence. To this end, we introduce Fourier embedding and relative positional encoding to enhance the model's expressive ability. Experiments on multiple popular benchmarks demonstrate the effectiveness of our proposed model.
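    The injected noise follows the standard closed-form DDPM forward process; sampling the timestep per example is what produces the Gaussian-mixture instance noise mentioned above. A generic sketch (not the authors' code):

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def diffuse(x0, alpha_bar, t_max=500):
        """x_t = sqrt(a_bar_t) * x_0 + sqrt(1 - a_bar_t) * eps, per example."""
        t = rng.integers(0, t_max, size=x0.shape[0])     # random timestep per image
        a = alpha_bar[t].reshape(-1, 1, 1, 1)            # broadcast over C, H, W
        eps = rng.standard_normal(x0.shape)
        return np.sqrt(a) * x0 + np.sqrt(1.0 - a) * eps

    betas = np.linspace(1e-4, 0.02, 1000)                # linear beta schedule
    alpha_bar = np.cumprod(1.0 - betas)
    fake = rng.standard_normal((8, 3, 32, 32))           # toy generator output
    noisy_fake = diffuse(fake, alpha_bar)                # what the discriminator sees
    ```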
  • Item
    Computational Mis-Drape Detection and Rectification
    (The Eurographics Association, 2024) Shin, Hyeon-Seung; Ko, Hyeong-Seok; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    For various reasons, mis-drapes occur in physically-based clothing simulation. Therefore, when developing a virtual try-on system that works without any human operators, a technique to algorithmically detect and rectify mis-drapes has to be developed. This paper makes a first attempt in that direction by defining two mis-drape determinants, namely, the Gaussian and crease mis-drape determinants. In experiments performed on various avatar-garment combinations, the proposed determinants identify mis-drapes accurately. This paper also proposes a treatment that can be applied to rectify the mis-drapes. The proposed treatment successfully resolves the mis-drapes without unnecessarily destroying the original drape.
  • Item
    Self-Supervised Multi-Layer Garment Animation Generation Network
    (The Eurographics Association, 2024) Han, Guoqing; Shi, Min; Mao, Tianlu; Wang, Xinran; Zhu, Dengming; Gao, Lin; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    This paper presents a self-supervised multi-layer garment animation generation network. The complexity inherent in multi-layer garments, particularly the diverse interactions between layers, makes it challenging to generate continuous, stable, physically accurate, and visually realistic garment deformation animations. To tackle these challenges, we present the Self-Supervised Multi-Layer Garment Animation Generation Network (SMLN). The architecture of SMLN is based on graph neural networks, representing garment models uniformly as graph structures, thereby naturally depicting the hierarchical structure of garments and capturing the relationships between garment layers. Unlike existing multi-layer garment deformation methods, we model interaction forces such as friction and repulsion between garment layers, translating physical laws consistent with dynamics into network constraints, and penalize garment deformation regions that exceed these constraints. Furthermore, instead of the traditional post-processing approach of fixed vertex-displacement calculation for handling collision interactions, we add a repulsion constraint layer within the network to update the corresponding repulsive-force acceleration, thereby adaptively managing collisions between garment layers. Our self-supervised modeling approach enables the network to learn without relying on garment sample datasets. Experimental results demonstrate that our method generates visually plausible multi-layer garment deformation effects, surpassing existing methods in both visual quality and evaluation metrics.
  • Item
    SLGDiffuser: Stroke-level Guidance Diffusion Model for Complex Scene Text Editing
    (The Eurographics Association, 2024) Liu, Xiao Le; Wu, Lei; Wang, Chang Shuo; Dong, Pei; Meng, Xiang Xu; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    Scene Text Editing (STE) focuses on replacing text in images while preserving style and background. Existing methods often grapple with simultaneously learning different transformation rules for text and background, especially in complex scenes. This leads to several notable challenges, such as low accuracy in content, ineffective extraction of text styles, and suboptimal background reconstruction. To address these challenges, we introduce SLGDiffuser, a stroke-level guidance diffusion model specifically designed for complex scene text editing. SLGDiffuser features a stroke-level guidance text conversion module that processes target text through character encoding and utilizes ContourLoss with stroke features to improve text accuracy. It also benefits from the proposed stroke-enhanced strategy, which enhances text integrity by leveraging detailed stroke information. Furthermore, we introduce a unified instruction-based background reconstruction module that fine-tunes a pre-trained diffusion model. It enables the application of a standardized instruction prompt to reconstruct a variety of complex scenes effectively. Tested extensively, our model outperforms existing methods across diverse real-world datasets. We release code and model weights at https://github.com/lxlde/SLGDiffuser
  • Item
    Modeling Sketches both Semantically and Structurally for Zero-Shot Sketch-Based Image Retrieval is Better
    (The Eurographics Association, 2024) Jing, Jiansen; Liu, Yujie; Li, Mingyue; Xiao, Qian; Chai, Shijie; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    Sketch, as a representation of human thought, is abstract but also structured, because it is presented as a two-dimensional image. Therefore, modeling it from both semantic and structural perspectives is reasonable and effective. In this paper, for semantic capture, we compare the performance of two mainstream pre-trained models on the Zero-Shot Sketch-Based Image Retrieval (ZS-SBIR) task and propose a new model, Semantic Net (SNET), based on Contrastive Language-Image Pre-training (CLIP), with a more effective fine-tuning strategy and a Semantic Preservation Module. Furthermore, we propose three lightweight modules, Channels Fusion (CF), Layers Fusion (LF), and Semantic Structure Fusion (SSF), to endow SNET with stronger structure-capturing ability. Finally, we supervise the entire training process with a classification loss based on contrastive learning and a bidirectional triplet loss based on a cosine distance metric. We call the final model Semantic Structure Net (SSNET). Quantitative experimental results show that both the proposed SNET and the enhanced SSNET achieve a new SOTA (a 16% retrieval boost on the most difficult QuickDraw Ext dataset). The visualization experiments further corroborate our view of sketch modeling.
  • Item
    Editing Compact Voxel Representations on the GPU
    (The Eurographics Association, 2024) Molenaar, Mathijs; Eisemann, Elmar; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    A Sparse Voxel Directed Acyclic Graph (SVDAG) is an efficient representation for displaying and storing a highly detailed voxel scene in a very compact data structure. Yet, editing such a high-resolution scene in real time is challenging. Existing solutions are hybrid, involving the CPU, and are restricted to small local modifications. In this work, we address this bottleneck and propose a solution that performs edits fully on the graphics card, enabled by dynamic GPU hash tables. Our framework makes large editing operations, such as 3D painting, possible at real-time frame rates.
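    For background, the invariant that makes an SVDAG compact, and that edits must maintain, is that identical subtrees are stored once, found via a hash of their child tuples. The dict-based sketch below illustrates only this deduplication invariant on the CPU; the paper's contribution is performing such insertions in a dynamic hash table on the GPU:

    ```python
    def deduplicate_octree(node, table):
        """Bottom-up deduplication of an octree into a DAG of unique nodes."""
        if isinstance(node, bool):          # leaf voxel: occupied or empty
            return node
        key = tuple(deduplicate_octree(c, table) for c in node)  # 8 children
        if key not in table:
            table[key] = len(table)         # fresh node id for a new subtree
        return table[key]

    full = (True,) * 8                      # a fully occupied child brick
    tree = (full, full, False, False, False, False, False, False)
    table = {}
    root = deduplicate_octree(tree, table)
    print(len(table))                       # 2: the shared 'full' node and the root
    ```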
  • Item
    DreamMapping: High-Fidelity Text-to-3D Generation via Variational Distribution Mapping
    (The Eurographics Association, 2024) Cai, Zeyu; Wang, Duotun; Liang, Yixun; Shao, Zhijing; Chen, Ying-Cong; Zhan, Xiaohang; Wang, Zeyu; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    Score Distillation Sampling (SDS) has emerged as a prevalent technique for text-to-3D generation, enabling 3D content creation by distilling view-dependent information from text-to-2D guidance. However, it frequently exhibits shortcomings such as over-saturated colors and excessive smoothness. In this paper, we conduct a thorough analysis of SDS and refine its formulation, finding that its core design is to model the distribution of rendered images. Following this insight, we introduce a novel strategy called Variational Distribution Mapping (VDM), which expedites the distribution-modeling process by regarding the rendered images as instances of degradation from diffusion-based generation. This design enables the efficient training of the variational distribution by skipping the calculation of Jacobians in the diffusion U-Net. We also introduce timestep-dependent Distribution Coefficient Annealing (DCA) to further improve distillation precision. Leveraging VDM and DCA, we use Gaussian Splatting as the 3D representation and build a text-to-3D generation framework. Extensive experiments and evaluations demonstrate the capability of VDM and DCA to generate high-fidelity and realistic assets with optimization efficiency.
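    For reference, the SDS gradient being analysed is standardly written (following DreamFusion) as

    ```latex
    \nabla_\theta \mathcal{L}_{\mathrm{SDS}}(\theta)
      = \mathbb{E}_{t,\epsilon}\Big[\, w(t)\,
        \big(\epsilon_\phi(x_t;\, y,\, t) - \epsilon\big)\,
        \frac{\partial x}{\partial \theta} \Big],
    \qquad
    x_t = \sqrt{\bar\alpha_t}\, x + \sqrt{1-\bar\alpha_t}\,\epsilon,
    ```

    where x = g(θ) is the rendered image, y the text prompt, and ε_φ the frozen diffusion model's noise prediction; note that the U-Net Jacobian ∂ε_φ/∂x_t is already omitted in this formulation, which parallels the Jacobian-skipping that VDM applies when training the variational distribution.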
  • Item
    High-Quality Cage Generation Based on SDF
    (The Eurographics Association, 2024) Qiu, Hao; Liao, Wentao; Chen, Renjie; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    Cages are widely used in various applications of computer graphics, including physically-based rendering, shape deformation, and physical simulation. Given an input shape, we present an efficient and robust method for the automatic construction of a high-quality cage. Our method follows the envelope-and-simplify paradigm. In the enveloping stage, an isosurface enclosing the model is extracted from the signed distance field (SDF) of the shape. Leveraging the versatility of the SDF, we propose a straightforward modification to it that gives the resulting isosurface a better topological structure and lets it capture the details of the shape well. In the simplification stage, we use the quadric error metric to simplify the isosurface into a cage, while rigorously ensuring that the cage remains enclosing and free of self-intersections. We further optimize various qualities of the cage for different applications, including its distance to the original mesh and its meshing quality. The cage generated by our method is guaranteed to strictly enclose the input shape, to be free of self-intersections, to have the user-specified complexity, and to provide a good approximation of the input, as required by various applications. Through extensive experiments, we demonstrate that our method is robust and efficient for a wide variety of shapes with complex geometry and topology.
  • Item
    GGAvatar: Dynamic Facial Geometric Adjustment for Gaussian Head Avatar
    (The Eurographics Association, 2024) Li, Xinyang; Wang, Jiaxin; Xuan, Yixin; Yao, Gongxin; Pan, Yu; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    Reconstructing animatable 3D head avatars from target subject videos has long been a significant challenge and a hot topic in computer graphics. This paper proposes GGAvatar, a novel 3D avatar representation designed to robustly model dynamic head avatars with complex identities and deformations. GGAvatar employs a coarse-to-fine structure, featuring two core modules: a Neutral Gaussian Initialization Module and a Geometry Morph Adjuster. The Neutral Gaussian Initialization Module pairs Gaussian primitives with deformable triangular meshes, using an adaptive density control strategy to model the geometric structure of the target subject with neutral expressions. The Geometry Morph Adjuster introduces deformation bases for each Gaussian in global space, creating fine-grained low-dimensional representations of deformations to overcome the limitations of the Linear Blend Skinning formula. Extensive experiments show that GGAvatar can produce high-fidelity renderings, outperforming state-of-the-art methods in visual quality and quantitative metrics.
  • Item
    Enhancing Human Optical Flow via 3D Spectral Prior
    (The Eurographics Association, 2024) Mao, Shiwei; Sun, Mingze; Huang, Ruqi; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    In this paper, we consider the problem of human optical flow estimation, which is critical in a series of human-centric computer vision tasks. Recent deep learning-based optical flow models have achieved considerable accuracy and generalization by incorporating various kinds of priors. However, most either rely on large-scale 2D annotations or on rigid priors, overlooking the 3D non-rigid nature of human articulation. To this end, we advocate enhancing human optical flow estimation via 3D spectral prior-aware pretraining, which is based on the well-known functional maps formulation from 3D shape matching. Our pretraining can be performed with synthetic human shapes. More specifically, we first render shapes to images and then leverage the natural inclusion maps from images to shapes to lift 2D optical flow into 3D correspondences, which are further encoded as functional maps. This lifting operation allows us to inject the intrinsic geometric features encoded in the spectral representations into optical flow learning, improving the latter, especially in the presence of non-rigid deformations. In practice, we establish a pretraining pipeline tailored for triangular meshes, which is agnostic to the target optical flow network. Notably, it introduces no additional learnable parameters and requires only some pre-computed eigendecompositions of the meshes. For RAFT and GMA, our pretraining task achieves improvements of 12.8% and 4.9% in AEPE on the SHOF benchmark, respectively.
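    For reference, once a dense vertex-level map T: M → N is available, its standard functional-map encoding (in the sense of Ovsjanikov et al.) is

    ```latex
    C \;=\; \Phi_M^{\dagger}\, \Pi\, \Phi_N ,
    \qquad
    \Pi_{ij} =
      \begin{cases}
        1 & \text{if } j = T(i),\\
        0 & \text{otherwise},
      \end{cases}
    ```

    where Φ_M and Φ_N stack the first k Laplace-Beltrami eigenfunctions of each mesh and † denotes the Moore-Penrose pseudo-inverse, so C is a compact k × k matrix. This is the spectral representation referred to above, and the eigendecompositions are the only precomputation the pipeline needs.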
  • Item
    GazeMoDiff: Gaze-guided Diffusion Model for Stochastic Human Motion Prediction
    (The Eurographics Association, 2024) Yan, Haodong; Hu, Zhiming; Schmitt, Syn; Bulling, Andreas; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    Human motion prediction is important for many virtual and augmented reality (VR/AR) applications such as collision avoidance and realistic avatar generation. Existing methods have synthesised body motion only from observed past motion, despite the fact that human eye gaze is known to correlate strongly with body movements and is readily available in recent VR/AR headsets. We present GazeMoDiff, a novel gaze-guided denoising diffusion model for generating stochastic human motions. Our method first uses a gaze encoder and a motion encoder to extract the gaze and motion features respectively, then employs a graph attention network to fuse these features, and finally injects the gaze-motion features into a noise prediction network via a cross-attention mechanism to progressively generate multiple plausible future human motions. Extensive experiments on the MoGaze and GIMO datasets demonstrate that our method outperforms the state of the art by a large margin in terms of multi-modal final displacement error (17.3% on MoGaze and 13.3% on GIMO). We further conducted a human study (N=21) and validated that the motions generated by our method were perceived as both more precise and more realistic than those of prior methods. Taken together, these results reveal the significant information content available in eye gaze for stochastic human motion prediction as well as the effectiveness of our method in exploiting this information.
  • Item
    GamePose: Self-Supervised 3D Human Pose Estimation from Multi-View Game Videos
    (The Eurographics Association, 2024) Zhou, Yang; Guo, Tianze; Xu, Hao; Wei, Xilei; Xu, Lang; Tang, Xiangjun; Yang, Sipeng; Kou, Qilong; Jin, Xiaogang; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    Recovering 3D character animations from published games is crucial when the original animation assets are lost. One solution is to use 3D human pose estimation with single or multiple views. Our insight is to preserve the ease of use of single-view estimation while enhancing its accuracy by leveraging information from multi-view videos. This is a difficult task: it requires explicitly modelling the correlation of multi-view input to achieve superior accuracy, and converting the multi-view correlation model into a single-view model without impacting that accuracy, both of which are unresolved problems. To this end, we propose a novel self-supervised 3D pose estimation framework that models the correlation of multi-view input during training and can predict highly accurate estimates for single-view input. Our framework consists of two main components: the Single-View Module (SM) and the Cross-View Module (CM). The SM predicts approximate 3D poses and extracts features from a single viewpoint, while the CM enhances the learning process by modelling correlations across multiple viewpoints. This design facilitates effective self-distillation, improving the accuracy of single-view estimation. As a result, our method supports highly accurate inference with both multi-view and single-view data. We validate our method on 3D human pose estimation benchmarks and create a new dataset using Mixamo assets to demonstrate its applicability in gaming scenarios. Extensive experiments show that our approach outperforms state-of-the-art methods in self-supervised learning scenarios.
  • Item
    Feature Separation Graph Convolutional Networks for Skeleton-Based Action Recognition
    (The Eurographics Association, 2024) Zhang, Lingyan; Ling, Wanyu; Daizhou, Shuwen; Kuang, Li; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    Graph Convolutional Networks have made significant advancements in skeleton-based action recognition. However, most existing methods process body features globally, overlooking the challenges posed by partial visual occlusion, which severely impairs the model's recognition capability when body parts are obscured. To address this issue, we propose Feature Separation Graph Convolutional Networks (FS-GCN), consisting of Feature Separation Modeling (FSM) and Exchange Modeling (EM). FSM strategically separates the skeleton feature into essential body parts, placing emphasis on upper-body features while seamlessly integrating lower-body features. This allows FS-GCN to better capture the distinctive spatial and temporal characteristics associated with each body segment. EM facilitates the swapping of body-half correlation matrices between different graph convolution modules, eliminating discrepancies and enabling a more robust and unified global information processing framework. Furthermore, FS-GCN divides the adaptive graph into two key parts for graph contrastive learning, extracting more intra-class contrastive information during training. FS-GCN achieves state-of-the-art performance on the NTU RGB+D, NTU RGB+D 120, and NW-UCLA datasets, especially in line-of-sight-obstructed scenarios.
  • Item
    High-Quality Geometry and Texture Editing of Neural Radiance Field
    (The Eurographics Association, 2024) Kim, Soongjin; Son, Jooeun; Ju, Gwangjin; Lee, Joo Ho; Lee, Seungyong; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    Recent advances in Neural Radiance Fields (NeRF) have demonstrated impressive rendering quality reconstructed from input images. However, the density-based radiance field representation entangles geometry and texture, limiting editability. To address this issue, NeuMesh proposed a mesh-based NeRF editing method supporting deformation and texture editing. Still, it fails to reconstruct and render fine details of the input images, and the dependency between the rendering scheme and the geometry limits editability for target scenes. In this paper, we propose an intermediate scene representation in which a near-surface volume is associated with a guide mesh. Our key idea is to separate a given scene into geometry, a parameterized texture space, and a radiance field. We define a mapping between the ambient 3D coordinate space and a surface-aligned coordinate system, given by the combination of the mesh parameterization and the height above the mesh surface, to efficiently encode the near-surface volume. With the surface-aligned radiance field defined in the near-surface volume, our method generates high-quality rendering results with high-frequency details. Our method also supports various geometry and appearance editing operations while preserving high rendering quality. We demonstrate the performance of our method by comparing it with state-of-the-art methods both qualitatively and quantitatively, and show applications including shape deformation, texture filling, and texture painting.
  • Item
    3D-SSGAN: Lifting 2D Semantics for 3D-Aware Compositional Portrait Synthesis
    (The Eurographics Association, 2024) Liu, Ruiqi; Zheng, Peng; Wang, Ye; Ma, Rui; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    Existing 3D-aware portrait synthesis methods can generate impressive high-quality images while preserving strong 3D consistency. However, most of them cannot support the fine-grained part-level control over synthesized images. Conversely, some GAN-based 2D portrait synthesis methods can achieve clear disentanglement of facial regions, but they cannot preserve view consistency due to a lack of 3D modeling abilities. To address these issues, we propose 3D-SSGAN, a novel framework for 3D-aware compositional portrait image synthesis. First, a simple yet effective depth-guided 2D-to-3D lifting module maps the generated 2D part features and semantics to 3D. Then, a volume renderer with a novel 3D-aware semantic mask renderer is utilized to produce the composed face features and corresponding masks. The whole framework is trained end-to-end by discriminating between real and synthesized 2D images and their semantic masks. Quantitative and qualitative evaluations demonstrate the superiority of 3D-SSGAN in controllable part-level synthesis while preserving 3D view consistency.
  • Item
    3DStyleGLIP: Part-Tailored Text-Guided 3D Neural Stylization
    (The Eurographics Association, 2024) Chung, SeungJeh; Park, JooHyun; Kang, HyeongYeop; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    3D stylization, the application of specific styles to three-dimensional objects, offers substantial commercial potential by enabling the creation of uniquely styled 3D objects tailored to diverse scenes. Recent advancements in artificial intelligence and text-driven manipulation methods have made the stylization process increasingly intuitive and automated. While these methods reduce human costs by minimizing reliance on manual labor and expertise, they predominantly focus on holistic stylization, neglecting the application of desired styles to individual components of a 3D object. This limitation restricts fine-grained controllability. To address this gap, we introduce 3DStyleGLIP, a novel framework specifically designed for text-driven, part-tailored 3D stylization. Given a 3D mesh and a text prompt, 3DStyleGLIP utilizes the vision-language embedding space of the Grounded Language-Image Pre-training (GLIP) model to localize individual parts of the 3D mesh and modify their appearance to match the styles specified in the text prompt. 3DStyleGLIP effectively integrates part localization and stylization guidance within GLIP's shared embedding space through an end-to-end process, enabled by a part-level style loss and two complementary learning techniques. This neural methodology meets the user's need for fine-grained style editing and delivers high-quality part-specific stylization results, opening new possibilities for customization and flexibility in 3D content creation. Our code and results are available at https://github.com/sj978/3DStyleGLIP.
  • Item
    CKD-LQPOSE: Towards a Real-World Low-quality Cross-Task Distilled Pose Estimation Architecture
    (The Eurographics Association, 2024) Liu, Tao; Yao, Beiji; Huang, Jun; Wang, Ya; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    Although human pose estimation (HPE) methods have achieved promising results, they remain challenged in real-world low-quality (LQ) scenarios. Moreover, due to the general lack of LQ information in current public HPE datasets, it is difficult to accurately evaluate the performance of HPE methods in LQ scenarios. Hence, we propose a novel CKD-LQPose architecture, the first HPE architecture to fuse cross-task feature information, using a cross-task distillation method to merge HPE information with well-quality (WQ) information. The CKD-LQPose architecture effectively enables adaptive feature learning from LQ images and improves their quality to enhance HPE performance. Additionally, we introduce the PatchWQ-Gan module to obtain WQ information and the refined transformer decoder (RTD) module to refine the features further. In the inference stage, CKD-LQPose removes the PatchWQ-Gan and RTD modules to reduce the computational burden. Furthermore, to accurately evaluate HPE methods in LQ scenarios, we develop an RLQPose-DS test benchmark. Extensive experiments on RLQPose-DS, real-world images, and LQ versions of well-known datasets such as COCO, MPII, and CrowdPose demonstrate that CKD-LQPose outperforms state-of-the-art approaches by a large margin, demonstrating its effectiveness in real-world LQ scenarios.
  • Item
    Colorectal Protrusions Detection based on Conformal Colon Flattening
    (The Eurographics Association, 2024) Ren, Yuxue; Hu, Wei; Li, Zhengbin; Chen, Wei; Lei, Na; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    We propose an innovative approach to automatically detect colorectal protrusions, including polyps, on the colon surface. The approach comprises two successive stages. In the first stage, we identify single protrusions and extract folds containing suspected protrusions in the flattened colon image by integrating shape analysis with curvature rendering and conformal colon flattening. This stage enables accurate and rapid detection of single protrusions, especially flat ones, since the 3D protrusion detection problem is converted into a 2D pattern recognition problem. To detect protrusions on folds, the folds containing suspected protrusions are inversely mapped back to the 3D colon surface in the second stage. We detect protrusions in the 3D surface area by curvature-based analysis and reduce false positives by quadratic surface fitting. We evaluated our method on real colon data from the National CT Colonography Trial of the American College of Radiology Imaging Network (ACRIN, 6664). Experimental results show that our method can efficiently and accurately identify protrusion lesions, is robust to noise, and is suitable for implementation within CTC-CAD systems.
  • Item
    A Fiber Image Classification Strategy Based on Key Module Localization
    (The Eurographics Association, 2024) Ji, Ya Tu; Xue, Xiang; Liu, Yang; Xu, H. T.; Ren, Q. D. E. J.; Shi, B.; Wu, N. E.; Lu, M.; Xu, X. X.; Wang, L.; Dai, L. J.; Yao, M. M.; Li, X. M.; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    Traditional image classification approaches divide the fiber image into several non-overlapping patches during the embedding stage. However, for fine-grained image data, this coarse division leaves the model unable to capture local structure within each patch. In addition, fiber features occupy only a small, densely distributed portion of the image, while irrelevant interference noise occupies the vast majority. Therefore, this paper proposes a strategy to address these issues. Firstly, ResNeXt-50 is used to obtain prior information such as inductive bias and translation invariance. Then, a lightweight Coordinate Attention module focuses the model on the interior of the fibers rather than on background information. Finally, this information is fed to a Grad-CAM module to accurately identify the fiber-interior regions of interest. The proposed approach shows significant advantages over multiple strong baseline models on the test data provided by the National Fiber Quality Testing Center, as it can effectively learn fiber skeleton features and achieve finer-grained modeling.
  • Item
    Img2PatchSeqAD: Industrial Image Anomaly Detection Based on Image Patch Sequence
    (The Eurographics Association, 2024) Liu, Yang; Ji, Ya Tu; Xue, Xiang; Xu, H. T.; Ren, Qing Dao Er Ji; Shi, Bao; Wu, N. E.; Lu, M.; Xu, Xuan Xuan; Guo, H. X.; Wang, L.; Dai, L. J.; Yao, Miao Miao; Li, Xiao Mei; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    In the domain of industrial Visual Anomaly Detection (VAD), methods based on image reconstruction are the most popular and successful approaches. However, current image reconstruction methods rely on global image information, which proves to be both blind and inefficient for anomaly detection tasks. Our approach tackles these limitations by using neighboring image patches to assess the presence of anomalies in the current patch and then selectively reconstructing those patches. In this paper, we introduce a novel architecture for image anomaly detection, named Img2PatchSeqAD. Specifically, we employ a row-wise scanning method to construct sequences of image patches and design a network framework based on an image-patch-sequence encoder-decoder structure. Additionally, we utilize the KAN model and the ELA attention mechanism to develop methods for image patch vectorization and to establish an image reconstruction pipeline. Experimental results on the MVTec-AD and VisA datasets demonstrate the effectiveness of our approach, achieving localization and detection scores of 81.3 (AUROC) and 91.9 (AP) on the multi-class MVTec-AD dataset.
  • Item
    "Yunluo Journey": A VR Cultural Experience for the Chinese Musical Instrument
    (The Eurographics Association, 2024) Wang, Yuqiu; Guo, Wenchen; He, Zhiting; Fan, Min; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    The sustainability of the cultural heritage of traditional musical instruments requires integrating musical culture into people's daily lives. However, the Yunluo, a traditional Chinese musical instrument, is too large and expensive to be easily incorporated into everyday life. To promote the sustainability and dissemination of Yunluo culture, we designed a VR cultural experience that allows people to engage in the creation and performance of the Yunluo, as well as learn about its historical and cultural significance. This embodied, gamified, and contextualized VR experience aims to enhance participants' interest in Yunluo culture and improve their understanding and appreciation of the related knowledge.
  • Item
    Simulating Viscous Fluid Using Free Surface Lattice Boltzmann Method
    (The Eurographics Association, 2024) Sun, Dakun; Gao, Yang; Xie, Xueguang; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    High-viscosity fluid simulation remains a significant area of interest within the graphics field. However, there has been little discussion of simulating viscous fluids in computer graphics with the Lattice Boltzmann Method (LBM). In this study, we demonstrate the feasibility of using LBM for viscous fluid simulation and show a caveat regarding external forces. Previous methods for viscous fluids (such as FLIP, MPM, and SPH) are mainly based on the Navier-Stokes (NS) equations, where the external force is independent of viscosity in the governing equation; the decision to neglect the external force therefore depends solely on its magnitude. In the Lattice Boltzmann Equation (LBE), however, external forces are intertwined with viscosity within the collision term, making the choice to ignore the external force term dependent on both the viscosity and the force's magnitude. This has not been noted in previous studies, and we demonstrate its importance through comparison experiments.
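    The coupling the abstract refers to can be made concrete with the standard forcing scheme of Guo et al. (2002); the abstract does not say which discretisation the paper uses, so the following is only the generic form. The relaxation time τ sets the kinematic viscosity ν, and the same τ scales the discrete force term F_i, so force and viscosity cannot be reasoned about independently:

    ```latex
    % Generic BGK lattice Boltzmann update with Guo forcing (an assumed
    % scheme for illustration; the paper's exact discretisation may differ).
    \begin{align}
      f_i(\mathbf{x}+\mathbf{e}_i\Delta t,\ t+\Delta t)
        &= f_i(\mathbf{x},t) - \tfrac{1}{\tau}\bigl(f_i - f_i^{\mathrm{eq}}\bigr)
           + \Delta t\, F_i,\\
      F_i &= \Bigl(1-\tfrac{1}{2\tau}\Bigr) w_i
             \Bigl[\tfrac{\mathbf{e}_i-\mathbf{u}}{c_s^2}
                  +\tfrac{(\mathbf{e}_i\cdot\mathbf{u})\,\mathbf{e}_i}{c_s^4}\Bigr]
             \cdot\mathbf{F},
      \qquad \nu = c_s^2\Bigl(\tau-\tfrac{1}{2}\Bigr)\Delta t.
    \end{align}
    ```

    Because the prefactor (1 − 1/(2τ)) varies with τ, the contribution of a force of fixed magnitude changes with the viscosity, which is precisely why neglecting the force term in the LBE is a viscosity-dependent decision rather than a magnitude-only one.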
  • Item
    SPDD-YOLO for Small Object Detection in UAV Images
    (The Eurographics Association, 2024) Xue, Xiang; Ji, Ya Tu; Liu, Yang; Xu, H. T.; Ren, Q. D. E. J.; Shi, B.; Wu, N. E.; Lu, M.; Zhuang, X. F.; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    Aerial images captured by drones often suffer from blurriness and low resolution, which is particularly problematic for small targets. In such scenarios, the YOLO object detection algorithm tends to confuse or misidentify targets such as bicycles and tricycles due to their complex features and local similarities. To address these issues, this paper proposes an SPDD-YOLO model based on YOLOv8. First, the model enhances its ability to extract local features of small targets by introducing the Spatial-to-Depth Module (SPDM). Second, because SPDM reduces the receptive field and leads the model to focus overly on local features, we introduce Depthwise Separable Dilated Convolution (DSDC), which expands the receptive field while reducing parameters and, together with SPDM, forms the Deep Dilated Module (DDM). Experiments on the VisDrone2019 dataset demonstrate that the proposed model improves precision, recall, and mAP50 by 5.8%, 5.7%, and 6.4%, respectively.
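    Both building blocks are standard enough to sketch. Below is a minimal PyTorch version of a generic space-to-depth rearrangement followed by a dilated depthwise separable convolution; block size, dilation, and channel counts are illustrative assumptions, and the paper's DDM wiring is not specified by the abstract.

    ```python
    import torch
    import torch.nn as nn

    class SpatialToDepth(nn.Module):
        """Space-to-depth: each 2x2 spatial block becomes 4x the channels,
        downsampling without discarding pixels (the idea behind SPDM)."""
        def __init__(self, block: int = 2):
            super().__init__()
            self.block = block

        def forward(self, x):
            return nn.functional.pixel_unshuffle(x, self.block)

    class DilatedDepthwiseSeparable(nn.Module):
        """Depthwise separable convolution with dilation, re-enlarging the
        receptive field cheaply after space-to-depth; sizes are assumptions."""
        def __init__(self, in_ch: int, out_ch: int, dilation: int = 2):
            super().__init__()
            self.depthwise = nn.Conv2d(in_ch, in_ch, 3, padding=dilation,
                                       dilation=dilation, groups=in_ch)
            self.pointwise = nn.Conv2d(in_ch, out_ch, 1)

        def forward(self, x):
            return self.pointwise(self.depthwise(x))

    # Example: a 2x space-to-depth quadruples channels; the dilated depthwise
    # separable conv then mixes them while widening the receptive field.
    block = nn.Sequential(SpatialToDepth(2), DilatedDepthwiseSeparable(4 * 64, 128))
    out = block(torch.randn(1, 64, 80, 80))  # -> (1, 128, 40, 40)
    ```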
  • Item
    Self-Supervised Multi-Layer Garment Animation Generation Network
    (The Eurographics Association, 2024) Han, Guoqing; Shi, Min; Mao, Tianlu; Wang, Xinran; Zhu, Dengming; Gao, Lin; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    This paper presents a self-supervised multi-layer garment animation generation network. The complexity inherent in multi-layer garments, particularly the diverse interactions between layers, makes it challenging to generate continuous, stable, physically accurate, and visually realistic garment deformation animations. To tackle these challenges, we present the Self-Supervised Multi-Layer Garment Animation Generation Network (SMLN). SMLN is built on graph neural networks and represents garment models uniformly as graph structures, naturally depicting the hierarchical structure of garments and capturing the relationships between garment layers. Unlike existing multi-layer garment deformation methods, we model interaction forces such as friction and repulsion between garment layers, translating physical laws consistent with dynamics into network constraints, and we penalize garment deformation regions that violate these constraints. Furthermore, instead of the traditional post-processing approach of fixed vertex-displacement correction for handling collisions, we add a repulsion constraint layer within the network to update the corresponding repulsive-force acceleration, thereby adaptively managing collisions between garment layers. Our self-supervised modeling approach enables the network to learn without relying on garment sample datasets. Experimental results demonstrate that our method generates visually plausible multi-layer garment deformation effects, surpassing existing methods in both visual quality and evaluation metrics.
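    The idea of turning a contact constraint into a self-supervised loss can be illustrated with a simple hinge penalty. The sketch below is only one plausible realisation of such a constraint, not the paper's repulsion layer: it assumes a precomputed nearest-vertex correspondence between layers and an assumed rest-thickness margin.

    ```python
    import torch

    def repulsion_penalty(outer_verts: torch.Tensor,
                          inner_verts: torch.Tensor,
                          inner_normals: torch.Tensor,
                          margin: float = 2e-3) -> torch.Tensor:
        """Hinge-style penalty on an outer garment layer penetrating the layer
        beneath it. Assumes each outer vertex is paired with its nearest inner
        vertex and that vertex's unit normal; `margin` is an assumed rest
        thickness. All tensors are (N, 3)."""
        # Signed distance of each outer vertex along its paired inner normal:
        # negative means the outer layer is inside the inner one.
        d = ((outer_verts - inner_verts) * inner_normals).sum(dim=-1)
        # Penalise vertices closer than the margin (or penetrating), quadratically.
        return torch.relu(margin - d).pow(2).mean()
    ```

    Added to the training objective, a term like this lets the network learn collision-free configurations without ground-truth simulation data, which is the self-supervised flavour the abstract describes.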
  • Item
    CNCUR : A simple 2D Curve Reconstruction Algorithm based on constrained neighbours
    (The Eurographics Association, 2024) Antony, Joms; Reghunath, Minu; Muthuganapathy, Ramanathan; Chen, Renjie; Ritschel, Tobias; Whiting, Emily
    Given a planar point set S ⊂ R2 (where S = {v1, . . . , vn}) sampled from an unknown curve Σ, the goal is to obtain a piecewise-linear reconstruction of the curve from S that best approximates Σ. In this work, we propose a simple and intuitive Delaunay triangulation (DT)-based algorithm for curve reconstruction. We start by constructing the DT of the input point set. Next, we identify the set of edges ENp in the natural neighborhood of each point p in the DT, and from ENp we retain the two shortest edges connected to each point. To handle open curves, one of the retained edges has to be removed based on a parameter δ, the allowable ratio between the maximum and minimum retained edge lengths, which is used to eliminate the longer edge. Our algorithm inherently handles self-intersections, multiple components, sharp corners, and different levels of Gaussian noise, all without requiring pre-processing or post-processing and with δ as the only parameter.
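    The edge-filtering idea is compact enough to sketch end to end. The version below follows the abstract's description rather than the authors' reference code: it approximates the natural-neighborhood edges of p by the DT edges incident to p, and the default δ is an assumed value.

    ```python
    import numpy as np
    from scipy.spatial import Delaunay

    def cncur_like(points: np.ndarray, delta: float = 1.5) -> set:
        """CNCUR-style reconstruction sketch: build the DT, keep each point's
        two shortest incident edges, and drop the longer one when the length
        ratio exceeds `delta` (open-curve endpoints). `points` is (n, 2)."""
        tri = Delaunay(points)
        incident = {i: set() for i in range(len(points))}
        for simplex in tri.simplices:          # collect unique DT edges per vertex
            for a in range(3):
                for b in range(a + 1, 3):
                    u, v = sorted((int(simplex[a]), int(simplex[b])))
                    incident[u].add((u, v))
                    incident[v].add((u, v))

        length = lambda e: np.linalg.norm(points[e[0]] - points[e[1]])
        kept = set()
        for p, edges in incident.items():
            two = sorted(edges, key=length)[:2]  # two shortest edges at p
            if len(two) == 2 and length(two[1]) / max(length(two[0]), 1e-12) > delta:
                two = two[:1]                    # open-curve endpoint: drop the longer edge
            kept.update(two)
        return kept  # set of (i, j) index pairs forming the reconstructed polyline
    ```

    Because every retained edge is a DT edge shared by its two endpoints' shortlists, the union naturally chains into polylines, which is why the method needs no stitching post-process.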