EG 2025 - Full Papers - CGF 44-Issue 2
Browsing EG 2025 - Full Papers - CGF 44-Issue 2 by Issue Date
Now showing 1 - 20 of 75
Item Learning Image Fractals Using Chaotic Differentiable Point Splatting (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Djeacoumar, Adarsh; Mujkanovic, Felix; Seidel, Hans-Peter; Leimkühler, Thomas; Bousseau, Adrien; Day, Angela
Fractal geometry, defined by self-similar patterns across scales, is crucial for understanding natural structures. This work addresses the fractal inverse problem, which involves extracting fractal codes from images to explain these patterns and synthesize them at arbitrarily finer scales. We introduce a novel algorithm that optimizes Iterated Function System (IFS) parameters using a custom fractal generator combined with differentiable point splatting. By integrating both stochastic and gradient-based optimization techniques, our approach effectively navigates the complex energy landscapes typical of fractal inversion, ensuring robust performance and the ability to escape local minima. We demonstrate the method's effectiveness through comparisons with various fractal inversion techniques, highlighting its ability to recover high-quality fractal codes and perform extensive zoom-ins to reveal intricate patterns from just a single image.

Item Multi-Modal Instrument Performances (MMIP): A Musical Database (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Kyriakou, Theodoros; Aristidou, Andreas; Charalambous, Panayiotis; Bousseau, Adrien; Day, Angela
Musical instrument performances are multimodal creative art forms that integrate audiovisual elements, resulting from musicians' interactions with instruments through body movements, finger actions, and facial expressions. Digitizing such performances for archiving, streaming, analysis, or synthesis requires capturing every element that shapes the overall experience, which is crucial for preserving the performance's essence.
In this work, following current trends in large-scale dataset development for deep learning analysis and generative models, we introduce the Multi-Modal Instrument Performances (MMIP) database (https://mmip.cs.ucy.ac.cy). This is the first dataset to incorporate synchronized high-quality 3D motion capture data for the body, fingers, facial expressions, and instruments, along with audio, multi-angle videos, and MIDI data. The database currently includes 3.5 hours of performances featuring three instruments: guitar, piano, and drums. Additionally, we discuss the challenges of acquiring these multi-modal data, detailing our approach to data collection, signal synchronization, annotation, and metadata management. Our data formats align with industry standards for ease of use, and we have developed an open-access online repository that offers a user-friendly environment for data exploration, supporting data organization, search capabilities, and custom visualization tools. Notable features include a MIDI-to-instrument animation project for visualizing the instruments and a script for playing back FBX files with synchronized audio in a web environment.

Item Learning Metric Fields for Fast Low-Distortion Mesh Parameterizations (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Fargion, Guy; Weber, Ofir; Bousseau, Adrien; Day, Angela
We present a fast and robust method for computing an injective parameterization with low isometric distortion for disk-like triangular meshes. Harmonic function-based methods, with their rich mathematical foundation, are widely used. Harmonic maps are particularly valuable for ensuring injectivity under certain boundary conditions. In addition, they offer computational efficiency by forming a linear subspace [FW22]. However, this restricted subspace often leads to significant isometric distortion, especially for highly curved surfaces.
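To make the harmonic-map machinery referenced above concrete, here is a minimal Tutte-style sketch: boundary vertices of a toy triangulated disk are pinned to a convex polygon, and interior positions come from a linear Laplacian solve. The tiny mesh, the uniform ("Tutte") weights, and all names are illustrative stand-ins, not the paper's metric-field construction.

```python
import numpy as np

# Toy disk mesh: 4 boundary vertices pinned to a convex polygon (a square),
# one interior vertex (index 4) connected to all boundary vertices.
boundary = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (1.0, 1.0), 3: (0.0, 1.0)}
edges = [(4, 0), (4, 1), (4, 2), (4, 3), (0, 1), (1, 2), (2, 3), (3, 0)]
n = 5

# Graph Laplacian with uniform weights (a cotangent or metric-aware
# Laplacian would change only these entries).
L = np.zeros((n, n))
for i, j in edges:
    L[i, j] -= 1.0
    L[j, i] -= 1.0
    L[i, i] += 1.0
    L[j, j] += 1.0

# Harmonic map: interior rows of L @ uv = 0, boundary rows pinned.
interior = [v for v in range(n) if v not in boundary]
uv = np.zeros((n, 2))
for v, pos in boundary.items():
    uv[v] = pos
A = L[np.ix_(interior, interior)]
b = -L[np.ix_(interior, list(boundary))] @ uv[list(boundary)]
uv[interior] = np.linalg.solve(A, b)

print(uv[4])  # the interior vertex lands at the average of its neighbors
```

With a convex boundary, such harmonic maps are injective (Tutte's theorem), which is the robustness property the paper builds on.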
Conversely, methods that operate in the full space of piecewise linear maps [SPSH∗17] achieve lower isometric distortion, but at a higher computational cost. Aigerman et al. [AGK∗22] pioneered a parameterization method that uses deep neural networks to predict the Jacobians of the map at mesh triangles and integrates them into an explicit map by solving a Poisson equation. However, this approach often results in significant Poisson reconstruction errors due to the inability to ensure the integrability of the predicted neural Jacobian field, leading to unbounded distortion and lack of local injectivity. We propose a hybrid method that combines the speed and robustness of harmonic maps with the generality of deep neural networks to produce injective maps with low isometric distortion much faster than state-of-the-art methods. The core concept is simple but powerful. Instead of learning Jacobian fields, we learn metric tensor fields over the input mesh, resulting in a customized Laplacian matrix that defines a harmonic map in a modified metric [WGS23]. Our approach ensures injectivity, offers great computational efficiency, and produces significantly lower isometric distortion compared to straightforward harmonic maps.

Item "Wild West" of Evaluating Speech-Driven 3D Facial Animation Synthesis: A Benchmark Study (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Haque, Kazi Injamamul; Pavlou, Alkiviadis; Yumak, Zerrin; Bousseau, Adrien; Day, Angela
Recent advancements in the field of audio-driven 3D facial animation have accelerated rapidly, with numerous papers being published in a short span of time. This surge in research has garnered significant attention from both academia and industry with its potential applications to digital humans. Various approaches, both deterministic and non-deterministic, have been explored based on foundational advancements in deep learning algorithms.
However, there remains no consensus among researchers on standardized methods for evaluating these techniques. Additionally, rather than converging on a common set of datasets and objective metrics suited for specific methods, recent works exhibit considerable variation in experimental setups. This inconsistency complicates the research landscape, making it difficult to establish a streamlined evaluation process and rendering many cross-paper comparisons challenging. Moreover, the common practice of A/B testing in perceptual studies focuses on only two common metrics and is not sufficient for non-deterministic and emotion-enabled approaches. The lack of correlation between subjective and objective metrics points to a need for critical analysis in this space. In this study, we address these issues by benchmarking state-of-the-art deterministic and non-deterministic models, utilizing a consistent experimental setup across a carefully curated set of objective metrics and datasets. We also conduct a perceptual user study to assess whether subjective perceptual metrics align with the objective metrics. Our findings indicate that model rankings do not necessarily generalize across datasets, and subjective metric ratings are not always consistent with their corresponding objective metrics. The supplementary video, edited code scripts for training on different datasets, and documentation related to this benchmark study are made publicly available at https://galib360.github.io/face-benchmark-project/.

Item StyleBlend: Enhancing Style-Specific Content Creation in Text-to-Image Diffusion Models (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Chen, Zichong; Wang, Shijin; Zhou, Yang; Bousseau, Adrien; Day, Angela
Synthesizing visually impressive images that seamlessly align with both text prompts and specific artistic styles remains a significant challenge in Text-to-Image (T2I) diffusion models.
This paper introduces StyleBlend, a method designed to learn and apply style representations from a limited set of reference images, enabling synthesis of content that is both text-aligned and stylistically coherent. Our approach uniquely decomposes style into two components, composition and texture, each learned through different strategies. We then leverage two synthesis branches, each focusing on a corresponding style component, to facilitate effective style blending through shared features without affecting content generation. StyleBlend addresses the common issues of text misalignment and weak style representation that previous methods have struggled with. Extensive qualitative and quantitative comparisons demonstrate the superiority of our approach.

Item Neural Geometry Processing via Spherical Neural Surfaces (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Williamson, Romy; Mitra, Niloy J.; Bousseau, Adrien; Day, Angela
Neural surfaces (e.g., neural map encoding, deep implicit, and neural radiance fields) have recently gained popularity because of their generic structure (e.g., multi-layer perceptron) and easy integration with modern learning-based setups. Traditionally, we have a rich toolbox of geometry processing algorithms designed for polygonal meshes to analyze and operate on surface geometry. Without an analogous toolbox, neural representations are typically discretized and converted into a mesh before applying any geometry processing algorithm. This is unsatisfactory and, as we demonstrate, unnecessary. In this work, we propose a spherical neural surface representation for genus-0 surfaces and demonstrate how to compute core geometric operators directly on this representation. Namely, we estimate surface normals and the first and second fundamental forms of the surface, and compute the surface gradient, surface divergence, and Laplace-Beltrami operator on scalar/vector fields defined on the surface.
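As a toy illustration of computing differential quantities directly from a parametric sphere-to-surface map, the sketch below estimates surface normals by finite differences on an analytic map (the paper uses neural maps and automatic differentiation; the map, step size, and names here are illustrative assumptions):

```python
import numpy as np

def surface(theta, phi, r=2.0):
    """Toy genus-0 stand-in for a neural surface: a sphere of radius r
    parameterized over the unit sphere's angles (theta, phi)."""
    return r * np.array([np.sin(theta) * np.cos(phi),
                         np.sin(theta) * np.sin(phi),
                         np.cos(theta)])

def normal(theta, phi, h=1e-5):
    # Tangent vectors via central finite differences of the parameterization;
    # autodiff on a neural map would replace these two lines.
    t_theta = (surface(theta + h, phi) - surface(theta - h, phi)) / (2 * h)
    t_phi = (surface(theta, phi + h) - surface(theta, phi - h)) / (2 * h)
    nrm = np.cross(t_theta, t_phi)
    return nrm / np.linalg.norm(nrm)

n_est = normal(0.7, 1.3)
p = surface(0.7, 1.3)
print(n_est)  # for a sphere, the unit normal is parallel to the position direction
```

The same tangent vectors also yield the first fundamental form (their pairwise dot products), which is the entry point to the other operators listed above.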
Our representation is fully seamless, overcoming a key limitation of similar explicit representations such as Neural Surface Maps [MAKM21]. These operators, in turn, enable geometry processing directly on the neural representations without any unnecessary meshing. We demonstrate illustrative applications in (neural) spectral analysis, heat flow and mean curvature flow, and evaluate robustness to isometric shape variations. We propose theoretical formulations and validate their numerical estimates against analytical estimates, mesh-based baselines, and neural alternatives, where available. By systematically linking neural surface representations with classical geometry processing algorithms, we believe this work can become a key ingredient in enabling neural geometry processing. Code is available via the project webpage.

Item Lipschitz Pruning: Hierarchical Simplification of Primitive-Based SDFs (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Barbier, Wilhem; Sanchez, Mathieu; Paris, Axel; Michel, Élie; Lambert, Thibaud; Boubekeur, Tamy; Paulin, Mathias; Thonat, Theo; Bousseau, Adrien; Day, Angela
Rendering tree-based analytical Signed Distance Fields (SDFs) through sphere tracing often requires evaluating many primitives per tracing step, for many steps per pixel of the final image. This cost quickly becomes prohibitive as the number of primitives that constitute the SDF grows. In this paper, we alleviate this cost by computing local pruned trees that are equivalent to the full tree within their region of space while being much faster to evaluate. We introduce an efficient hierarchical tree pruning method based on the Lipschitz property of SDFs, which is compatible with hard and smooth CSG operators. We propose a GPU implementation that enables real-time sphere tracing of complex SDFs composed of thousands of primitives with dynamic animation.
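For context, the sphere-tracing loop that such pruning accelerates can be sketched as follows. The single-sphere scene and all names are illustrative stand-ins; in the paper, `scene_sdf` would evaluate a whole primitive tree (or its local pruned version).

```python
import numpy as np

def scene_sdf(p):
    """Toy analytic SDF: a single unit sphere at the origin.
    A primitive-based SDF would evaluate a full CSG tree here."""
    return np.linalg.norm(p) - 1.0

def sphere_trace(origin, direction, max_steps=128, eps=1e-4, t_max=100.0):
    """March along the ray, stepping by the SDF value. This step is safe
    because a signed distance field is 1-Lipschitz: the surface cannot be
    closer than the distance the field reports."""
    direction = direction / np.linalg.norm(direction)
    t = 0.0
    for _ in range(max_steps):
        d = scene_sdf(origin + t * direction)
        if d < eps:
            return t  # hit
        t += d
        if t > t_max:
            break
    return None  # miss

t_hit = sphere_trace(np.array([0.0, 0.0, -3.0]), np.array([0.0, 0.0, 1.0]))
print(t_hit)  # ~2.0: the ray starts 3 units from the center of a unit sphere
```

Every iteration calls `scene_sdf`, which is why reducing the per-evaluation cost of the tree pays off so directly.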
Our pruning technique provides significant speedups for SDF evaluation in general, which we demonstrate on sphere tracing tasks but which could also lead to significant improvements for SDF discretization or polygonization.

Item Deformed Tiling and Blending: Application to the Correction of Distortions Implied by Texture Mapping (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Wendling, Quentin; Ravaglia, Joris; Sauvage, Basile; Bousseau, Adrien; Day, Angela
The prevailing model in virtual 3D scenes is a 3D surface onto which a texture is mapped through a parameterization from the texture plane. We focus on accounting for the parameterization during the texture creation process, to control the deformations and remove the cuts induced by the mapping. We rely on tiling and blending, a real-time and parallel algorithm that generates an arbitrarily large texture from a small input example. Our first contribution is to enhance tiling and blending with a deformation field, which controls smooth spatial variations in the texture plane. Our second contribution is to derive, from a parameterized triangle mesh, a deformation field that compensates for texture distortions and controls the texture orientation. Our third contribution is a technique to enforce texture continuity across the cuts, thanks to a proper tile selection. This opens the door to interactive sessions with artistic control, and real-time rendering with improved visual quality.

Item D-NPC: Dynamic Neural Point Clouds for Non-Rigid View Synthesis from Monocular Video (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Kappel, Moritz; Hahlbohm, Florian; Scholz, Timon; Castillo, Susana; Theobalt, Christian; Eisemann, Martin; Golyanik, Vladislav; Magnor, Marcus; Bousseau, Adrien; Day, Angela
Dynamic reconstruction and spatiotemporal novel-view synthesis of non-rigidly deforming scenes have recently gained increased attention.
While existing work achieves impressive quality and performance on multi-view or teleporting camera setups, most methods fail to efficiently and faithfully recover motion and appearance from casual monocular captures. This paper contributes to the field by introducing a new method for dynamic novel view synthesis from monocular video, such as casual smartphone captures. Our approach represents the scene as a dynamic neural point cloud, an implicit time-conditioned point distribution that encodes local geometry and appearance in separate hash-encoded neural feature grids for static and dynamic regions. By sampling a discrete point cloud from our model, we can efficiently render high-quality novel views using a fast differentiable rasterizer and neural rendering network. Similar to recent work, we leverage advances in neural scene analysis by incorporating data-driven priors like monocular depth estimation and object segmentation to resolve motion and depth ambiguities originating from the monocular captures. In addition to guiding the optimization process, we show that these priors can be exploited to explicitly initialize our scene representation to drastically improve optimization speed and final image quality. As evidenced by our experimental evaluation, our dynamic point cloud model not only enables fast optimization and real-time frame rates for interactive applications, but also achieves competitive image quality on monocular benchmark sequences. Our code and data are available online at https://moritzkappel.github.io/projects/dnpc/.

Item Versatile Physics-based Character Control with Hybrid Latent Representation (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Bae, Jinseok; Won, Jungdam; Lim, Donggeun; Hwang, Inwoo; Kim, Young Min; Bousseau, Adrien; Day, Angela
We present a versatile latent representation that enables physically simulated characters to efficiently utilize motion priors.
To build a powerful motion embedding that is shared across multiple tasks, the physics controller should employ a rich latent space that is easily explored and capable of generating high-quality motion. We propose integrating continuous and discrete latent representations to build a versatile motion prior that can be adapted to a wide range of challenging control tasks. Specifically, we build a discrete latent model to capture a distinctive posterior distribution without collapse, and simultaneously augment the sampled vector with continuous residuals to generate high-quality, smooth motion without jittering. We further incorporate Residual Vector Quantization, which not only maximizes the capacity of the discrete motion prior, but also efficiently abstracts the action space during the task learning phase. We demonstrate that our agent can produce diverse yet smooth motions simply by traversing the learned motion prior through unconditional motion generation. Furthermore, our model robustly satisfies sparse goal conditions with highly expressive natural motions, including head-mounted device tracking and motion in-betweening at irregular intervals, which could not be achieved with existing latent representations.

Item Does 3D Gaussian Splatting Need Accurate Volumetric Rendering? (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Celarek, Adam; Kopanas, Georgios; Drettakis, George; Wimmer, Michael; Kerbl, Bernhard; Bousseau, Adrien; Day, Angela
Since its introduction, 3D Gaussian Splatting (3DGS) has become an important reference method for learning 3D representations of a captured scene, allowing real-time novel-view synthesis with high visual quality and fast training times. Neural Radiance Fields (NeRFs), which preceded 3DGS, are based on a principled ray-marching approach for volumetric rendering.
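The front-to-back alpha compositing shared by NeRF-style ray marching and 3DGS rasterization can be sketched as follows (the sample opacities and colors below are made up purely for illustration):

```python
import numpy as np

def composite(alphas, colors):
    """Front-to-back alpha compositing: each sample along the ray contributes
    its color weighted by its alpha and by the transmittance accumulated
    from all samples in front of it."""
    transmittance = 1.0
    out = np.zeros(3)
    for a, c in zip(alphas, colors):
        out += transmittance * a * np.asarray(c, dtype=float)
        transmittance *= 1.0 - a
    return out, transmittance

# Two samples along one ray: a half-transparent red blob in front of an
# opaque blue one.
color, T = composite([0.5, 1.0], [(1.0, 0.0, 0.0), (0.0, 0.0, 1.0)])
print(color, T)  # [0.5, 0.0, 0.5], remaining transmittance 0.0
```

The approximations the paper analyzes concern how the per-sample alphas are obtained from the Gaussians (and in what order), not this compositing identity itself.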
In contrast, while sharing a similar image formation model with NeRF, 3DGS uses a hybrid rendering solution that builds on the strengths of volume rendering and primitive rasterization. A crucial benefit of 3DGS is its performance, achieved through a set of approximations, in many cases relative to volumetric rendering theory. A naturally arising question is whether replacing these approximations with more principled volumetric rendering solutions can improve the quality of 3DGS. In this paper, we present an in-depth analysis of the various approximations and assumptions used by the original 3DGS solution. We demonstrate that, while more accurate volumetric rendering can help for low numbers of primitives, the power of efficient optimization and the large number of Gaussians allows 3DGS to outperform volumetric rendering despite its approximations.

Item VRSurf: Surface Creation from Sparse, Unoriented 3D Strokes (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Sureshkumar, Anandhu; Parakkat, Amal Dev; Bonneau, Georges-Pierre; Hahmann, Stefanie; Cani, Marie-Paule; Bousseau, Adrien; Day, Angela
Although intuitive, sketching a closed 3D shape directly in an immersive environment results in an unordered set of arbitrary strokes, which can be difficult to assemble into a closed surface. We tackle this challenge by introducing VRSurf, a surfacing method inspired by a balloon inflation metaphor: seeded in the sparse scaffold formed by the strokes, a smooth, closed surface is inflated to progressively interpolate the input strokes, sampled into lists of points. These are treated in a divide-and-conquer manner, which allows for automatically triggering additional balloon inflation followed by fusion if the current inflation stops due to a detected concavity. While the input strokes are intended to belong to the same smooth 3D shape, our method is robust to coarse VR input and does not require strokes to be aligned.
We simply avoid intersecting strokes that might give an inconsistent surface position due to the roughness of the VR drawing. Moreover, no additional topological information is required, and all the user needs to do is specify the initial seeding location for the first balloon. The results show that VRSurf can efficiently generate smooth surfaces that interpolate sparse sets of unoriented strokes. Validation includes a side-by-side comparison with other reconstruction methods on the same input VR sketch. We also check that our solution matches the user's intent by applying it to strokes that were sketched on an existing 3D shape and comparing the result to the original shape.

Item ReConForM: Real-time Contact-aware Motion Retargeting for more Diverse Character Morphologies (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Cheynel, Théo; Rossi, Thomas; Bellot-Gurlet, Baptiste; Rohmer, Damien; Cani, Marie-Paule; Bousseau, Adrien; Day, Angela
Preserving semantics, in particular in terms of contacts, is a key challenge when retargeting motion between characters of different morphologies. Our solution relies on a low-dimensional embedding of the character's mesh, based on rigged key vertices that are automatically transferred from the source to the target. Motion descriptors are extracted from the trajectories of these key vertices, providing an embedding that contains combined semantic information about both shape and pose. A novel, adaptive algorithm is then used to automatically select and weight the most relevant features over time, enabling us to efficiently optimize the target motion until it conforms to these constraints, so as to preserve the semantics of the source motion. Our solution allows extensions to several novel use cases where morphology and mesh contacts were previously overlooked, such as multi-character retargeting and motion transfer on uneven terrains.
As our results show, our method achieves real-time retargeting onto a wide variety of characters. Extensive experiments and comparisons with state-of-the-art methods using several relevant metrics demonstrate improved results, both in terms of motion smoothness and contact accuracy.

Item Neural Film Grain Rendering (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Lesné, Gwilherm; Gousseau, Yann; Ladjal, Saïd; Newson, Alasdair; Bousseau, Adrien; Day, Angela
Film grain refers to the specific texture of film-acquired images, due to the physical nature of photographic film. Being a visual signature of such images, there is strong interest in the film industry in rendering these textures for digital images. Some previous works closely mimic the physics of film and produce high-quality results, but are computationally expensive. We propose a method based on a lightweight neural network and a texture-aware loss function, achieving realistic results with very low complexity, even for large grains and high resolutions. We evaluate our algorithm both quantitatively and qualitatively with respect to previous work.

Item Rest Shape Optimization for Sag-Free Discrete Elastic Rods (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Takahashi, Tetsuya; Batty, Christopher; Bousseau, Adrien; Day, Angela
We propose a new rest shape optimization framework to achieve sag-free simulations of discrete elastic rods. To optimize rest shape parameters, we formulate a minimization problem based on the kinetic energy with a regularizer, while imposing box constraints on these parameters to ensure the system's stability. Our method solves the resulting constrained minimization problem via the Gauss-Newton algorithm augmented with penalty methods.
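A bare-bones Gauss-Newton iteration of the kind used for such minimization can be sketched on a toy nonlinear least-squares problem (the residual and data below are illustrative; the paper's kinetic-energy objective, box constraints, and penalty terms are not reproduced):

```python
import numpy as np

# Toy nonlinear least squares: fit y = exp(a * x) to synthetic data.
x = np.linspace(0.0, 2.0, 20)
y = np.exp(0.5 * x)  # ground-truth parameter a = 0.5

a = 0.0  # initial guess
for _ in range(20):
    r = np.exp(a * x) - y   # residual vector
    J = x * np.exp(a * x)   # Jacobian of the residuals w.r.t. a
    # Gauss-Newton step: solve the normal equations (J^T J) da = -J^T r.
    da = -(J @ r) / (J @ J)
    a += da
    if abs(da) < 1e-12:
        break

print(a)  # converges to ~0.5
```

Box constraints and penalties would modify the step (clamping `a` and adding penalty residuals to `r`), but the solve-and-update core stays the same.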
We demonstrate that the optimized rest shape parameters enable discrete elastic rods to achieve static equilibrium for a wide range of strand geometries and material parameters.

Item Adaptive Multi-view Radiance Caching for Heterogeneous Participating Media (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Stadlbauer, Pascal; Tatzgern, Wolfgang; Mueller, Joerg H.; Winter, Martin; Stojanovic, Robert; Weinrauch, Alexander; Steinberger, Markus; Bousseau, Adrien; Day, Angela
Achieving lifelike atmospheric effects, such as fog, is essential in creating immersive environments and poses a formidable challenge in real-time rendering. Highly realistic rendering of complex lighting interacting with dynamic fog can be very resource-intensive, due to light bouncing through complex participating media multiple times. We propose an approach that uses a multi-layered spherical harmonics probe grid to share computations temporally. In addition, this world-space storage enables the sharing of radiance data between multiple viewers. In the context of cloud rendering, this means faster rendering and a significant enhancement in overall rendering quality with efficient resource utilization.

Item Towards Scaling-Invariant Projections for Data Visualization (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Dierkes, Joel; Stelter, Daniel; Rössl, Christian; Theisel, Holger; Bousseau, Adrien; Day, Angela
Finding projections of multidimensional data domains to the 2D screen space is a well-known problem. Multidimensional data often comes with the property that the dimensions are measured in different physical units, which renders the ratio between dimensions, i.e., their scale, arbitrary. The result of common projections, like PCA, t-SNE, or MDS, depends on this ratio, i.e., these projections are not invariant to scaling. This results in an undesired subjective view of the data and, thus, of their projection.
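That scaling dependence is easy to demonstrate: re-expressing one dimension in different units changes the principal direction PCA reports. The synthetic data below is chosen only to make the effect obvious:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic 2D data: a Gaussian cloud stretched along the x-axis.
data = rng.normal(size=(500, 2)) * np.array([3.0, 1.0])

def first_pc(X):
    """First principal component: eigenvector of the covariance matrix
    with the largest eigenvalue (np.linalg.eigh sorts ascending)."""
    X = X - X.mean(axis=0)
    _, vecs = np.linalg.eigh(np.cov(X.T))
    return vecs[:, -1]

pc_original = first_pc(data)
# Re-express the second dimension in different units (e.g., mm instead of m).
pc_rescaled = first_pc(data * np.array([1.0, 1000.0]))

print(np.abs(pc_original), np.abs(pc_rescaled))
# The dominant direction flips from the x-axis to the y-axis.
```

The same data, viewed through two unit choices, yields two entirely different "principal" structures, which is exactly the arbitrariness the paper sets out to remove.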
Simple solutions like normalization of each dimension are widely used, but do not always give high-quality results. We propose to visually analyze the space of all scalings and to find optimal scalings w.r.t. the quality of the visualization. For this, we evaluate different quality criteria on scatter plots. Given a quality criterion, our approach finds scalings that yield good visualizations with little to no user input using numerical optimization. Simultaneously, our method results in a scaling-invariant projection, providing an objective view of the projected data. We show for several examples that such an optimal scaling can significantly improve the visualization quality.

Item S-ACORD: Spectral Analysis of COral Reef Deformation (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Alon-Borissiouk, Naama; Yuval, Matan; Treibitz, Tali; Ben-Chen, Mirela; Bousseau, Adrien; Day, Angela
We propose an efficient pipeline to register, detect, and analyze changes in 3D models of coral reefs captured over time. Corals have complex structures with intricate geometric features at multiple scales. 3D reconstructions of corals (e.g., using photogrammetry) are represented by dense triangle meshes with millions of vertices. Hence, identifying correspondences quickly using conventional state-of-the-art algorithms is challenging. To address this gap, we employ the Globally Optimal Iterative Closest Point (GO-ICP) algorithm to compute correspondences, and a fast approximation algorithm (FastSpectrum) to extract the eigenvectors of the Laplace-Beltrami operator for creating functional maps. Finally, by visualizing the distortion of these maps, we identify changes in the coral reefs over time. Our approach is fully automatic, does not require user-specified landmarks or an initial map, and surpasses competing shape correspondence methods on coral reef models.
Furthermore, our analysis has detected the changes manually marked by humans, as well as additional changes at a smaller scale that were missed during manual inspection. We have additionally used our system to analyze a coral reef model that was too extensive for manual analysis, and validated that the changes identified by the system were correct.

Item SOBB: Skewed Oriented Bounding Boxes for Ray Tracing (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Kácerik, Martin; Bittner, Jirí; Bousseau, Adrien; Day, Angela
We propose skewed oriented bounding boxes (SOBB) as a novel bounding primitive for accelerating the calculation of ray-scene intersections. SOBBs have the same memory footprint as the well-known oriented bounding boxes (OBB) and can be used with a similar ray intersection algorithm. We propose an efficient algorithm for constructing a BVH with SOBBs, using a transformation from a standard BVH built for axis-aligned bounding boxes (AABB). We use discrete orientation polytopes as a temporary bounding representation to find tightly fitting SOBBs. Additionally, we propose a compression scheme for SOBBs that makes their memory requirements comparable to those of AABBs. For secondary rays, the SOBB BVH provides a ray tracing speedup of 1.0-11.0x over the AABB BVH, and it is 1.1x faster than the OBB BVH on average. The transformation of an AABB BVH to a SOBB BVH is, on average, 2.6x faster than the ditetrahedron-based AABB BVH to OBB BVH transformation.

Item Material Transforms from Disentangled NeRF Representations (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Lopes, Ivan; Lalonde, Jean-François; Charette, Raoul de; Bousseau, Adrien; Day, Angela
In this paper, we first propose a novel method for transferring material transformations across different scenes.
Building on disentangled Neural Radiance Field (NeRF) representations, our approach learns to map Bidirectional Reflectance Distribution Functions (BRDF) from pairs of scenes observed in varying conditions, such as dry and wet. The learned transformations can then be applied to unseen scenes with similar materials, effectively rendering the learned transformation at an arbitrary level of intensity. Extensive experiments on synthetic scenes and real-world objects validate the effectiveness of our approach, showing that it can learn various transformations such as wetness, painting, coating, etc. Our results highlight not only the versatility of our method but also its potential for practical applications in computer graphics. We publish our method implementation, along with our synthetic/real datasets, at https://github.com/astra-vision/BRDFTransform.
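As background for the BRDF mapping described above, the simplest BRDF and its defining reflectance integral can be checked numerically. This is a Lambertian model only; the learned, spatially varying transformations in the paper are far more general.

```python
import numpy as np

def lambertian_brdf(albedo):
    """A perfectly diffuse BRDF is constant over directions: f = albedo / pi."""
    return albedo / np.pi

# Energy-conservation check: integrating f * cos(theta) over the outgoing
# hemisphere (solid-angle measure sin(theta) dtheta dphi) must give the albedo.
albedo = 0.8
f = lambertian_brdf(albedo)

n = 4096
theta = (np.arange(n) + 0.5) * (np.pi / 2) / n   # midpoint rule over [0, pi/2]
dtheta = (np.pi / 2) / n
integral = 2 * np.pi * np.sum(f * np.cos(theta) * np.sin(theta)) * dtheta

print(round(integral, 6))  # ~0.8
```

A wetness- or coating-style transform changes `f` per point and per direction; the hemispherical integral above is the quantity that such transforms effectively re-shape.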