44-Issue 2
Browsing 44-Issue 2 by Issue Date, showing items 1-20 of 75.
Item: SOBB: Skewed Oriented Bounding Boxes for Ray Tracing (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Kácerik, Martin; Bittner, Jirí; Bousseau, Adrien; Day, Angela
We propose skewed oriented bounding boxes (SOBB) as a novel bounding primitive for accelerating the calculation of ray-scene intersections. SOBBs have the same memory footprint as the well-known oriented bounding boxes (OBB) and can be used with a similar ray intersection algorithm. We propose an efficient algorithm for constructing a BVH with SOBBs, using a transformation from a standard BVH built for axis-aligned bounding boxes (AABB). We use discrete orientation polytopes as a temporary bounding representation to find tightly fitting SOBBs. Additionally, we propose a compression scheme for SOBBs that makes their memory requirements comparable to those of AABBs. For secondary rays, the SOBB BVH provides a ray tracing speedup of 1.0-11.0x over the AABB BVH, and it is 1.1x faster than the OBB BVH on average. The transformation of an AABB BVH to an SOBB BVH is, on average, 2.6x faster than the ditetrahedron-based AABB BVH to OBB BVH transformation.
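The abstract notes that SOBBs reuse an OBB-style intersection test. As a rough illustration of that family of tests (not the paper's algorithm or data layout), the sketch below intersects a ray with a general parallelepiped, a box whose three edge vectors may be mutually skewed, by mapping the ray into the box's parameter space and running the standard slab test against the unit cube. All names and the box encoding are assumptions made for illustration.

```python
import numpy as np

def ray_vs_skewed_box(o, d, corner, edges, t_max=np.inf):
    """Slab test against the parallelepiped {corner + edges @ u, u in [0,1]^3}.
    `edges` holds the three (possibly non-orthogonal, i.e. skewed) edge
    vectors as columns; mapping the ray into this parameter space reduces
    the test to a unit-cube slab test, just as with OBBs."""
    inv = np.linalg.inv(edges)
    p, q = inv @ (o - corner), inv @ d
    t0, t1 = 0.0, t_max
    for i in range(3):
        if abs(q[i]) < 1e-12:                    # ray parallel to this slab pair
            if not (0.0 <= p[i] <= 1.0):
                return None
        else:
            ta, tb = -p[i] / q[i], (1.0 - p[i]) / q[i]
            t0, t1 = max(t0, min(ta, tb)), min(t1, max(ta, tb))
            if t0 > t1:
                return None                      # slab intervals disjoint: miss
    return t0                                    # entry distance along the ray

# Example: a unit box skewed along x, hit by a ray travelling down the z-axis.
edges = np.column_stack([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.5, 0.0, 1.0]])
t = ray_vs_skewed_box(np.array([0.6, 0.5, 5.0]), np.array([0.0, 0.0, -1.0]),
                      np.zeros(3), edges)
print(t)  # 4.0: the ray enters through the top face
```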
Item: Infusion: Internal Diffusion for Inpainting of Dynamic Textures and Complex Motion (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Cherel, Nicolas; Almansa, Andrés; Gousseau, Yann; Newson, Alasdair; Bousseau, Adrien; Day, Angela
Video inpainting is the task of filling a region in a video in a visually convincing manner. It is very challenging due to the high dimensionality of the data and the temporal consistency required for obtaining convincing results. Recently, diffusion models have shown impressive results in modeling complex data distributions, including images and videos. Such models nonetheless remain very expensive to train and to perform inference with, which strongly reduces their applicability to videos and leads to unreasonable computational loads. We show that, in the case of video inpainting, thanks to the highly self-similar nature of videos, the training data of a diffusion model can be restricted to the input video and still produce very satisfying results. With this internal learning approach, where the training data is limited to a single video, our lightweight models perform very well with only half a million parameters, in contrast to the very large networks with billions of parameters typically found in the literature. We also introduce a new method for efficient training and inference of diffusion models in the context of internal learning, by splitting the diffusion process into different learning intervals corresponding to different noise levels of the diffusion process. We show qualitative and quantitative results demonstrating that our method reaches or exceeds state-of-the-art performance in the case of dynamic textures and complex dynamic backgrounds.
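The two ideas highlighted here, training only on the input video and splitting the diffusion process into noise-level intervals, can be sketched in a few lines. The toy loop below is a minimal illustration under stated assumptions, not the authors' architecture: `TinyDenoiser`, the tensor sizes, and the per-interval model count are all placeholders, and timestep conditioning is omitted.

```python
import torch
import torch.nn as nn

T, K = 1000, 4                         # diffusion steps, number of noise intervals
alpha_bar = torch.cumprod(1 - torch.linspace(1e-4, 0.02, T), dim=0)

class TinyDenoiser(nn.Module):         # stand-in for the paper's lightweight nets
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Conv3d(3, 32, 3, padding=1), nn.SiLU(),
                                 nn.Conv3d(32, 3, 3, padding=1))
    def forward(self, x):              # (timestep conditioning omitted for brevity)
        return self.net(x)

video = torch.rand(1, 3, 8, 32, 32)    # the single training video (toy size)
for k in range(K):                     # one small model per noise-level interval
    model, lo, hi = TinyDenoiser(), k * T // K, (k + 1) * T // K
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)
    for _ in range(100):               # internal learning: data comes only from `video`
        t = torch.randint(lo, hi, (1,))
        ab = alpha_bar[t].view(1, 1, 1, 1, 1)
        eps = torch.randn_like(video)
        noisy = ab.sqrt() * video + (1 - ab).sqrt() * eps
        loss = ((model(noisy) - eps) ** 2).mean()
        opt.zero_grad(); loss.backward(); opt.step()
```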
Item: Learning Fast 3D Gaussian Splatting Rendering using Continuous Level of Detail (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Milef, Nicholas; Seyb, Dario; Keeler, Todd; Nguyen-Phuoc, Thu; Bozic, Aljaz; Kondguli, Sushant; Marshall, Carl; Bousseau, Adrien; Day, Angela
3D Gaussian splatting (3DGS) has shown potential for rendering photorealistic 3D scenes in real-time. Unfortunately, rendering these scenes on less powerful hardware is still a challenge, especially with high-resolution displays. We introduce a continuous level of detail (CLOD) algorithm and demonstrate how our method can improve performance while preserving as much quality as possible. Our approach learns to order splats based on importance and optimizes them such that a representative and realistic scene can be rendered for an arbitrary splat count. Our method does not require any additional memory or rendering overhead and works with existing 3DGS renderers. We also demonstrate the flexibility of our CLOD method by extending it with distance-based LOD selection, foveated rendering, and budget-based rendering.
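The key property described, that any prefix of an importance-ordered splat list is a valid scene, makes LOD selection trivial at render time. The sketch below illustrates that selection step only; the importance scores, the distance schedule, and all names are assumptions, and the actual learned ordering and optimization are in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
splat_importance = rng.random(100_000)      # stand-in for the learned importance
order = np.argsort(-splat_importance)       # most important splats first

def lod_prefix(order, fraction):
    """Any prefix of the importance-sorted list renders a coherent scene,
    so continuous LOD is just choosing how long a prefix to draw."""
    return order[: max(1, int(len(order) * fraction))]

def fraction_for_distance(d, near=2.0, far=50.0, floor=0.1):
    """A simple distance-based LOD schedule (an assumption, not the paper's)."""
    return float(np.clip(1.0 - (d - near) / (far - near), floor, 1.0))

subset = lod_prefix(order, fraction_for_distance(d=30.0))
print(f"rendering {len(subset)} of {len(order)} splats")
```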
Item: Towards Scaling-Invariant Projections for Data Visualization (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Dierkes, Joel; Stelter, Daniel; Rössl, Christian; Theisel, Holger; Bousseau, Adrien; Day, Angela
Finding projections of multidimensional data domains to the 2D screen space is a well-known problem. Multidimensional data often comes with the property that the dimensions are measured in different physical units, which renders the ratio between dimensions, i.e., their scale, arbitrary. The result of common projections, like PCA, t-SNE, or MDS, depends on this ratio, i.e., these projections are variant to scaling. This results in an undesired subjective view of the data and, thus, of their projection. Simple solutions like normalization of each dimension are widely used but do not always give high-quality results. We propose to visually analyze the space of all scalings and to find optimal scalings w.r.t. the quality of the visualization. For this, we evaluate different quality criteria on scatter plots. Given a quality criterion, our approach finds scalings that yield good visualizations with little to no user input using numerical optimization. Simultaneously, our method results in a scaling-invariant projection, proposing an objective view of the projected data. We show for several examples that such an optimal scaling can significantly improve the visualization quality.
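The core loop described, numerically optimizing per-dimension scale factors against a scatter-plot quality criterion, is easy to demonstrate. Below is a minimal sketch with a toy class-separation criterion standing in for the paper's quality measures; the synthetic data, the criterion, and the optimizer choice are all assumptions.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4)) * np.array([1.0, 10.0, 100.0, 0.01])  # mixed units
labels = rng.integers(0, 2, 200)
X[labels == 1] += np.array([2.0, 0.0, 300.0, 0.02])                 # class offset

def project(X):                      # plain PCA to 2D via SVD
    Xc = X - X.mean(0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:2].T

def quality(P, labels):              # toy criterion: between- vs within-class spread
    m0, m1 = P[labels == 0].mean(0), P[labels == 1].mean(0)
    within = P[labels == 0].std() + P[labels == 1].std()
    return np.linalg.norm(m0 - m1) / (within + 1e-9)

def objective(log_s):                # scaling the dimensions changes the projection
    return -quality(project(X * np.exp(log_s)), labels)

res = minimize(objective, np.zeros(X.shape[1]), method="Nelder-Mead")
print("optimized per-dimension scales:", np.exp(res.x))
```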
Item: EUROGRAPHICS 2025: CGF 44-2 Frontmatter (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Dai, Angela; Bousseau, Adrien

Item: Mesh Compression with Quantized Neural Displacement Fields (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Pentapati, Sai Karthikey; Phillips, Gregoire; Bovik, Alan C.; Bousseau, Adrien; Day, Angela
Implicit neural representations (INRs) have been successfully used to compress a variety of 3D surface representations such as Signed Distance Functions (SDFs) and voxel grids, as well as other forms of structured data such as images, videos, and audio. However, these methods have been limited in their application to unstructured data such as 3D meshes and point clouds. This work presents a simple yet effective method that extends the usage of INRs to compress 3D triangle meshes. Our method encodes a displacement field that refines the coarse version of the 3D mesh surface to be compressed using a small neural network. Once trained, the neural network weights occupy much lower memory than the displacement field or the original surface. We show that our method is capable of preserving intricate geometric textures and demonstrates state-of-the-art performance for compression ratios ranging from 4x to 380x (see Figure 1 of the paper for an example).
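The pipeline described (fit a small network to a displacement field over a coarse surface, then keep only the quantized weights) can be sketched compactly. The following is a toy stand-in under stated assumptions: the "coarse surface" is random points, the target displacements are synthetic, and the 8-bit uniform weight quantization is a simplification of whatever scheme the paper uses.

```python
import torch
import torch.nn as nn

coarse_pts = torch.rand(4096, 3)                 # points on the coarse mesh surface
target_disp = 0.05 * torch.sin(20 * coarse_pts)  # synthetic "detail" to encode

mlp = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 64), nn.ReLU(),
                    nn.Linear(64, 3))            # the compressed representation
opt = torch.optim.Adam(mlp.parameters(), lr=1e-3)
for _ in range(500):                             # fit displacement field
    loss = ((mlp(coarse_pts) - target_disp) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

with torch.no_grad():
    for p in mlp.parameters():                   # 8-bit uniform weight quantization
        scale = p.abs().max() / 127.0 + 1e-12
        p.copy_(torch.round(p / scale) * scale)
    # Decode: refine the coarse surface with the predicted displacement field.
    refined = coarse_pts + mlp(coarse_pts)
    err = (refined - (coarse_pts + target_disp)).abs().max()
print(f"max reconstruction error after quantization: {err:.4f}")
```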
Item: NePHIM: A Neural Physics-Based Head-Hand Interaction Model (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Wagner, Nicolas; Schwanecke, Ulrich; Botsch, Mario; Bousseau, Adrien; Day, Angela
Due to the increasing use of virtual avatars, the animation of head-hand interactions has recently gained attention. To this end, we present a novel volumetric and physics-based interaction simulation. In contrast to previous work, our simulation incorporates temporal effects such as collision paths, respects anatomical constraints, and can detect and simulate skin pulling. As a result, we can achieve more natural-looking interaction animations and take a step towards greater realism. However, like most complex and computationally expensive simulations, ours is not real-time capable, even on high-end machines. Therefore, we train small and efficient neural networks as accurate approximations that achieve about 200 FPS on consumer GPUs and about 50 FPS on CPUs, and that are learned in less than four hours for one person. In general, our focus is not to generalize the approximation networks to low-resolution head models but to adapt them to more detailed personalized avatars. Nevertheless, we show that these networks can learn to approximate our head-hand interaction model for multiple identities while maintaining computational efficiency. Since the quality of the simulations can only be judged subjectively, we conducted a comprehensive user study which confirms the improved realism of our approach. In addition, we provide extensive visual results and inspect the neural approximations quantitatively. All data used in this work has been recorded with a multi-view camera rig. Code and data are available at https://gitlab.cs.hs-rm.de/cvmr_releases/HeadHand.

Item: Material Transforms from Disentangled NeRF Representations (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Lopes, Ivan; Lalonde, Jean-François; Charette, Raoul de; Bousseau, Adrien; Day, Angela
In this paper, we propose a novel method for transferring material transformations across different scenes. Building on disentangled Neural Radiance Field (NeRF) representations, our approach learns to map Bidirectional Reflectance Distribution Functions (BRDF) from pairs of scenes observed in varying conditions, such as dry and wet. The learned transformations can then be applied to unseen scenes with similar materials, effectively rendering the learned transformation with an arbitrary level of intensity. Extensive experiments on synthetic scenes and real-world objects validate the effectiveness of our approach, showing that it can learn various transformations such as wetness, painting, coating, etc. Our results highlight not only the versatility of our method but also its potential for practical applications in computer graphics. We publish our method implementation, along with our synthetic/real datasets, at https://github.com/astra-vision/BRDFTransform

Item: Generative Motion Infilling from Imprecisely Timed Keyframes (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Goel, Purvi; Zhang, Haotian; Liu, C. Karen; Fatahalian, Kayvon; Bousseau, Adrien; Day, Angela
Keyframes are a standard representation for kinematic motion specification. Recent learned motion-inbetweening methods use keyframes as a way to control generative motion models and are trained to generate life-like motion that matches the exact poses and timings of input keyframes. However, the quality of generated motion may degrade if the timing of these constraints is not perfectly consistent with the desired motion. Unfortunately, correctly specifying keyframe timings is a tedious and challenging task in practice. Our goal is to create a system that synthesizes high-quality motion from keyframes, even if the keyframes are imprecisely timed. We present a method that allows constraints to be retimed as part of the generation process. Specifically, we introduce a novel model architecture that explicitly outputs a time-warping function to correct mistimed keyframes, along with spatial residuals that add pose details. We demonstrate how our method can automatically turn approximately timed keyframe constraints into diverse, realistic motions with plausible timing and detailed submovements.

Item: Differential Diffusion: Giving Each Pixel Its Strength (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Levin, Eran; Fried, Ohad; Bousseau, Adrien; Day, Angela
Diffusion models have revolutionized image generation and editing, producing state-of-the-art results in conditioned and unconditioned image synthesis. While current techniques enable user control over the degree of change in an image edit, the controllability is limited to global changes over an entire edited region. This paper introduces a novel framework that enables customization of the amount of change per pixel or per image region. Our framework can be integrated into any existing diffusion model, enhancing it with this capability. Such granular control opens up a diverse array of new editing capabilities, such as control of the extent to which individual objects are modified, or the ability to introduce gradual spatial changes. Furthermore, we showcase the framework's effectiveness in soft-inpainting: the completion of portions of an image while subtly adjusting the surrounding areas to ensure seamless integration. Additionally, we introduce a new tool for exploring the effects of different change quantities. Our framework operates solely during inference, requiring no model training or fine-tuning. We demonstrate our method with current open state-of-the-art models and validate it via both quantitative and qualitative comparisons and a user study. Our code is published and integrated into several platforms.
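One way to realize per-pixel edit strength at inference time is to gate each pixel by a timestep threshold: pixels with low strength are re-anchored to the noised original for most of the denoising trajectory, so they change less. The sketch below shows that gating idea only, with a placeholder denoiser; the schedule, the strength map, and all names are assumptions rather than the paper's implementation.

```python
import torch

T = 50
alpha_bar = torch.cumprod(1 - torch.linspace(1e-4, 0.02, T), dim=0)

def denoise_step(x, t):                # stand-in for a pretrained diffusion model
    return 0.98 * x                    # (placeholder dynamics, not a real model)

original = torch.rand(1, 3, 64, 64)
# Per-pixel edit amount in [0, 1]: here a left-to-right gradient for illustration.
strength = torch.linspace(0, 1, 64).view(1, 1, 1, 64).expand(1, 1, 64, 64)

x = torch.randn_like(original)
for t in reversed(range(T)):
    x = denoise_step(x, t)
    # Pixels whose strength is below this timestep's threshold are re-anchored
    # to the noised original, so low-strength regions change less overall.
    noised_orig = (alpha_bar[t].sqrt() * original
                   + (1 - alpha_bar[t]).sqrt() * torch.randn_like(original))
    keep = (strength < t / T).float()
    x = keep * noised_orig + (1 - keep) * x
```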
Item: ReConForM: Real-time Contact-aware Motion Retargeting for more Diverse Character Morphologies (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Cheynel, Théo; Rossi, Thomas; Bellot-Gurlet, Baptiste; Rohmer, Damien; Cani, Marie-Paule; Bousseau, Adrien; Day, Angela
Preserving semantics, in particular in terms of contacts, is a key challenge when retargeting motion between characters of different morphologies. Our solution relies on a low-dimensional embedding of the character's mesh, based on rigged key vertices that are automatically transferred from the source to the target. Motion descriptors are extracted from the trajectories of these key vertices, providing an embedding that contains combined semantic information about both shape and pose. A novel, adaptive algorithm is then used to automatically select and weight the most relevant features over time, enabling us to efficiently optimize the target motion until it conforms to these constraints, so as to preserve the semantics of the source motion. Our solution allows extensions to several novel use cases where morphology and mesh contacts were previously overlooked, such as multi-character retargeting and motion transfer on uneven terrains. As our results show, our method is able to achieve real-time retargeting onto a wide variety of characters. Extensive experiments and comparisons with state-of-the-art methods using several relevant metrics demonstrate improved results, both in terms of motion smoothness and contact accuracy.

Item: Many-Light Rendering Using ReSTIR-Sampled Shadow Maps (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Zhang, Song; Lin, Daqi; Wyman, Chris; Yuksel, Cem; Bousseau, Adrien; Day, Angela
We present a practical method targeting dynamic shadow maps for many light sources in real-time rendering. We compute full-resolution shadow maps for a subset of lights, which we select with spatiotemporal reservoir resampling (ReSTIR). Our selection strategy automatically regenerates shadow maps for the lights with the strongest contributions to pixels in the current camera view. The remaining lights are handled using imperfect shadow maps, which provide a low-resolution shadow approximation. We significantly reduce the computation and storage compared to using all full-resolution shadow maps and substantially improve shadow quality compared to handling all lights with imperfect shadow maps.
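ReSTIR itself is a GPU spatiotemporal resampling algorithm; the sketch below shows only its core building block, single-sample weighted reservoir sampling, and uses it to let pixels "vote" for the lights that deserve full-resolution shadow maps. The contribution estimates, the vote-counting heuristic, and the budget are assumptions for illustration, not the paper's selection strategy.

```python
import random
from collections import Counter

random.seed(0)
num_lights, num_pixels, budget = 64, 10_000, 8
# Stand-in for each light's estimated (unshadowed) contribution to each pixel.
contrib = [[random.random() ** 4 for _ in range(num_lights)]
           for _ in range(num_pixels)]

def reservoir_select(weights):
    """Streaming weighted reservoir sampling: one pass, O(1) state, and the
    returned index is drawn with probability proportional to its weight."""
    chosen, w_sum = None, 0.0
    for i, w in enumerate(weights):
        w_sum += w
        if w > 0 and random.random() < w / w_sum:
            chosen = i
    return chosen

votes = Counter(reservoir_select(contrib[p]) for p in range(num_pixels))
full_res_lights = [light for light, _ in votes.most_common(budget)]
# These lights would get full-resolution shadow maps; the rest would fall
# back to low-resolution imperfect shadow maps.
print(full_res_lights)
```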
Item: Bracket Diffusion: HDR Image Generation by Consistent LDR Denoising (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Bemana, Mojtaba; Leimkühler, Thomas; Myszkowski, Karol; Seidel, Hans-Peter; Ritschel, Tobias; Bousseau, Adrien; Day, Angela
We demonstrate generating HDR images using the concerted action of multiple black-box, pre-trained LDR image diffusion models. Common diffusion models are not HDR because, first, there is no sufficiently large HDR image dataset available to re-train them, and, second, even if there were, re-training such models is impossible for most compute budgets. Instead, we seek inspiration from the HDR image capture literature, which traditionally fuses sets of LDR images, called "exposure brackets", to produce a single HDR image. We operate multiple denoising processes to generate multiple LDR brackets that together form a valid HDR result. To this end, we introduce a bracket consistency term into the diffusion process to couple the brackets such that they agree across the exposure range they share. We demonstrate HDR versions of state-of-the-art unconditional and conditional as well as restoration-type (LDR2HDR) generative modeling.

Item: ASMR: Adaptive Skeleton-Mesh Rigging and Skinning via 2D Generative Prior (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Hong, Seokhyeon; Choi, Soojin; Kim, Chaelin; Cha, Sihun; Noh, Junyong; Bousseau, Adrien; Day, Angela
Despite the growing accessibility of skeletal motion data, integrating it for animating character meshes remains challenging due to the diverse configurations of both skeletons and meshes. Specifically, the body scale and bone lengths of the skeleton should be adjusted in accordance with the size and proportions of the mesh, ensuring that all joints are accurately positioned within the character mesh. Furthermore, defining skinning weights is complicated by variations in skeletal configurations, such as the number of joints and their hierarchy, as well as differences in mesh configurations, including their connectivity and shapes. While existing approaches have made efforts to automate this process, they hardly address the variations in both skeletal and mesh configurations. In this paper, we present a novel method for the automatic rigging and skinning of character meshes using skeletal motion data, accommodating arbitrary configurations of both meshes and skeletons. The proposed method predicts the optimal skeleton aligned with the size and proportion of the mesh and defines skinning weights for various mesh-skeleton configurations, without requiring explicit supervision tailored to each of them. By incorporating Diffusion 3D Features (Diff3F) as semantic descriptors of character meshes, our method achieves robust generalization across different configurations. To assess the performance of our method in comparison to existing approaches, we conducted comprehensive evaluations encompassing both quantitative and qualitative analyses, specifically examining the predicted skeletons, skinning weights, and deformation quality.

Item: A Multimodal Personality Prediction Framework based on Adaptive Graph Transformer Network and Multi-task Learning (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Wang, Rongquan; Zhao, Xile; Xu, Xianyu; Hao, Yang; Bousseau, Adrien; Day, Angela
Multimodal personality analysis targets accurately detecting personality traits by incorporating related multimodal information. However, existing methods focus on unimodal features while overlooking the bimodal association features crucial for this interdisciplinary task. Therefore, we propose a multimodal personality prediction framework based on an adaptive graph transformer network and multi-task learning. First, we utilize pre-trained models to learn specific representations from different modalities. Here, we employ pre-trained multimodal models' encoders as the backbones of the modality-specific extraction methods to mine unimodal features. We then introduce a novel adaptive graph transformer network to mine personality-related bimodal association features. This network effectively learns higher-order temporal dependencies based on relational graphs and emphasizes more significant features. Furthermore, we utilize a multimodal channel attention residual fusion module to obtain the fused features, and we propose a multimodal and unimodal joint learning regression head to learn and predict scores for personality traits. We design a multi-task loss function to enhance the robustness and accuracy of personality prediction. Experimental results on two benchmark datasets demonstrate the effectiveness of our framework, which outperforms state-of-the-art methods. The code is available at https://github.com/RongquanWang/PPF-AGTNMTL.

Item: "Wild West" of Evaluating Speech-Driven 3D Facial Animation Synthesis: A Benchmark Study (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Haque, Kazi Injamamul; Pavlou, Alkiviadis; Yumak, Zerrin; Bousseau, Adrien; Day, Angela
Recent advancements in the field of audio-driven 3D facial animation have accelerated rapidly, with numerous papers being published in a short span of time. This surge in research has garnered significant attention from both academia and industry because of its potential applications to digital humans. Various approaches, both deterministic and non-deterministic, have been explored based on foundational advancements in deep learning algorithms. However, there remains no consensus among researchers on standardized methods for evaluating these techniques. Additionally, rather than converging on a common set of datasets and objective metrics suited for specific methods, recent works exhibit considerable variation in experimental setups. This inconsistency complicates the research landscape, making it difficult to establish a streamlined evaluation process and rendering many cross-paper comparisons challenging. Moreover, the common practice of A/B testing in perceptual studies focuses on only two common metrics and is not sufficient for non-deterministic and emotion-enabled approaches. The lack of correlation between subjective and objective metrics points to a need for critical analysis in this space. In this study, we address these issues by benchmarking state-of-the-art deterministic and non-deterministic models, utilizing a consistent experimental setup across a carefully curated set of objective metrics and datasets. We also conduct a perceptual user study to assess whether subjective perceptual metrics align with the objective metrics. Our findings indicate that model rankings do not necessarily generalize across datasets and that subjective metric ratings are not always consistent with their corresponding objective metrics. The supplementary video, edited code scripts for training on different datasets, and documentation related to this benchmark study are made publicly available at https://galib360.github.io/face-benchmark-project/.

Item: Deformed Tiling and Blending: Application to the Correction of Distortions Implied by Texture Mapping (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Wendling, Quentin; Ravaglia, Joris; Sauvage, Basile; Bousseau, Adrien; Day, Angela
The prevailing model in virtual 3D scenes is a 3D surface onto which a texture is mapped through a parameterization from the texture plane. We focus on accounting for the parameterization during the texture creation process, to control the deformations and remove the cuts induced by the mapping. We rely on tiling and blending, a real-time and parallel algorithm that generates an arbitrarily large texture from a small input example. Our first contribution enhances the tiling and blending with a deformation field, which controls smooth spatial variations in the texture plane. Our second contribution derives, from a parameterized triangle mesh, a deformation field that compensates for texture distortions and controls the texture orientation. Our third contribution is a technique to enforce texture continuity across the cuts, thanks to a proper tile selection. This opens the door to interactive sessions with artistic control and real-time rendering with improved visual quality.
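To make the first contribution concrete, the sketch below combines a crude square-grid simplification of tiling and blending (the original algorithm uses a hexagonal lattice and histogram-preserving blending) with a deformation field applied to the texture coordinates before tiling. The analytic deformation, the hash, and the grayscale example are assumptions; in the paper the deformation field would come from the mesh parameterization.

```python
import numpy as np

rng = np.random.default_rng(0)
example = rng.random((64, 64))            # small grayscale input example texture

def tile_offset(ix, iy):
    """Per-tile pseudo-random offset into the example texture."""
    r = np.random.default_rng(hash((ix, iy)) & 0xFFFFFFFF)
    return r.integers(0, 64, size=2)

def tiling_and_blending(u, v, tile=16):
    """Square-grid simplification: bilinearly blend the four surrounding
    tiles, each fetching the example at its own random offset."""
    ix, iy = int(u // tile), int(v // tile)
    fx, fy = (u % tile) / tile, (v % tile) / tile
    val = 0.0
    for dx in (0, 1):
        for dy in (0, 1):
            ox, oy = tile_offset(ix + dx, iy + dy)
            w = (fx if dx else 1 - fx) * (fy if dy else 1 - fy)
            val += w * example[(int(v) + oy) % 64, (int(u) + ox) % 64]
    return val

def deform(u, v):
    """Smooth deformation field in the texture plane; a parameterized mesh
    would supply this field to undo the mapping's distortion."""
    return u + 3.0 * np.sin(v / 10.0), v + 3.0 * np.cos(u / 10.0)

img = np.array([[tiling_and_blending(*deform(u, v)) for u in range(128)]
                for v in range(128)])     # deformed, arbitrarily large output
```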
Item: REED-VAE: RE-Encode Decode Training for Iterative Image Editing with Diffusion Models (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Almog, Gal; Shamir, Ariel; Fried, Ohad; Bousseau, Adrien; Day, Angela
While latent diffusion models achieve impressive image editing results, their application to iterative editing of the same image is severely restricted. When trying to apply consecutive edit operations using current models, they accumulate artifacts and noise due to repeated transitions between pixel and latent spaces. Some methods have attempted to address this limitation by performing the entire edit chain within the latent space, sacrificing flexibility by supporting only a limited, predetermined set of diffusion editing operations. We present a re-encode decode (REED) training scheme for variational autoencoders (VAEs), which promotes image quality preservation even after many iterations. Our work enables multi-method iterative image editing: users can perform a variety of iterative edit operations, with each operation building on the output of the previous one, using both diffusion-based operations and conventional editing techniques. We demonstrate the advantage of REED-VAE across a range of image editing scenarios, including text-based and mask-based editing frameworks. In addition, we show how REED-VAE enhances the overall editability of images, increasing the likelihood of successful and precise edit operations. We hope that this work will serve as a benchmark for the newly introduced task of multi-method image editing.

Item: Versatile Physics-based Character Control with Hybrid Latent Representation (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Bae, Jinseok; Won, Jungdam; Lim, Donggeun; Hwang, Inwoo; Kim, Young Min; Bousseau, Adrien; Day, Angela
We present a versatile latent representation that enables physically simulated characters to efficiently utilize motion priors. To build a powerful motion embedding that is shared across multiple tasks, the physics controller should employ a rich latent space that is easily explored and capable of generating high-quality motion. We propose integrating continuous and discrete latent representations to build a versatile motion prior that can be adapted to a wide range of challenging control tasks. Specifically, we build a discrete latent model to capture distinctive posterior distributions without collapse and simultaneously augment the sampled vector with continuous residuals to generate high-quality, smooth motion without jittering. We further incorporate Residual Vector Quantization, which not only maximizes the capacity of the discrete motion prior but also efficiently abstracts the action space during the task learning phase. We demonstrate that our agent can produce diverse yet smooth motions simply by traversing the learned motion prior through unconditional motion generation. Furthermore, our model robustly satisfies sparse goal conditions with highly expressive natural motions, including head-mounted device tracking and motion in-betweening at irregular intervals, which could not be achieved with existing latent representations.

Item: Text-Guided Interactive Scene Synthesis with Scene Prior Guidance (The Eurographics Association and John Wiley & Sons Ltd., 2025)
Fang, Shaoheng; Yang, Haitao; Mooney, Raymond; Huang, Qixing; Bousseau, Adrien; Day, Angela
3D scene synthesis using natural language instructions has become a popular direction in computer graphics, with significant progress made recently by data-driven generative models. However, previous methods have mainly focused on one-time scene generation, lacking the interactive capability to generate, update, or correct scenes according to user instructions. To overcome this limitation, this paper focuses on text-guided interactive scene synthesis. First, we introduce the SceneMod dataset, which comprises 168k paired scenes with textual descriptions of the modifications. To support the interactive scene synthesis task, we propose a two-stage diffusion generative model that integrates scene-prior guidance into the denoising process to explicitly enforce physical constraints and foster more realistic scenes. Experimental results demonstrate that our approach outperforms baseline methods in text-guided scene synthesis tasks. Our system expands the scope of data-driven scene synthesis tasks and provides a novel, more flexible tool for users and designers in 3D scene generation. Code and dataset are available at https://github.com/bshfang/SceneMod.
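As a loose illustration of injecting prior guidance into a denoising loop (not the paper's two-stage model), the sketch below represents a scene as a handful of 2D boxes and, after each placeholder denoising step, nudges the sample down the gradient of a differentiable overlap penalty, so physically implausible layouts are discouraged during sampling. The box parameterization, the penalty, the step count, and the guidance scale are all assumptions.

```python
import torch

def overlap_penalty(boxes):
    """Sum of pairwise overlap areas of axis-aligned boxes (cx, cy, w, h)."""
    n = boxes.shape[0]
    total = boxes.sum() * 0          # zero tensor that stays in the autograd graph
    for i in range(n):
        for j in range(i + 1, n):
            dx = (boxes[i, 2] + boxes[j, 2]) / 2 - (boxes[i, 0] - boxes[j, 0]).abs()
            dy = (boxes[i, 3] + boxes[j, 3]) / 2 - (boxes[i, 1] - boxes[j, 1]).abs()
            total = total + torch.relu(dx) * torch.relu(dy)
    return total

def denoise_step(x, t):              # stand-in for a learned scene denoiser
    return 0.95 * x                  # (placeholder; a real model predicts the scene)

x = torch.randn(5, 4)                # 5 objects as (cx, cy, w, h), from pure noise
for t in reversed(range(50)):
    x = denoise_step(x, t)
    x = x.detach().requires_grad_(True)
    overlap_penalty(x).backward()    # differentiate the physical constraint
    with torch.no_grad():
        x = x - 0.1 * x.grad         # push the sample toward non-overlapping layouts
print(f"final overlap: {overlap_penalty(x).item():.4f}")
```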