42-Issue 4

Permanent URI for this collection

https://diglib.eg.org/handle/10.2312/2633354

Browse

Now showing 1 - 3 of 3

Interactive Control over Temporal Consistency while Stylizing Video Streams
(The Eurographics Association and John Wiley & Sons Ltd., 2023) Shekhar, Sumit; Reimann, Max; Hilscher, Moritz; Semmo, Amir; Döllner, Jürgen; Trapp, Matthias; Ritschel, Tobias; Weidlich, Andrea
Image stylization has seen significant advancement and widespread interest over the years, leading to the development of a multitude of techniques. Extending these stylization techniques, such as Neural Style Transfer (NST), to videos is often achieved by applying them on a per-frame basis. However, per-frame stylization usually lacks temporal consistency, expressed by undesirable flickering artifacts. Most of the existing approaches for enforcing temporal consistency suffer from one or more of the following drawbacks: They (1) are only suitable for a limited range of techniques, (2) do not support online processing as they require the complete video as input, (3) cannot provide consistency for the task of stylization, or (4) do not provide interactive consistency control. Domain-agnostic techniques for temporal consistency aim to eradicate flickering completely but typically disregard aesthetic aspects. For stylization tasks, however, consistency control is an essential requirement as a certain amount of flickering adds to the artistic look and feel. Moreover, making this control interactive is paramount from a usability perspective. To achieve the above requirements, we propose an approach that stylizes video streams in real-time at full HD resolutions while providing interactive consistency control. We develop a lite optical-flow network that operates at 80 Frames per second (FPS) on desktop systems with sufficient accuracy. Further, we employ an adaptive combination of local and global consistency features and enable interactive selection between them. Objective and subjective evaluations demonstrate that our method is superior to state-of-the-art video consistency approaches. maxreimann.github.io/stream-consistency
NEnv: Neural Environment Maps for Global Illumination
(The Eurographics Association and John Wiley & Sons Ltd., 2023) Rodriguez-Pardo, Carlos; Fabre, Javier; Garces, Elena; Lopez-Moreno, Jorge; Ritschel, Tobias; Weidlich, Andrea
Environment maps are commonly used to represent and compute far-field illumination in virtual scenes. However, they are expensive to evaluate and sample from, limiting their applicability to real-time rendering. Previous methods have focused on compression through spherical-domain approximations, or on learning priors for natural, day-light illumination. These hinder both accuracy and generality, and do not provide the probability information required for importance-sampling Monte Carlo integration. We propose NEnv, a deep-learning fully-differentiable method, capable of compressing and learning to sample from a single environment map. NEnv is composed of two different neural networks: A normalizing flow, able to map samples from uniform distributions to the probability density of the illumination, also providing their corresponding probabilities; and an implicit neural representation which compresses the environment map into an efficient differentiable function. The computation time of environment samples with NEnv is two orders of magnitude less than with traditional methods. NEnv makes no assumptions regarding the content (i.e. natural illumination), thus achieving higher generality than previous learning-based approaches. We share our implementation and a diverse dataset of trained neural environment maps, which can be easily integrated into existing rendering engines.
PVP: Personalized Video Prior for Editable Dynamic Portraits using StyleGAN
(The Eurographics Association and John Wiley & Sons Ltd., 2023) Lin, Kai-En; Trevithick, Alex; Cheng, Keli; Sarkis, Michel; Ghafoorian, Mohsen; Bi, Ning; Reitmayr, Gerhard; Ramamoorthi, Ravi; Ritschel, Tobias; Weidlich, Andrea
Portrait synthesis creates realistic digital avatars which enable users to interact with others in a compelling way. Recent advances in StyleGAN and its extensions have shown promising results in synthesizing photorealistic and accurate reconstruction of human faces. However, previous methods often focus on frontal face synthesis and most methods are not able to handle large head rotations due to the training data distribution of StyleGAN. In this work, our goal is to take as input a monocular video of a face, and create an editable dynamic portrait able to handle extreme head poses. The user can create novel viewpoints, edit the appearance, and animate the face. Our method utilizes pivotal tuning inversion (PTI) to learn a personalized video prior from a monocular video sequence. Then we can input pose and expression coefficients to MLPs and manipulate the latent vectors to synthesize different viewpoints and expressions of the subject. We also propose novel loss functions to further disentangle pose and expression in the latent space. Our algorithm shows much better performance over previous approaches on monocular video datasets, and it is also capable of running in real-time at 54 FPS on an RTX 3080.

Browse

Browsing 42-Issue 4 by Subject "Image"

Results Per Page

Sort Options