Browsing by Author "Serrano, Ana"
Now showing 1 - 9 of 9
Item: Advances on computational imaging, material appearance, and virtual reality (Universidad de Zaragoza, 2019-04-29). Serrano, Ana.

Visual computing is a recently coined term that embraces many subfields of computer science related to the acquisition, analysis, or synthesis of visual data through the use of computer resources. What brings all these fields together is that they are all concerned with the visual aspects of computing and, more importantly, that in recent years they have started to share similar goals and methods. This thesis presents contributions in three areas within visual computing: computational imaging, material appearance, and virtual reality.

The first part of this thesis is devoted to computational imaging, and in particular to rich image and video acquisition. First, we address the capture of high dynamic range images in a single shot: we propose a novel reconstruction algorithm, based on sparse coding, that recovers the full range of luminances of the captured scene from a single coded low dynamic range image. Second, we focus on the temporal domain: we propose to capture high-speed video via a novel reconstruction algorithm, again based on sparse coding, that recovers high-speed video sequences from a single photograph with encoded temporal information.

The second part addresses the long-standing problem of visual perception and editing of real-world materials. We propose an intuitive, perceptually based editing space for captured data: we derive a set of meaningful attributes for describing appearance, and we build a control space based on these attributes by means of a large-scale user study. Finally, we propose a series of applications for this space. One application to which we devote particular attention is gamut mapping. The range of appearances displayable on a particular display or printer is called its gamut; given a desired appearance that may lie outside this gamut, gamut mapping consists of making it displayable without excessively distorting the final perceived appearance. For this task, we use our perceptually based space to bring visual perception into the mapping itself, helping to minimize the perceived visual distortions that may arise.

The third part is devoted to virtual reality. We first study human gaze behavior in static omnistereo panoramas: we collect gaze samples, provide an analysis of this data, and propose a series of applications that build on the derived insights. We then investigate more intricate behaviors in dynamic environments in a cinematographic context: we gather gaze data from viewers watching virtual reality videos containing different edits with varying parameters, and provide the first systematic analysis of viewers' behavior and the perception of continuity in virtual reality video. Finally, we propose a novel method for adding parallax to 360° video visualization in virtual reality headsets.
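Both acquisition problems in the first part follow the same generic sparse-coding recipe: model the unknown signal as a sparse combination of learned dictionary atoms, then solve for the code that best explains a single coded measurement. The following is a rough, illustrative sketch of that generic recipe only, not the thesis's actual algorithm; the coding matrix M, dictionary D, and all shapes are assumptions.

    # Sketch: recover an HDR patch x from one coded LDR measurement y = M @ x,
    # assuming a known coding matrix M and a learned dictionary D (columns = atoms).
    import numpy as np

    def omp(A, y, k):
        """Orthogonal matching pursuit: solve y ~ A @ z with at most k nonzeros."""
        residual, support = y.copy(), []
        z = np.zeros(A.shape[1])
        for _ in range(k):
            # pick the atom most correlated with the current residual
            support.append(int(np.argmax(np.abs(A.T @ residual))))
            coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
            residual = y - A[:, support] @ coef
        z[support] = coef
        return z

    def reconstruct_hdr_patch(y, M, D, sparsity=8):
        z = omp(M @ D, y, sparsity)  # sparse code explaining the coded measurement
        return D @ z                 # decode back to the full-range patch

The same structure carries over to the high-speed video case, with the coding applied over time instead of over exposure.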
Item: CEIG 2022: Frontmatter (The Eurographics Association, 2022). Serrano, Ana; Posada, Jorge.

Item: Convolutional Sparse Coding for Capturing High-Speed Video Content (© 2017 The Eurographics Association and John Wiley & Sons Ltd., 2017). Serrano, Ana; Garces, Elena; Masia, Belen; Gutierrez, Diego. Editors: Chen, Min; Zhang, Hao (Richard).

Video capture is limited by the trade-off between spatial and temporal resolution: when capturing videos of high temporal resolution, the spatial resolution decreases due to bandwidth limitations in the capture system. Achieving both high spatial and temporal resolution is only possible with highly specialized and very expensive hardware, and even then the same basic trade-off remains. The recent introduction of compressive sensing and sparse reconstruction techniques allows for the capture of high-speed video by coding the temporal information in a single frame, and then reconstructing the full video sequence from this single coded image and a trained dictionary of image patches. In this paper, we first analyse this approach and find insights that help improve the quality of the reconstructed videos. We then introduce a novel technique based on convolutional sparse coding (CSC), and show how it outperforms the state-of-the-art patch-based approach in terms of flexibility and efficiency, due to the convolutional nature of its filter banks. The key idea for CSC high-speed video acquisition is extending the basic formulation by imposing an additional constraint in the temporal dimension, which enforces sparsity of the first-order derivatives over time.
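Read literally, that extended formulation might be written as the objective below. This is a plausible reconstruction from the abstract alone, and the paper's exact formulation may differ: y is the single coded image, m_t are the per-frame coding masks, d_k the convolutional filters, and z_{k,t} their coefficient maps.

    \min_{\{z_{k,t}\}} \;
        \tfrac{1}{2} \Big\| y - \sum_{t} m_t \odot \Big( \sum_{k} d_k * z_{k,t} \Big) \Big\|_2^2
        \;+\; \lambda \sum_{k,t} \| z_{k,t} \|_1
        \;+\; \mu \sum_{k,t} \| z_{k,t+1} - z_{k,t} \|_1

The last term is the temporal constraint the abstract mentions: an L1 penalty on the first-order differences of the coefficient maps over time.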
Item: EUROGRAPHICS 2023: Tutorials Frontmatter (Eurographics Association, 2023). Serrano, Ana; Slusallek, Philipp.

Item: Exploring the Impact of 360º Movie Cuts in Users' Attention (The Eurographics Association, 2020). Marañes, Carlos; Gutierrez, Diego; Serrano, Ana. Editors: Christie, Marc; Wu, Hui-Yin; Li, Tsai-Yen; Gandhi, Vineet.

Virtual Reality (VR) has become more relevant since the first devices for personal use became available on the market. New content has emerged for this new medium with different purposes, such as education, training, or entertainment. However, the production workflow of cinematic VR content is still in an experimental phase, mainly because content creators disagree on how to tell a story effectively in this medium. Unlike traditional filmmaking, which has been developing for more than 100 years, movie editing in VR brings new challenges to be addressed. Viewers now have partial control of the camera and can watch every degree of the 360º scene that surrounds them, with the possibility of missing aspects of the scene that are key to understanding the narrative of the movie. Directors can decide how to edit the film by combining different shots; nevertheless, viewers' behavior may be influenced by the scenes before and after each cut. To address this issue, we analyze users' behavior across the cuts of a professional movie in which narrative plays an important role, and derive new insights that could influence VR content creation, informing content creators about the impact of different cuts on viewers' behavior.

Item: Learning a Self-supervised Tone Mapping Operator via Feature Contrast Masking Loss (The Eurographics Association and John Wiley & Sons Ltd., 2022). Wang, Chao; Chen, Bin; Seidel, Hans-Peter; Myszkowski, Karol; Serrano, Ana. Editors: Chaine, Raphaëlle; Kim, Min H.

High Dynamic Range (HDR) content is becoming ubiquitous due to the rapid development of capture technologies. Nevertheless, the dynamic range of common display devices is still limited, so tone mapping (TM) remains a key challenge for image visualization. Recent work has demonstrated that neural networks can achieve remarkable performance in this task compared to traditional methods; however, the quality of the results of these learning-based methods is limited by the training data. Most existing works use as a training set a curated selection of best-performing results from existing traditional tone mapping operators (often guided by a quality metric); therefore, the quality of newly generated results is fundamentally limited by the performance of such operators, and possibly further limited by the pool of HDR content used for training. In this work we propose a learning-based, self-supervised tone mapping operator that is trained at test time specifically for each HDR image and does not need any data labeling. The key novelty of our approach is a carefully designed loss function, built upon fundamental knowledge of contrast perception, that allows for directly comparing the content in the HDR and tone-mapped images. We achieve this goal by reformulating classic VGG feature maps into feature contrast maps that normalize local feature differences by their average magnitude in a local neighborhood, allowing our loss to account for contrast masking effects. We perform extensive ablation studies and parameter explorations, and demonstrate that our solution outperforms existing approaches with a single set of fixed parameters, as confirmed by both objective and subjective metrics.
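The feature-contrast reformulation is the concrete mechanism in this abstract. A toy version of just the normalization step it describes (local feature differences divided by the average local magnitude) could look like the sketch below; the paper's actual loss operates on VGG feature maps and adds contrast-masking terms on top, none of which is shown here.

    # Toy illustration: Weber-like contrast of one feature channel, computed
    # as the deviation from the local mean, normalized by local mean magnitude.
    import numpy as np

    def feature_contrast(fmap, radius=2, eps=1e-6):
        """fmap: (H, W) float feature channel -> (H, W) contrast map."""
        H, W = fmap.shape
        pad = np.pad(fmap, radius, mode="reflect")
        local_mean = np.zeros((H, W), dtype=float)
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                local_mean += pad[radius+dy : radius+dy+H, radius+dx : radius+dx+W]
        local_mean /= (2 * radius + 1) ** 2
        return (fmap - local_mean) / (np.abs(local_mean) + eps)

    # A self-supervised loss would then compare, e.g. with an L1 distance,
    # feature_contrast(features(HDR)) against feature_contrast(features(TM)).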
Item: Modeling Surround-aware Contrast Sensitivity (The Eurographics Association, 2021). Yi, Shinyoung; Jeon, Daniel S.; Serrano, Ana; Jeong, Se-Yoon; Kim, Hui-Yong; Gutierrez, Diego; Kim, Min H. Editors: Bousseau, Adrien; McGuire, Morgan.

Despite advances in display technology, many existing applications rely on psychophysical datasets of human perception gathered using older, sometimes outdated displays. As a result, there exists the underlying assumption that such measurements can be carried over to the new viewing conditions of more modern technology. We have conducted a series of psychophysical experiments to explore contrast sensitivity using a state-of-the-art HDR display, taking into account not only the spatial frequency and luminance of the stimuli but also their surrounding luminance levels. From our data, we have derived a novel surround-aware contrast sensitivity function (CSF), which predicts human contrast sensitivity more accurately. We additionally provide a practical version that retains the benefits of our full model, while enabling easy backward compatibility and consistently producing good results across many existing applications that make use of CSF models. We show example applications: effective HDR video compression using a transfer function derived from our CSF, tone mapping, and improved accuracy in visual difference prediction.

Item: Modelling Surround-aware Contrast Sensitivity for HDR Displays (© 2022 Eurographics - The European Association for Computer Graphics and John Wiley & Sons Ltd, 2022). Yi, Shinyoung; Jeon, Daniel S.; Serrano, Ana; Jeong, Se-Yoon; Kim, Hui-Yong; Gutierrez, Diego; Kim, Min H. Editors: Hauser, Helwig; Alliez, Pierre.

Despite advances in display technology, many existing applications rely on psychophysical datasets of human perception gathered using older, sometimes outdated displays. As a result, there exists the underlying assumption that such measurements can be carried over to the new viewing conditions of more modern technology. We have conducted a series of psychophysical experiments to explore contrast sensitivity using a state-of-the-art HDR display, taking into account not only the spatial frequency and luminance of the stimuli but also their surrounding luminance levels. From our data, we have derived a novel surround-aware contrast sensitivity function (CSF), which predicts human contrast sensitivity more accurately. We additionally provide a practical version that retains the benefits of our full model, while enabling easy backward compatibility and consistently producing good results across many existing applications that make use of CSF models. We show examples of effective HDR video compression using a transfer function derived from our CSF, tone mapping, and improved accuracy in visual difference prediction.

Item: Structure-preserving Style Transfer (The Eurographics Association, 2019). Calvo, Santiago; Serrano, Ana; Gutierrez, Diego; Masia, Belen. Editors: Casas, Dan; Jarabo, Adrián.

Transferring different artistic styles to images while preserving their content is a difficult image processing task. Since the seminal deep learning approach of Gatys et al. [GEB16], many recent works have proposed different approaches to this task. However, most of them share one major limitation: a trade-off between how much of the target style is transferred and how much of the content of the original source image is preserved [GEB16, GEB*17, HB17, LPSB17]. In this work, we present a structure-preserving approach to style transfer that builds on the approach proposed by Gatys et al. Our approach preserves regions of fine detail by lowering the intensity of the style transfer in those regions, while still conveying the desired style in the overall appearance of the image. We propose to use a quad-tree image subdivision, and then apply the style transfer operation differently at different subdivision levels. Effectively, this leads to a more intense style transfer in large flat regions, while content is better preserved in areas with fine structure and detail. Our approach can easily be applied to different style transfer approaches as a post-processing step.
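The quad-tree weighting idea is concrete enough to sketch. Below is a hypothetical version, not the paper's implementation: the detail measure, threshold, and weight schedule are all illustrative assumptions. It subdivides where local detail is high and assigns lower style intensity to deeper levels.

    # Sketch: per-pixel style-transfer intensity from a quad-tree subdivision.
    # Large flat regions keep full style weight; finely detailed regions are
    # subdivided recursively and receive progressively lower weights.
    import numpy as np

    def detail(gray):
        """Simple detail measure: mean gradient magnitude of a grayscale region."""
        gy, gx = np.gradient(gray)
        return float(np.mean(np.hypot(gx, gy)))

    def style_weight_map(gray, depth=0, max_depth=4, thresh=0.05):
        h, w = gray.shape
        if depth == max_depth or min(h, w) < 8 or detail(gray) < thresh:
            return np.full((h, w), 1.0 - depth / max_depth)
        h2, w2 = h // 2, w // 2
        out = np.empty((h, w))
        for ys, xs in [(slice(0, h2), slice(0, w2)), (slice(0, h2), slice(w2, w)),
                       (slice(h2, h), slice(0, w2)), (slice(h2, h), slice(w2, w))]:
            out[ys, xs] = style_weight_map(gray[ys, xs], depth + 1, max_depth, thresh)
        return out

The resulting map could modulate a Gatys-style style loss per pixel, or blend stylized and source images after stylization, which matches the post-processing use the abstract mentions.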