Browsing by Author "Yuksel, Cem"
Now showing 1 - 9 of 9
Results Per Page
Sort Options
Item Compacted CPU/GPU Data Compression via Modified Virtual Address Translation(ACM, 2020) Seiler, Larry; Lin, Daqi; Yuksel, Cem; Yuksel, Cem and Membarth, Richard and Zordan, VictorWe propose a method to reduce the footprint of compressed data by using modified virtual address translation to permit random access to the data. This extends our prior work on using page translation to perform automatic decompression and deswizzling upon accesses to fixed rate lossy or lossless compressed data. Our compaction method allows a virtual address space the size of the uncompressed data to be used to efficiently access variable-size blocks of compressed data. Compression and decompression take place between the first and second level caches, which allows fast access to uncompressed data in the first level cache and provides data compaction at all other levels of the memory hierarchy. This improves performance and reduces power relative to compressed but uncompacted data. An important property of our method is that compression, decompression, and reallocation are automatically managed by the new hardware without operating system intervention and without storing compression data in the page tables. As a result, although some changes are required in the page manager, it does not need to know the specific compression algorithm and can use a single memory allocation unit size. We tested our method with two sample CPU algorithms. When performing depth buffer occlusion tests, our method reduces the memory footprint by 3.1x. When rendering into textures, our method reduces the footprint by 1.69x before rendering and 1.63x after. In both cases, the power and cycle time are better than for uncompacted compressed data, and significantly better than for accessing uncompressed data.Item Efficient Adaptive Deferred Shading with Hardware Scatter Tiles(ACM, 2020) Mallett, Ian; Yuksel, Cem; Seiler, Larry; Yuksel, Cem and Membarth, Richard and Zordan, VictorAdaptive shading is an effective mechanism for reducing the number of shaded pixels to a subset of the image resolution with minimal impact on final rendering quality. We present a new scheduling method based on on-chip tiles that, along with relatively minor modifications to the GPU architecture, provides efficient hardware support. As compared to software implementations on current hardware using compute shaders, our approach dramatically reduces memory bandwidth requirements, thereby significantly improving performance and energy use. We also introduce the concept of a fragment pre-shader for programmatically controlling when a fragment shader is invoked, and describe advanced techniques for utilizing our approach to further reduce the number of shaded pixels via temporal filtering, or to adjust rendering quality to maintain stable framerates.Item Hardware Adaptive High-Order Interpolation for Real-Time Graphics(The Eurographics Association and John Wiley & Sons Ltd., 2021) Lin, Daqi; Seiler, Larry; Yuksel, Cem; Binder, Nikolaus and Ritschel, TobiasInterpolation is a core operation that has widespread use in computer graphics. Though higher-order interpolation provides better quality, linear interpolation is often preferred due to its simplicity, performance, and hardware support. We present a unified refactoring of quadratic and cubic interpolations as standard linear interpolation plus linear interpolations of higher-order terms and show how they can be applied to regular grids and (triangular/tetrahedral) simplexes Our formulations can provide significant reduction in computation cost, as compared to typical higher-order interpolations and prior approaches that utilize existing hardware linear interpolation support to achieve higher-order interpolation. In addition, our formulation allows approximating the results by dynamically skipping some higher order terms with low weights for further savings in both computation and storage. Thus, higher-order interpolation can be performed adaptively, as needed. We also describe how relatively minor modifications to existing GPU hardware could provide hardware support for quadratic and cubic interpolations using our approach for both texture filtering operations and barycentric interpolation. We present a variety of examples using triangular, rectangular, tetrahedral, and cuboidal interpolations, showing the effectiveness of our higher-order interpolations in different applications.Item Hardware-Accelerated Dual-Split Trees(ACM, 2020) Lin, Daqi; Vasiou, Elena; Yuksel, Cem; Kopta, Daniel; Brunvand, Erik; Yuksel, Cem and Membarth, Richard and Zordan, VictorBounding volume hierarchies (BVH) are the most widely used acceleration structures for ray tracing due to their high construction and traversal performance. However, the bounding planes shared between parent and children bounding boxes is an inherent storage redundancy that limits further improvement in performance due to the memory cost of reading these redundant planes. Dual-split trees can create identical space partitioning as BVHs, but in a compact form using less memory by eliminating the redundancies of the BVH structure representation. This reduction in memory storage and data movement translates to faster ray traversal and better energy efficiency. Yet, the performance benefits of dual-split trees are undermined by the processing required to extract the necessary information from their compact representation. This involves bit manipulations and branching instructions which are inefficient in software. We introduce hardware acceleration for dual-split trees and show that the performance advantages over BVHs are emphasized in a hardware ray tracing context that can take advantage of such acceleration.We provide details on how the operations needed for decoding dual-split tree nodes can be implemented in hardware and present experiments in a number of scenes with different sizes using path tracing. In our experiments, we have observed up to 31% reduction in render time and 38% energy saving using dual-split trees as compared to binary BVHs representing identical space partitioning.Item High-Performance Polynomial Root Finding for Graphics(ACM Association for Computing Machinery, 2022) Yuksel, Cem; Josef Spjut; Marc Stamminger; Victor ZordanWe present a computationally-efficient and numerically-robust algorithm for finding real roots of polynomials. It begins with determining the intervals where the given polynomial is monotonic. Then, it performs a robust variant of Newton iterations to find the real root within each interval, providing fast and guaranteed convergence and satisfying the given error bound, as permitted by the numerical precision used. For cubic polynomials, the algorithm is more accurate and faster than both the analytical solution and directly applying Newton iterations. It trivially extends to polynomials with arbitrary degrees, but it is limited to finding the real roots only and has quadratic worst-case complexity in terms of the polynomial's degree. We show that our method outperforms alternative polynomial solutions we tested up to degree 20. We also present an example rendering application with a known efficient numerical solution and show that our method provides faster, more accurate, and more robust solutions by solving polynomials of degree 10.Item Patch Textures: Hardware Implementation of Mesh Colors(The Eurographics Association, 2019) Mallett, Ian; Seiler, Larry; Yuksel, Cem; Steinberger, Markus and Foley, TimMesh colors provide an effective alternative to standard texture mapping. They significantly simplify the asset production pipeline by removing the need for defining a mapping and eliminate rendering artifacts due to seams. This paper addresses the problem that using mesh colors for real-time rendering has not been practical, due to the absence of hardware support. We show that it is possible to provide full hardware texture filtering support for mesh colors with minimal changes to existing GPUs by introducing a hardware-friendly representation for mesh colors that we call patch textures. We discuss the hardware modifications needed for storing and filtering patch textures.Item Quadratic Approximation of Cubic Curves(ACM, 2020) Truong, Nghia; Yuksel, Cem; Seiler, Larry; Yuksel, Cem and Membarth, Richard and Zordan, VictorWe present a simple degree reduction technique for piecewise cubic polynomial splines, converting them into piecewise quadratic splines that maintain the parameterization and C1 continuity. Our method forms identical tangent directions at the interpolated data points of the piecewise cubic spline by replacing each cubic piece with a pair of quadratic pieces. The resulting representation can lead to substantial performance improvements for rendering geometrically complex spline models like hair and fiber-level cloth. Such models are typically represented using cubic splines that are C1-continuous, a property that is preserved with our degree reduction. Therefore, our method can also be considered a new quadratic curve construction approach for high-performance rendering. We prove that it is possible to construct a pair of quadratic curves with C1 continuity that passes through any desired point on the input cubic curve. Moreover, we prove that when the pair of quadratic pieces corresponding to a cubic piece have equal parametric lengths, they join exactly at the parametric center of the cubic piece, and the deviation in positions due to degree reduction is minimized.Item Spectral Primary Decomposition for Rendering with sRGB Reflectance(The Eurographics Association, 2019) Mallett, Ian; Yuksel, Cem; Boubekeur, Tamy and Sen, PradeepSpectral renderers, as-compared to RGB renderers, are able to simulate light transport that is closer to reality, capturing light behavior that is impossible to simulate with any three-primary decomposition. However, spectral rendering requires spectral scene data (e.g. textures and material properties), which is not widely available, severely limiting the practicality of spectral rendering. Unfortunately, producing a physically valid reflectance spectrum from a given sRGB triple has been a challenging problem, and indeed until very recently constructing a spectrum without colorimetric round-trip error was thought to be impossible. In this paper, we introduce a new procedure for efficiently generating a reflectance spectrum from any given sRGB input data. We show for the first time that it is possible to create any sRGB reflectance spectrum as a linear combination of three separate spectra, each directly corresponding to one of the BT.709 primaries. Our approach produces consistent results, such that the input sRGB value is perfectly reproduced by the corresponding reflectance spectrum under D65 illumination, bounded only by Monte Carlo and numerical error. We provide a complete implementation, including a precomputed spectral basis, and discuss important optimizations and generalization to other RGB spaces.Item Virtual Blue Noise Lighting(ACM Association for Computing Machinery, 2022) Li, Tianyu; Wang, Wenyou; Lin, Daqi; Yuksel, Cem; Josef Spjut; Marc Stamminger; Victor ZordanWe introduce virtual blue noise lighting, a rendering pipeline for estimating indirect illumination with a blue noise distribution of virtual lights. Our pipeline is designed for virtual lights with non-uniform emission profiles that are more expensive to store, but required for properly and efficiently handling specular transport. Unlike the typical virtual light placement approaches that traverse light paths from the original light sources, we generate them starting from the camera. This avoids two important problems: wasted memory and computation with fully-occluded virtual lights, and excessive virtual light density around high-probability light paths. In addition, we introduce a parallel and adaptive sample elimination strategy to achieve a blue noise distribution of virtual lights with varying density. This addresses the third problem of virtual light placement by ensuring that they are not placed too close to each other, providing better coverage of the (indirectly) visible surfaces and further improving the quality of the final lighting estimation. For computing the virtual light emission profiles, we present a photon splitting technique that allows efficiently using a large number of photons, as it does not require storing them. During lighting estimation, our method allows using both global power-based and local BSDF important sampling techniques, combined via multiple importance sampling. In addition, we present an adaptive path extension method that avoids sampling nearby virtual lights for reducing the estimation error. We show that our method significantly outperforms path tracing and prior work in virtual lights in terms of both performance and image quality, producing a fast but biased estimate of global illumination.