High-Performance Graphics 2022
Browsing High-Performance Graphics 2022 by Title
Now showing 1 - 13 of 13
Item Better Fixed-Point Filtering with Averaging Trees (ACM Association for Computing Machinery, 2022)
Adams, Andrew; Sharlet, Dillon; Josef Spjut; Marc Stamminger; Victor Zordan
Production imaging pipelines commonly operate using fixed-point arithmetic, and within these pipelines a core primitive is convolution by small filters: taking convex combinations of fixed-point values in order to resample, interpolate, or denoise. We describe a new way to compute unbiased convex combinations of fixed-point values using sequences of averaging instructions, which exist on all popular CPU and DSP architectures but are seldom used. For a variety of popular kernels, our averaging trees have higher performance and higher quality than existing standard practice.

Item Data Parallel Path Tracing with Object Hierarchies (ACM Association for Computing Machinery, 2022)
Wald, Ingo; Parker, Steven G; Josef Spjut; Marc Stamminger; Victor Zordan
We propose a new approach to rendering production-style content with full path tracing in a data-distributed fashion, that is, with multiple collaborating nodes and/or GPUs that each store only part of the model. In particular, we propose a new approach to ray-forwarding based data-parallel ray tracing that improves over traditional spatial partitioning, that can support both object-hierarchy and spatial partitioning (or any combination thereof), and that employs multiple techniques for reducing the number of rays sent across the network.
We show that this approach can simultaneously achieve higher flexibility in model partitioning, lower memory per node, lower bandwidth during rendering, and higher performance; and that it can ultimately achieve interactive rendering performance for non-trivial models with full path tracing even on quite moderate hardware resources with relatively low-end interconnect.

Item A Data-Driven Paradigm for Precomputed Radiance Transfer (ACM Association for Computing Machinery, 2022)
Belcour, Laurent; Deliot, Thomas; Barbier, Wilhem; Soler, Cyril; Josef Spjut; Marc Stamminger; Victor Zordan
In this work, we explore a change of paradigm to build Precomputed Radiance Transfer (PRT) methods in a data-driven way. This paradigm shift allows us to alleviate the difficulties of building traditional PRT methods such as defining a reconstruction basis, coding a dedicated path tracer to compute a transfer function, etc. Our objective is to pave the way for Machine Learned methods by providing a simple baseline algorithm. More specifically, we demonstrate real-time rendering of indirect illumination in hair and surfaces from a few measurements of direct lighting. We build our baseline from pairs of direct and indirect illumination renderings using only standard tools such as Singular Value Decomposition (SVD) to extract both the reconstruction basis and transfer function.

Item High-Performance Polynomial Root Finding for Graphics (ACM Association for Computing Machinery, 2022)
Yuksel, Cem; Josef Spjut; Marc Stamminger; Victor Zordan
We present a computationally-efficient and numerically-robust algorithm for finding real roots of polynomials. It begins with determining the intervals where the given polynomial is monotonic. Then, it performs a robust variant of Newton iterations to find the real root within each interval, providing fast and guaranteed convergence and satisfying the given error bound, as permitted by the numerical precision used.
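The two steps named in the abstract above (split the domain into monotonic intervals, then run a safeguarded Newton iteration inside each one) can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function names `poly_eval`, `derivative`, `bracketed_newton`, and `real_roots`, the coefficient ordering, the search interval `[lo, hi]`, and the bisection fallback are all assumptions made for illustration, and the sketch does not handle multiple (repeated) roots robustly.

```python
from typing import List, Sequence


def poly_eval(coeffs: Sequence[float], x: float) -> float:
    """Evaluate a polynomial with Horner's rule; coeffs[i] multiplies x**i."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def derivative(coeffs: Sequence[float]) -> List[float]:
    """Coefficients of the derivative polynomial (same ordering)."""
    return [i * c for i, c in enumerate(coeffs)][1:]


def bracketed_newton(coeffs: Sequence[float], lo: float, hi: float,
                     eps: float) -> float:
    """Approximate the root in [lo, hi], assuming the polynomial is monotonic
    there and changes sign. Newton steps that leave the bracket are replaced
    by a bisection step (an assumed safeguard, not the paper's exact rule)."""
    d = derivative(coeffs)
    f_lo = poly_eval(coeffs, lo)
    x = 0.5 * (lo + hi)
    for _ in range(200):  # hard cap on iterations
        fx = poly_eval(coeffs, x)
        if fx == 0.0:
            return x
        # Shrink the bracket so that it still contains the sign change.
        if (fx < 0.0) == (f_lo < 0.0):
            lo = x
        else:
            hi = x
        dfx = poly_eval(d, x)
        if dfx != 0.0:
            x_new = x - fx / dfx
        else:
            x_new = lo  # not strictly inside the bracket: triggers bisection
        if not (lo < x_new < hi):
            x_new = 0.5 * (lo + hi)
        if abs(x_new - x) <= eps:
            return x_new
        x = x_new
    return x


def real_roots(coeffs: Sequence[float], lo: float, hi: float,
               eps: float = 1e-12) -> List[float]:
    """Real roots inside [lo, hi], in increasing order (simple roots only)."""
    if len(coeffs) <= 1:
        return []
    if len(coeffs) == 2:  # linear case ends the recursion
        if coeffs[1] == 0.0:
            return []
        x = -coeffs[0] / coeffs[1]
        return [x] if lo <= x <= hi else []
    # Roots of the derivative split [lo, hi] into monotonic intervals.
    critical = real_roots(derivative(coeffs), lo, hi, eps)
    knots = [lo] + critical + [hi]
    roots = []
    for a, b in zip(knots, knots[1:]):
        fa, fb = poly_eval(coeffs, a), poly_eval(coeffs, b)
        if (fa < 0.0) != (fb < 0.0):  # sign change: one root in [a, b]
            roots.append(bracketed_newton(coeffs, a, b, eps))
    return roots
```

For example, calling `real_roots([-6.0, 11.0, -6.0, 1.0], 0.0, 4.0)` on the cubic (x - 1)(x - 2)(x - 3) should return values close to 1, 2, and 3. Because the derivative's roots are found recursively, each extra degree adds one more pass over the intervals, which is consistent with the quadratic worst-case cost in the degree that the abstract mentions.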
For cubic polynomials, the algorithm is more accurate and faster than both the analytical solution and directly applying Newton iterations. It trivially extends to polynomials with arbitrary degrees, but it is limited to finding the real roots only and has quadratic worst-case complexity in terms of the polynomial's degree. We show that our method outperforms alternative polynomial solutions we tested up to degree 20. We also present an example rendering application with a known efficient numerical solution and show that our method provides faster, more accurate, and more robust solutions by solving polynomials of degree 10.

Item Htex: Per-Halfedge Texturing for Arbitrary Mesh Topologies (ACM Association for Computing Machinery, 2022)
Barbier, Wilhem; Dupuy, Jonathan; Josef Spjut; Marc Stamminger; Victor Zordan
We introduce per-halfedge texturing (Htex), a GPU-friendly method for texturing arbitrary polygon meshes without an explicit parameterization. Htex builds upon the insight that halfedges encode an intrinsic triangulation for polygon meshes, where each halfedge spans a unique triangle with direct adjacency information. Rather than storing a separate texture per face of the input mesh as is done by previous parameterization-free texturing methods, Htex stores a square texture for each halfedge and its twin. We show that this simple change from face to halfedge induces two important properties for high performance parameterization-free texturing. First, Htex natively supports arbitrary polygons without requiring dedicated code for, e.g., non-quad faces. Second, Htex leads to a straightforward and efficient GPU implementation that uses only three texture-fetches per halfedge to produce continuous texturing across the entire mesh.
We demonstrate the effectiveness of Htex by rendering production assets in real time.

Item Issue Information (ACM Association for Computing Machinery, 2022)
Josef Spjut; Marc Stamminger; Victor Zordan; Josef Spjut; Marc Stamminger; Victor Zordan

Item PLOC++ : Parallel Locally-Ordered Clustering for Bounding Volume Hierarchy Construction Revisited (ACM Association for Computing Machinery, 2022)
Benthin, Carsten; Drabinski, Radoslaw; Tessari, Lorenzo; Dittebrandt, Addis; Josef Spjut; Marc Stamminger; Victor Zordan
We propose a novel version of the GPU-oriented massively parallel locally-ordered clustering (PLOC) algorithm for constructing bounding volume hierarchies (BVHs). Our method focuses on removing the weaknesses of the original approach by simplifying and fusing different phases, while replacing most performance-critical parts with novel and more efficient algorithms. This combination outperforms the original approach by a factor of 1.9 to 2.3×.

Item Ray/Ribbon Intersections (ACM Association for Computing Machinery, 2022)
Reshetov, Alexander; Josef Spjut; Marc Stamminger; Victor Zordan
We present a new ray tracing primitive: a curved ribbon, which is embedded inside a ruled surface. We describe two such surfaces. Ribbons inside doubly ruled bilinear patches can be intersected by solving a quadratic equation. We also consider a singly ruled surface with a directrix defined by a quadratic Bézier curve and a generator defined by two linearly interpolated bitangent vectors. Intersecting such a surface requires solving a cubic equation, but it provides more fine-tuned control of the ribbon shape. These two primitives are smooth, composable, and allow fast non-iterative intersections.
These are the first primitives that possess all such properties simultaneously.

Item Software Rasterization of 2 Billion Points in Real Time (ACM Association for Computing Machinery, 2022)
Schütz, Markus; Kerbl, Bernhard; Wimmer, Michael; Josef Spjut; Marc Stamminger; Victor Zordan
The accelerated collection of detailed real-world 3D data in the form of ever-larger point clouds is sparking a demand for novel visualization techniques that are capable of rendering billions of point primitives in real-time. We propose a software rasterization pipeline for point clouds that is capable of rendering up to two billion points in real-time (60 FPS) on commodity hardware. Improvements over the state of the art are achieved by batching points, enabling a number of batch-level optimizations before rasterizing them within the same rendering pass. These optimizations include frustum culling, level-of-detail (LOD) rendering, and choosing the appropriate coordinate precision for a given batch of points directly within a compute workgroup. Adaptive coordinate precision, in conjunction with visibility buffers, reduces the required data for the majority of points to just four bytes, making our approach several times faster than the bandwidth-limited state of the art. Furthermore, support for LOD rendering makes our software rasterization approach suitable for rendering arbitrarily large point clouds, and to meet the elevated performance demands of virtual reality applications.

Item Spatiotemporal Variance-Guided Filtering for Motion Blur (ACM Association for Computing Machinery, 2022)
Oberberger, Max; Chajdas, Matthäus G.; Westermann, Rüdiger; Josef Spjut; Marc Stamminger; Victor Zordan
Adding motion blur to a scene can help to convey the feeling of speed even at low frame rates. Monte Carlo ray tracing can compute accurate motion blur, but requires a large number of samples per pixel to converge.
In comparison, rasterization, in combination with a post-processing filter, can generate fast but inaccurate motion blur from a single sample per pixel. We build upon a recent path tracing denoiser and propose a variant of it to simulate ray-traced motion blur, enabling fast and high-quality motion blur from a single sample per pixel. Our approach creates temporally coherent renderings by estimating the motion direction and variance locally, and using these estimates to guide wavelet filters at different scales. We compare image quality against brute force Monte Carlo methods and current post-processing motion blur. Our approach achieves real-time frame rates, requiring less than 4 ms for full-screen motion blur at a resolution of 1920 × 1080 on recent graphics cards.

Item Supporting Unified Shader Specialization by Co-opting C++ Features (ACM Association for Computing Machinery, 2022)
Seitz, Kerry A.; Foley, Theresa; Porumbescu, Serban D.; Owens, John D.; Josef Spjut; Marc Stamminger; Victor Zordan
Modern unified programming models (such as CUDA and SYCL) that combine host (CPU) code and GPU code into the same programming language, same file, and same lexical scope lack adequate support for GPU code specialization, which is a key optimization in real-time graphics. Furthermore, current methods used to implement specialization do not translate to a unified environment. In this paper, we create a unified shader programming environment in C++ that provides first-class support for specialization by co-opting C++'s attribute and virtual function features and reimplementing them with alternate semantics to express the services required.
By co-opting existing features, we enable programmers to use familiar C++ programming techniques to write host and GPU code together, while still achieving efficient generated C++ and HLSL code via our source-to-source translator.

Item Temporally Stable Real-Time Joint Neural Denoising and Supersampling (ACM Association for Computing Machinery, 2022)
Thomas, Manu Mathew; Liktor, Gabor; Peters, Christoph; Kim, Sungye; Vaidyanathan, Karthik; Forbes, Angus G.; Josef Spjut; Marc Stamminger; Victor Zordan
Recent advances in ray tracing hardware bring real-time path tracing into reach, and ray traced soft shadows, glossy reflections, and diffuse global illumination are now common features in games. Nonetheless, ray budgets are still limited. This results in undersampling, which manifests as aliasing and noise. Prior work addresses these issues separately. While temporal supersampling methods based on neural networks have gained wide use in modern games due to their better robustness, neural denoising remains challenging because of its higher computational cost. We introduce a novel neural network architecture for real-time rendering that combines supersampling and denoising, thus lowering the cost compared to two separate networks. This is achieved by sharing a single low-precision feature extractor with multiple higher-precision filter stages. To reduce cost further, our network takes low-resolution inputs and reconstructs a high-resolution denoised supersampled output. Our technique produces temporally stable high-fidelity results that significantly outperform state-of-the-art real-time statistical or analytical denoisers combined with TAA or neural upsampling to the target resolution.

Item Virtual Blue Noise Lighting (ACM Association for Computing Machinery, 2022)
Li, Tianyu; Wang, Wenyou; Lin, Daqi; Yuksel, Cem; Josef Spjut; Marc Stamminger; Victor Zordan
We introduce virtual blue noise lighting, a rendering pipeline for estimating indirect illumination with a blue noise distribution of virtual lights. Our pipeline is designed for virtual lights with non-uniform emission profiles that are more expensive to store, but required for properly and efficiently handling specular transport. Unlike the typical virtual light placement approaches that traverse light paths from the original light sources, we generate them starting from the camera. This avoids two important problems: wasted memory and computation with fully-occluded virtual lights, and excessive virtual light density around high-probability light paths. In addition, we introduce a parallel and adaptive sample elimination strategy to achieve a blue noise distribution of virtual lights with varying density. This addresses the third problem of virtual light placement by ensuring that they are not placed too close to each other, providing better coverage of the (indirectly) visible surfaces and further improving the quality of the final lighting estimation. For computing the virtual light emission profiles, we present a photon splitting technique that allows efficiently using a large number of photons, as it does not require storing them. During lighting estimation, our method allows using both global power-based and local BSDF importance sampling techniques, combined via multiple importance sampling.
In addition, we present an adaptive path extension method that avoids sampling nearby virtual lights to reduce the estimation error. We show that our method significantly outperforms path tracing and prior work on virtual lights in terms of both performance and image quality, producing a fast but biased estimate of global illumination.