EGPGV13: Eurographics Symposium on Parallel Graphics and Visualization
Item: Analysis of Cache Behavior and Performance of Different BVH Memory Layouts for Tracing Incoherent Rays (The Eurographics Association, 2013)
Authors: Wodniok, Dominik; Schulz, Andre; Widmer, Sven; Goesele, Michael
Editors: Fabio Marton and Kenneth Moreland
Abstract: With CPUs moving towards many-core architectures and GPUs becoming more general-purpose, path tracing can now be well parallelized on commodity hardware. While parallelization is trivial in theory, properties of real hardware make efficient parallelization difficult, especially when tracing incoherent rays. We investigate how different bounding volume hierarchy (BVH) and node memory layouts, as well as storing the BVH in different memory areas, impact the ray tracing performance of a GPU path tracer. We optimize the BVH layout using information gathered in a pre-processing pass that applies a number of different BVH reordering techniques. Depending on the memory area and scene complexity, we achieve moderate speedups.

Item: GPU Acceleration of Particle Advection Workloads in a Parallel, Distributed Memory Setting (The Eurographics Association, 2013)
Authors: Camp, David; Krishnan, Hari; Pugmire, David; Garth, Christoph; Johnson, Ian; Bethel, E. Wes; Joy, Kenneth I.; Childs, Hank
Editors: Fabio Marton and Kenneth Moreland
Abstract: Although there has been significant research in GPU acceleration, both of parallel simulation codes (i.e., GPGPU) and of single-GPU visualization and analysis algorithms, relatively little research has been devoted to visualization and analysis algorithms on GPU clusters. This oversight is significant: parallel visualization and analysis algorithms have markedly different characteristics (computational load, memory access pattern, communication, idle time, etc.) than the other two categories. In this paper, we explore the benefits of GPU acceleration for particle advection in a parallel, distributed-memory setting.
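For readers unfamiliar with the workload, particle advection means numerically integrating particle trajectories through a vector field. The following is a minimal illustrative sketch of such a kernel (not the paper's implementation) using fourth-order Runge-Kutta over a hypothetical analytic field:

```python
# Illustrative sketch (not the paper's code): advecting a particle through
# a steady 2D vector field with fourth-order Runge-Kutta (RK4).
# The field here is a hypothetical rigid rotation, chosen for clarity.

def velocity(x, y):
    """Hypothetical steady vector field: rigid rotation about the origin."""
    return -y, x

def rk4_step(x, y, h):
    """Advance one particle by step size h using RK4."""
    k1x, k1y = velocity(x, y)
    k2x, k2y = velocity(x + 0.5 * h * k1x, y + 0.5 * h * k1y)
    k3x, k3y = velocity(x + 0.5 * h * k2x, y + 0.5 * h * k2y)
    k4x, k4y = velocity(x + h * k3x, y + h * k3y)
    x += h / 6.0 * (k1x + 2 * k2x + 2 * k3x + k4x)
    y += h / 6.0 * (k1y + 2 * k2y + 2 * k3y + k4y)
    return x, y

def advect(seed, steps=100, h=0.01):
    """Integrate a trajectory from a seed point; returns the full path."""
    path = [seed]
    x, y = seed
    for _ in range(steps):
        x, y = rk4_step(x, y, h)
        path.append((x, y))
    return path
```

Each seed integrates independently, which is why the workload parallelizes naturally; the hard distributed-memory questions the paper studies (which rank owns which data block, when particles cross block boundaries) sit on top of this kernel.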
As performance properties can differ dramatically between particle advection use cases, our study operates over a variety of workloads designed to reveal insights about underlying trends. This work has a three-fold aim: (1) to map a challenging visualization and analysis algorithm, particle advection, to a complex system (a cluster of GPUs), (2) to characterize its performance, and (3) to evaluate the advantages and disadvantages of using the GPU. In our performance study, we identify which factors are and are not relevant for obtaining a speedup when using GPUs. In short, this study informs the following question: if faced with a parallel particle advection problem, should you implement the solution with CPUs, with GPUs, or does it not matter?

Item: Image-parallel Ray Tracing using OpenGL Interception (The Eurographics Association, 2013)
Authors: Brownlee, Carson; Ize, Thiago; Hansen, Charles D.
Editors: Fabio Marton and Kenneth Moreland
Abstract: CPU ray tracing has been shown to be an efficient rendering algorithm for large-scale polygonal data in scientific visualization on distributed-memory systems, either through custom integrations that modify the source code of existing visualization tools or through OpenGL interception, which requires no source modification. Previous implementations in common visualization tools use the existing data-parallel work distribution with sort-last compositing and exhibit sub-optimal performance scaling across multiple nodes due to the inefficiencies of data-parallel distributions of the scene geometry. This paper presents a solution that performs efficient ray tracing through OpenGL interception, using an image-parallel work distribution implemented on top of the host program's data-parallel distribution while supporting a paging system for access to non-resident data.
Through a series of scaling studies, we show that an image-parallel distribution often provides superior scaling performance that is largely independent of the data distribution and view, while also supporting secondary rays for advanced rendering effects.

Item: In Situ Pathtube Visualization with Explorable Images (The Eurographics Association, 2013)
Authors: Ye, Yucong; Miller, Robert; Ma, Kwan-Liu
Editors: Fabio Marton and Kenneth Moreland
Abstract: In situ processing is considered the most plausible data analysis and visualization solution for extreme-scale simulations. Explorable images were introduced as an in situ visualization method that enables interactive exploration of scalar field data without requiring access to the massive original data or a powerful computer. We present a technique for in situ generation of explorable images for the visualization of vector field data without incurring additional inter-processor communication during simulation. We demonstrate this technique for pathtube generation on a variety of large datasets. The resulting pathtube visualization succinctly captures the flow structure over the full time span of the simulation. Users may explore the vector field structure through the generated images by changing the view angle, generating block cutaways, adjusting lighting, or changing transfer functions to recolor pathtubes or apply partial transparency.

Item: Practical Parallel Rendering of Detailed Neuron Simulations (The Eurographics Association, 2013)
Authors: Hernando, Juan B.; Biddiscombe, John; Bohara, Bidur; Eilemann, Stefan; Schürmann, Felix
Editors: Fabio Marton and Kenneth Moreland
Abstract: Parallel rendering of large polygonal models with transparency is challenging due to the need for alpha-correct blending and compositing, which is costly for very large models with high depth complexity and spatial overlap.
In this paper we compare the performance of raster-based rendering methods on mesh models of neurons using two applications: one specifically tailored to the neuroscience application domain, the other a general-purpose visualization tool with domain-specific additions. The first implements both sort-first and sort-last rendering, uses a scene-graph-style traversal to cull objects, and applies dual depth peeling for order-independent transparency, whilst the other uses a simpler brute-force data-parallel approach with sort-last compositing. The advantages and trade-offs of these approaches are discussed. We present the optimized algorithms needed to achieve interactive frame rates for a non-trivial, real-world parallel rendering scenario. We show that a generic data visualization application can provide competitive performance when its rendering pipeline is optimized, with some loss of capability compared to an optimized domain-specific application.

Item: Rendering Molecular Surfaces using Order-Independent Transparency (The Eurographics Association, 2013)
Authors: Kauker, Daniel; Krone, Michael; Panagiotidis, Alexandros; Reina, Guido; Ertl, Thomas
Editors: Fabio Marton and Kenneth Moreland
Abstract: In this paper we present a technique for interactively rendering transparent molecular surfaces. We use Puxels, our implementation of per-pixel linked lists, for order-independent transparency rendering. Furthermore, we evaluate the use of per-pixel arrays as an alternative for this rendering technique. We describe our real-time rendering technique for the transparent depiction of complex molecular surfaces, such as the Solvent Excluded Surface, which is based on constructive solid geometry. Additionally, we explain further graphical operations and extensions possible with the Puxels approach.
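To make the underlying technique concrete: per-pixel linked lists store all transparent fragments covering a pixel in an unordered list, then sort and blend them at resolve time. The following is a CPU sketch of the general idea (not the authors' Puxels code), with a flat node pool and per-pixel head indices mimicking the GPU buffers:

```python
# Illustrative CPU sketch of per-pixel linked lists for order-independent
# transparency (the general technique; not the authors' Puxels implementation).
# A flat node pool plus a per-pixel head index mimics the GPU-side buffers;
# resolve() sorts each pixel's fragments by depth and blends back to front.

class FragmentBuffer:
    def __init__(self, width, height):
        self.head = [[-1] * width for _ in range(height)]  # head index per pixel
        self.nodes = []  # pool of (depth, rgba, next_index) nodes

    def insert(self, x, y, depth, rgba):
        """Prepend a fragment to pixel (x, y)'s list (like an atomic exchange)."""
        self.nodes.append((depth, rgba, self.head[y][x]))
        self.head[y][x] = len(self.nodes) - 1

    def resolve(self, x, y, background=(0.0, 0.0, 0.0)):
        """Sort the pixel's fragments by depth and alpha-blend back to front."""
        frags = []
        i = self.head[y][x]
        while i != -1:
            depth, rgba, nxt = self.nodes[i]
            frags.append((depth, rgba))
            i = nxt
        frags.sort(key=lambda f: f[0], reverse=True)  # farthest fragment first
        color = list(background)
        for _, (r, g, b, a) in frags:
            color = [a * c + (1 - a) * bg for c, bg in zip((r, g, b), color)]
        return tuple(color)
```

Because fragments are sorted per pixel at resolve time, insertion order does not matter, which is what makes the approach order-independent.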
Our evaluation benchmarks the performance of the presented methods and compares them to other methods.

Item: Scalable Parallel Feature Extraction and Tracking for Large Time-varying 3D Volume Data (The Eurographics Association, 2013)
Authors: Wang, Yang; Yu, Hongfeng; Ma, Kwan-Liu
Editors: Fabio Marton and Kenneth Moreland
Abstract: Large-scale time-varying volume data sets can take terabytes to petabytes of storage space to store and process. One promising approach is to process the data in parallel, and then extract and analyze only features of interest, reducing the memory required for subsequent visualization tasks by several orders of magnitude. However, extracting volume features in parallel is a non-trivial task, as features might span multiple processors while local partial features are only visible within their own processors. In this paper, we discuss how to generate and maintain connectivity information for features across different processors. Based on this connectivity information, partial features can be integrated, making it possible to extract and track features for large data in parallel. We demonstrate the effectiveness and scalability of our approach using two data sets with up to 16384 processors.

Item: Scalable Seams for Gigapixel Panoramas (The Eurographics Association, 2013)
Authors: Philip, Sujin; Summa, Brian; Tierny, Julien; Bremer, Peer-Timo; Pascucci, Valerio
Editors: Fabio Marton and Kenneth Moreland
Abstract: Gigapixel panoramas are an increasingly popular digital image application. They are often created as a mosaic of smaller images composited into a single larger image. The mosaic acquisition can occur over many hours, causing the individual images to differ in exposure and lighting conditions. Therefore, to give the appearance of a single seamless image, a blending operation is necessary. The quality of this blending depends on the magnitude of the discontinuity along the boundaries between the images. Often image boundaries, or seams, are first computed to minimize this transition.
Current techniques based on the multi-labeling Graph Cuts method are too slow and memory-intensive for panoramas many gigapixels in size. In this paper we present a multithreaded, out-of-core seam computation technique that is fast, has a small memory footprint, and gives near-perfect scaling up to the number of physical cores of our test system. With this method, the time required to compute image boundaries for gigapixel imagery improves from many hours (or even days) to just a few minutes on commodity hardware, while still producing boundaries whose energy is on par with, if not better than, Graph Cuts.

Item: VtkSMP: Task-based Parallel Operators for VTK Filters (The Eurographics Association, 2013)
Authors: Ettinger, Mathias; Broquedis, F.; Gautier, T.; Ploix, S.; Raffin, Bruno
Editors: Fabio Marton and Kenneth Moreland
Abstract: NUMA nodes are potentially powerful, but taking advantage of their capabilities is challenging due to their architecture (multiple computing cores, advanced memory hierarchy). They are nonetheless one of the key components for processing the ever-growing amount of data produced by scientific simulations. In this paper we study the parallelization of patterns commonly used in VTK algorithms and propose a new multithreaded plugin for VTK that eases the development of parallel multi-core VTK filters. We specifically focus on task-based approaches and show that, with a limited code refactoring effort, we can take advantage of NUMA node capabilities. We evaluate our patterns on a transform filter, a basic isosurface extraction filter, and a min/max tree accelerated isosurface extraction. We support three programming environments (OpenMP, Intel TBB, and X-KAAPI) and propose different algorithmic refinements according to the capabilities of the target environment. Results show that we can speed up execution by up to a factor of 30 on a 48-core machine.
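As a closing illustration of the task-based pattern several of these papers rely on, the VtkSMP-style "transform filter" parallelization amounts to splitting a per-element loop into fixed-size grains that a runtime can schedule on any core. The sketch below uses Python's thread pool rather than OpenMP, TBB, or X-KAAPI, and the transform itself is a hypothetical stand-in, not code from the paper:

```python
# Illustrative sketch of a task-based parallel transform filter (the pattern,
# not the VtkSMP code): split the point array into fixed-size grains and let
# a pool of workers process them independently.
from concurrent.futures import ThreadPoolExecutor

def transform_point(p):
    """Hypothetical per-point transform: uniform scale plus translation."""
    x, y, z = p
    return (2.0 * x + 1.0, 2.0 * y + 1.0, 2.0 * z + 1.0)

def transform_filter(points, workers=4, grain=1024):
    """Apply transform_point over chunks ("tasks") of the point array.

    Each grain is an independent unit of work, which mirrors the task-based
    approach: the runtime is free to schedule grains on any available core.
    """
    out = [None] * len(points)

    def run_task(start):
        # Process one contiguous grain of points.
        for i in range(start, min(start + grain, len(points))):
            out[i] = transform_point(points[i])

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map is lazy; wrapping in list() forces all tasks to complete.
        list(pool.map(run_task, range(0, len(points), grain)))
    return out
```

The grain size plays the same role as a chunk size in OpenMP scheduling: large enough to amortize scheduling overhead, small enough to keep all cores busy.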