EGPGV: Eurographics Workshop on Parallel Graphics and Visualization
Browsing EGPGV: Eurographics Workshop on Parallel Graphics and Visualization by Issue Date
Now showing 1 - 20 of 247
Item: A Multi-thread Safe Foundation for Scene Graphs and its Extension to Clusters (The Eurographics Association, 2002)
Authors: Voß, G.; Behr, J.; Reiners, D.; Roth, M.
Editors: D. Bartz, X. Pueyo, E. Reinhard
One of the main shortcomings of current scene graphs is their inability to support multi-thread safe data structures. This work describes the general framework used by the OpenSG scene graph system to enable multiple concurrent threads to independently manipulate the scene graph without interfering with each other. Furthermore, the extensions of the presented mechanisms needed to support cluster systems are discussed.

Item: Interactive Headlight Simulation - A Case Study of Interactive Distributed Ray Tracing (The Eurographics Association, 2002)
Authors: Benthin, Carsten; Dahmen, Tim; Wald, Ingo; Slusallek, Philipp
Editors: D. Bartz, X. Pueyo, E. Reinhard
Today's rasterization graphics hardware provides impressive speed and features, making it the standard tool for interactively visualizing virtual prototypes early in the industrial design process. However, due to inherent limitations of the rasterization approach, many optical effects can only be approximated. For many products, in particular in the car industry, the resulting visual quality and realism is inadequate as the basis for critical design decisions. Thus the original goal of virtual prototyping - significantly reducing the number of costly physical mockups - often cannot be achieved. Interactive ray tracing on a small cluster of PCs is emerging as an alternative visualization technique achieving the required accuracy, quality, and realism. In a case study, this paper demonstrates the advantages of using interactive ray tracing for a typical design situation in the car industry: visualizing the prototype of headlights.
Due to the highly reflective and refractive nature of headlights, proper quality could only be achieved using a fast interactive ray tracing system.

Item: Out-Of-Core Sort-First Parallel Rendering for Cluster-Based Tiled Displays (The Eurographics Association, 2002)
Authors: Correa, Wagner T.; Klosowski, James T.; Silva, Claudio T.
Editors: D. Bartz, X. Pueyo, E. Reinhard
We present a sort-first parallel system for out-of-core rendering of large models on cluster-based tiled displays. The system is able to render high-resolution images of large models at interactive frame rates using off-the-shelf PCs with small memory. Given a model, we use an out-of-core preprocessing algorithm to build an on-disk hierarchical representation for the model. At run time, each PC renders the image for a display tile, using an out-of-core rendering approach that employs multiple threads to overlap rendering, visibility computation, and disk operations. The system can operate in approximate mode for real-time rendering, or in conservative mode for rendering with guaranteed accuracy. Running our system in approximate mode on a cluster of 16 PCs, each with 512 MB of main memory, we are able to render 12-megapixel images of a 13-million-triangle model with 99.3% accuracy at 10.8 frames per second. Rendering such a large model at high resolutions and interactive frame rates would typically require expensive high-end graphics hardware. Our results show that a cluster of inexpensive PCs is an attractive alternative to those high-end systems.

Item: "Kilauea" - Parallel Global Illumination Renderer (The Eurographics Association, 2002)
Authors: Kato, Toshi; Saito, Jun
Editors: D. Bartz, X. Pueyo, E. Reinhard
Kilauea is a revolutionary parallel renderer developed at Square USA. The goal of the R&D effort was to create a renderer which can compute global illumination on extremely complex scenes using an affordable PC cluster.
This paper reports Kilauea's parallel processing methodology, implementation issues, and parallel performance.

Item: Distributed rendering of interactive soft shadows (The Eurographics Association, 2002)
Authors: Isard, M.; Shand, M.; Heirich, A.
Editors: D. Bartz, X. Pueyo, E. Reinhard
Recently, several distributed rendering systems have been developed which exploit a cluster of commodity computers by connecting host graphics cards over a fast network to form a compositing pipeline. This paper introduces a new algorithm which takes advantage of the programmable compositing operators in these systems to improve the performance of rendering multiple shadow maps, for example to produce approximate soft shadows. With an nVidia GeForce4 Ti graphics card, the new algorithm reduces the number of required render nodes by nearly a factor of four compared with a naive approach. We show results that yield interactive-speed rendering of 32 shadows on a 9-node Sepia2a distributed rendering cluster.

Item: The Parallelization of the Perspective Shear-Warp Volume Rendering Algorithm (The Eurographics Association, 2002)
Authors: Schulze, Jürgen P.; Lang, Ulrich
Editors: D. Bartz, X. Pueyo, E. Reinhard
The shear-warp algorithm for volume rendering is among the fastest volume rendering algorithms. It is an object-order algorithm, based on the idea of factorizing the view matrix into a 3D shear and a 2D warp component. Thus, the compositing can be done in sheared object space, which allows the algorithm to take advantage of data locality. Although the idea of a perspective projection shear-warp algorithm is not new, it is not widely used. That may be because it is slower than the parallel projection algorithm and often slower than hardware-supported approaches. In this paper, we present a new parallelized version of the perspective shear-warp algorithm. The parallelized algorithm was designed for distributed memory machines using MPI.
The new algorithm takes advantage of the fact that the warp can be done very fast by the graphics hardware of most computers, so that the remote parallel computer only needs to do the compositing. Our algorithm uses this idea to do the compositing on the remote machine, which transfers the resulting 2D intermediate image to the actual display machine. Even though the display machine could be a moderately equipped PC or laptop computer, it can be used to display complex volumetric data, provided there is a network connection to a high-performance parallel computer. Furthermore, remote rendering could be used to drive virtual environments, which typically require perspective projection and high frame rates for stereo projection and multiple screens.

Item: Design and Implementation of A Large-scale Hybrid Distributed Graphics System (The Eurographics Association, 2002)
Authors: Yang, Jian; Shi, Jiaoying; Jin, Zhefan; Zhang, Hui
Editors: D. Bartz, X. Pueyo, E. Reinhard
Although modern graphics hardware has the capability to render millions of triangles within a second, huge scenes still cannot be rendered in real time. Many parallel and distributed graphics systems have been explored to solve this problem; however, none of them is built for large-scale graphics applications. We designed AnyGL, a large-scale hybrid distributed graphics system, which consists of four types of logical nodes: Geometry Distributing Node, Geometry Rendering Node, Image Composition Node and Display Node. The first two types of logical nodes are combined into a sort-first graphics architecture, while the others compose images. A new state tracking method based on logical timestamps is also proposed for state tracking in large-scale distributed graphics systems. In addition, three classes of compression are employed to reduce the required network bandwidth: command code compression, geometry compression and image compression.
A new extension, global sharing of textures and display lists, is also implemented in AnyGL to avoid memory explosion in large-scale cluster rendering systems.

Item: Physical cloth simulation on a PC cluster (The Eurographics Association, 2002)
Authors: Zara, F.; Faure, F.; Vincent, J-M.
Editors: D. Bartz, X. Pueyo, E. Reinhard
Cloth simulation is of major interest in 3D animation, as it allows the realistic modeling of dressed humans. The goal of our work is to decrease computation time in order to obtain real-time dynamic animation. This paper describes a cloth simulation and addresses the problems of parallelizing the implicit time integration and of coupling a parallel execution with a standard visualization. We believe that this work could benefit other applications based on a conjugate gradient solution, as well as other applications of PC clusters.

Item: An Out-of-core Method for Computing Connectivities of Large Unstructured Meshes (The Eurographics Association, 2002)
Authors: Ueng, Shyh-Kuang; Sikorski, K.
Editors: D. Bartz, X. Pueyo, E. Reinhard
Adjacency graphs of meshes are important for visualizing or compressing unstructured scientific data. However, calculating adjacency graphs requires a large amount of memory. For large data sets, the calculation becomes very inefficient on desktop computers with limited main memory. In this article, an out-of-core method is presented for finding connectivities of large unstructured FEA data sets. Our algorithm consists of three stages. At the first stage, FEA cells are read into main memory in blocks. For each cell block read, cell faces are generated and distributed into disjoint groups. These groups are small enough that each group can reside in main memory without causing any page swapping. The resulting groups are stored in disk files. At the second stage, the face groups are fetched into main memory and processed there one after another. Adjacency graph edges are determined in each face group by sorting faces and examining consecutive faces.
The edges contained in a group are kept in a disk file. At the third stage, the edge files are merged into a single file using an external merge sort, and the connectivity information is computed.

Item: Interactive Ray Tracing of Time Varying Data (The Eurographics Association, 2002)
Authors: Reinhard, Erik; Hansen, Charles; Parker, Steve
Editors: D. Bartz, X. Pueyo, E. Reinhard
We present a simple and effective algorithm for ray tracing iso-surfaces of time varying data sets. Each time step is partitioned into separate ranges of potential iso-surface values. This creates a large number of relatively small files. Out-of-core rendering is implemented by reading, for each time step, the relevant iso-surface file, which contains its own spatial subdivision as well as the volumetric data. Since each of these data partitions is smaller than a single time step, the I/O bottleneck is overcome. Our method capitalizes on the ability of modern architectures to stream data off disk without interference from the operating system. Additionally, only a fraction of a time step is held in memory at any moment during the visualization, which significantly reduces the required amount of internal memory.

Item: Mining the Human Genome using Virtual Reality (The Eurographics Association, 2002)
Authors: Stolk, Bram; Abdoelrahman, Faizal; Koning, Anton; Wielinga, Paul
Editors: D. Bartz, X. Pueyo, E. Reinhard
The analysis of genomic data and the integration of diverse biological data sources has become increasingly difficult for researchers in the life sciences. This problem is exacerbated by the speed with which new data is gathered through automated technology like DNA microarrays. We developed a virtual reality application for visualizing hierarchical relationships within a gene family and for visualizing networks of gene expression data. Integration of other information from multiple databases with these visualizations can aid pharmaceutical researchers in selecting target genes or proteins for new drugs.
We found the application of virtual reality to the field of genomics to be successful.

Item: An Efficient System for Collaboration in Tele-Immersive Environments (The Eurographics Association, 2002)
Authors: Jensen, N.; Olbrich, S.; Pralle, H.; Raasch, S.
Editors: D. Bartz, X. Pueyo, E. Reinhard
The paper describes the development of a high-performance system for visualizing complex scientific models in real time. The architecture of the system is a client/server model, in which the simulator generates lists of 3D graphics objects in parallel to the simulation and sends them to a streaming server. The server transfers the 3D objects to viewer clients. Clients communicate with each other over a second connection, which adds the ability to perform collaborative tasks. An application related to computational fluid dynamics is described in which such a tele-immersive system can be used. The approach differs from other solutions because it offers a large set of graphics primitives for visualization, and it is optimized for distributed, heterogeneous environments.

Item: An Interleaved Parallel Volume Renderer With PC-clusters (The Eurographics Association, 2002)
Authors: Garcia, Antonio; Shen, Han-Wei
Editors: D. Bartz, X. Pueyo, E. Reinhard
Parallel volume rendering has been realized using various load distribution methods that subdivide either the screen, called image-space partitioning, or the volume dataset, called object-space partitioning. The major advantages of image-space partitioning are load balancing and low communication overhead, but processors require access to the full volume in order to render the volume from arbitrary views without frequent data redistributions. Subdividing the volume, on the other hand, provides storage scalability as more processors are added, but requires image compositing and thus higher communication bandwidth for producing the final image.
In this paper, we present a parallel volume rendering algorithm that combines the benefits of both image-space and object-space partitioning schemes, based on the idea of pixel and volume interleaving. We first subdivide the processors into groups. Each group is responsible for rendering a portion of the volume. Inside a group, every member interleaves the data samples of the volume and the pixels of the screen. Interleaving the data provides storage scalability, and interleaving the pixels reduces communication overhead. Our hybrid object- and image-space partitioning scheme was able to reduce the image compositing cost, incur low communication overhead, and balance the rendering workload at the expense of image quality. Experiments on a PC cluster demonstrate encouraging results.

Item: Approach for software development of parallel real-time VE systems on heterogenous clusters (The Eurographics Association, 2002)
Authors: Winkelholz, C.; Alexander, T.
Editors: D. Bartz, X. Pueyo, E. Reinhard
This paper presents our approach for the development of software for parallel real-time virtual environment (VE) systems running on heterogeneous clusters of computers. This approach is based on a framework we have developed to facilitate the set-up of immersive virtual environment systems using single components coupled by an isolated local network. The framework provides parallel rendering of multiple projection screens and parallel execution of application and interaction tasks on components spread across a cluster. The main concept of the approach discussed in this paper is to use the Virtual Reality Modeling Language (VRML) as an interface definition language (IDL) for the parallel and distributed virtual environment system. An IDL compiler generates skeleton code for the implementations of the script nodes specified in a VRML file. Components created this way can be reused in any VE by declaring the same interfaces. Instances of the implemented interfaces can reside in any application.
With this approach, commercial off-the-shelf software can easily be integrated into a VE application. In this context, we discuss the underlying framework and software development process. Furthermore, the implementation of a VE system for a geographic information system (GIS) based on this approach is shown. It is emphasized that the components are used in various applications.

Item: Efficient Parallel Implementations for Surface Subdivision (The Eurographics Association, 2002)
Authors: Padrón, E. J.; Amor, M.; Bóo, M.; Doallo, R.
Editors: D. Bartz, X. Pueyo, E. Reinhard
Achieving efficient surface subdivision is an important issue today in computer graphics, geometric modeling, and scientific visualization. In this paper we present two parallel versions of the Modified Butterfly algorithm. Both versions are based on a coarse-grain approach; that is, the original mesh is subdivided into small groups and each processor performs the triangle subdivision for a set of groups of the mesh. The first approach sorts the groups in decreasing order of the number of triangles per group, and the sorted groups are then cyclically distributed over the processors in order to achieve a good load distribution. In the second parallel version, the processors can dynamically balance the workload by passing groups from more heavily loaded processors to more lightly loaded ones, achieving a better load balance. Finally, we evaluate the algorithms on two different systems: an SGI Origin 2000 and a Sun cluster. Good performance in terms of speedup has been obtained with both the static and the dynamic parallel implementations.

Item: Parallel Performance Optimization of Large-Scale Unstructured Data Visualization for the Earth Simulator (The Eurographics Association, 2002)
Authors: Chen, L.; Fujishiro, I.; Nakajima, K.
Editors: D. Bartz, X. Pueyo, E. Reinhard
This paper describes some efficient parallel performance optimization strategies for large-scale unstructured data visualization on SMP cluster machines, including the Earth Simulator in Japan. Three-level hybrid parallelization is employed in our implementation, consisting of message passing for inter-SMP-node communication, loop directives by OpenMP for intra-SMP-node parallelization, and vectorization for each processing element (PE). In order to improve the speedup of the hybrid parallelization, techniques such as multi-coloring for removing data races and dynamic load repartitioning for load balancing are considered. Good visualization images and high parallel performance have been achieved on a Hitachi SR8000 for large-scale unstructured datasets, which shows the feasibility and effectiveness of our strategies.

Item: Tuning of Algorithms for Independent Task Placement in the Context of Demand-Driven Parallel Ray Tracing (The Eurographics Association, 2004)
Authors: Plachetka, T.
Editors: Dirk Bartz, Bruno Raffin, Han-Wei Shen
This paper investigates assignment strategies (load balancing algorithms) for process farms which solve the problem of online placement of a constant number of independent tasks with given, but unknown, time complexities onto a homogeneous network of processors with a given latency. Results for the chunking and factoring assignment strategies are summarised for a probabilistic model which models tasks' time complexities as realisations of a random variable with known mean and variance. Then a deterministic model is presented which requires knowledge of the minimal and maximal task complexities. While the goal in the probabilistic model is the minimisation of the expected makespan, the goal in the deterministic model is the minimisation of the worst-case makespan. We give a novel analysis of chunking and factoring for the deterministic model.
In the context of demand-driven parallel ray tracing, tasks' time complexities are unfortunately unknown until the actual computation finishes. Therefore we propose automatic self-tuning procedures which estimate the missing information at run time. We experimentally demonstrate for an "everyday ray tracing setting" that chunking does not perform much worse than factoring on up to 128 processors, if the parameters of these strategies are properly tuned. This may seem surprising. However, the experimentally measured efficiencies agree with our theoretical predictions.

Item: Massive Data Pre-Processing with a Cluster Based Approach (The Eurographics Association, 2004)
Authors: Borgo, R.; Pascucci, V.; Scopigno, R.
Editors: Dirk Bartz, Bruno Raffin, Han-Wei Shen
Data coming from complex simulation models easily reach sizes far beyond the available computational resources. Visualization of such data still represents the most intuitive and effective tool for scientific inspection of simulated phenomena. To ease this process, several techniques have been adopted, mainly concerning the use of hierarchical multi-resolution representations. In this paper we present the implementation of a hierarchical indexing scheme for multiresolution data, tailored to exploit the computational power of distributed environments.

Item: A hierarchical and view dependent visualization algorithm for tree based AMR data in 2D or 3D (The Eurographics Association, 2004)
Authors: Pino, Stéphane Del
Editors: Dirk Bartz, Bruno Raffin, Han-Wei Shen
In this paper, a solution to the visualization of the huge amounts of data produced by solvers using tree-based AMR methods is proposed. This approach relies strongly on the hierarchical structure of the data and on view-dependent arguments: only the visible cells are drawn, which considerably reduces the amount of rendered data by selecting only the cells that intersect the screen and whose size is bigger than one pixel.
After a brief statement of the problem, we recall the main principles of AMR methods. We then proceed to the data analysis, which shows notable differences related to the dimension (2 or 3). A natural view-dependent decimation algorithm is derived in the 2D case (only visible cells are plotted), while in 3D the treatment is not straightforward. The proposed solution then relies on the use of perspective in order to keep the same guidelines that were used in 2D. We then give a few hints about the implementation and perform numerical experiments which confirm the efficiency of the proposed algorithms. We finally discuss this approach and sketch future improvements.

Item: Memory-Savvy Distributed Interactive Ray Tracing (The Eurographics Association, 2004)
Authors: DeMarle, David E.; Gribble, Christiaan P.; Parker, Steven G.
Editors: Dirk Bartz, Bruno Raffin, Han-Wei Shen
Interactive ray tracing in a cluster environment requires paying close attention to the constraints of a loosely coupled distributed system. To render large scenes interactively, memory limits and network latency must be addressed efficiently. In this paper, we improve on previous systems by moving to a page-based distributed shared memory layer, resulting in faster and easier access to a shared memory space. The technique is designed to take advantage of the large virtual memory space provided by 64-bit machines. We also examine task reuse through decentralized load balancing, and primitive reorganization to complement the shared memory system. These techniques improve memory coherence and are valuable when physical memory is limited.
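Several of the abstracts above revolve around demand-driven task farming, and the Plachetka item in particular describes the chunking and factoring assignment strategies: chunking answers every work request with a fixed-size batch, while factoring hands out geometrically shrinking batches so that the final, small batches even out the makespan. As a rough illustration only (not any paper's actual implementation; function names and the decay parameter `x` are hypothetical), the two batch-size schedules can be sketched as:

```python
import math

def chunking(n_tasks, chunk_size):
    """Fixed-size batches: every work request is answered with
    chunk_size tasks (the last batch may be smaller)."""
    sizes = []
    remaining = n_tasks
    while remaining > 0:
        s = min(chunk_size, remaining)
        sizes.append(s)
        remaining -= s
    return sizes

def factoring(n_tasks, n_workers, x=2):
    """Geometrically decreasing batches: in each round, every worker
    receives ceil(remaining / (x * n_workers)) tasks, so batches shrink
    as the task pool drains."""
    sizes = []
    remaining = n_tasks
    while remaining > 0:
        batch = max(1, math.ceil(remaining / (x * n_workers)))
        for _ in range(n_workers):
            s = min(batch, remaining)
            if s == 0:
                break
            sizes.append(s)
            remaining -= s
    return sizes
```

For 100 tasks on 4 workers, factoring starts with batches of 13 and decays toward single-task batches; the large early batches amortize per-assignment network latency, while the small final batches limit the load imbalance at the end of the computation, which matches the trade-off the abstract describes.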