EGGH03: SIGGRAPH/Eurographics Workshop on Graphics Hardware 2003

3D Graphics LSI Core for Mobile Phone "Z3D" (The Eurographics Association, 2003)
Kameyama, Masatoshi; Kato, Yoshiyuki; Fujimoto, Hitoshi; Negishi, Hiroyasu; Kodama, Yukio; Inoue, Yoshitsugu; Kawai, Hiroyuki; M. Doggett and W. Heidrich and W. Mark and A. Schilling
In this paper we describe the architecture of "Z3D", a 3D graphics LSI core for mobile phones. The major 3D graphics applications on mobile phones are character animation and games. While a character animation or a game is running, the CPU is needed for communication with the center machine, and the CPU clock frequency is low. Z3D is therefore required to be small, low-power, and CPU-free. The Z3D pipeline is composed of a geometry engine, a rendering engine, and a pixel engine. On a PC these modules generally run concurrently as a pipeline, but pipelined operation increases power consumption. By using gated clock control and running the pipeline stages sequentially, we reduced power consumption by 40% at the cost of a 10% decrease in performance. The geometry processing performance of Z3D is up to 185 Kvertex/sec and the pixel performance is 5 Mpixel/sec, which is sufficient to run character animations and games. 3D shape and animation data can be defined using a common 3D modeler, and content programs can be written in Java. A low-level rendering interface and an animation rendering interface are provided as a Java API. All content programs can be downloaded from the network.

Automatic Shader Level of Detail (The Eurographics Association, 2003)
Olano, Marc; Kuehne, Bob; Simmons, Maryann; M. Doggett and W. Heidrich and W. Mark and A. Schilling
Current graphics hardware can render procedurally shaded objects in real-time. However, due to resource and performance limitations, interactive shaders cannot yet approach the complexity of shaders written for film production and software rendering, which may stretch to thousands of lines.
These constraints limit not only the complexity of a single shader, but also the number of shaded objects that can be rendered at interactive rates. This problem has many similarities to the rendering of large models, the source of extensive research in geometric simplification and level of detail. We introduce an analogous process for shading: shader simplification. Starting from an initial detailed shader, shader simplification automatically produces a set of simplified shaders or a single new shader with extra level-of-detail parameters that control the shader execution. The resulting level-of-detail shader can automatically adjust its rendered appearance based on measures of distance, size, or importance, as well as physical limits such as rendering time budget or texture usage. We demonstrate shader simplification with a system that automatically creates shader levels of detail to reduce the number of texture accesses, one common limiting factor for current hardware.

CULLIDE: Interactive Collision Detection Between Complex Models in Large Environments using Graphics Hardware (The Eurographics Association, 2003)
Govindaraju, Naga K.; Redon, Stephane; Lin, Ming C.; Manocha, Dinesh; M. Doggett and W. Heidrich and W. Mark and A. Schilling
We present a novel approach for fast collision detection between multiple deformable and breakable objects in a large environment using graphics hardware. Our algorithm takes into account low bandwidth to and from the graphics cards and computes a potentially colliding set (PCS) using visibility queries. It involves no precomputation and proceeds in multiple stages: PCS computation at an object level and PCS computation at sub-object level, followed by exact collision detection. We use a linear time two-pass rendering algorithm to compute each PCS efficiently. The overall approach makes no assumption about the input primitives or the object's motion and is directly applicable to all triangulated models.
It has been implemented on a PC with an NVIDIA GeForce FX 5800 Ultra graphics card and applied to different environments composed of a high number of moving objects with tens of thousands of triangles. It is able to compute all the overlapping primitives between different objects up to image-space resolution in a few milliseconds.

An Effective Hardware Architecture for Bump Mapping Using Angular Operation (The Eurographics Association, 2003)
Lee, S. G.; Park, W. C.; Lee, W. J.; Han, T. D.; Yang, S. B.; M. Doggett and W. Heidrich and W. Mark and A. Schilling
In this paper, we propose an effective bump mapping algorithm that utilizes the reference space with the polar coordinate system and also propose a new hardware architecture associated with the proposed bump mapping algorithm. The proposed architecture reduces the computations to transform the vectors from the object space into the reference space by using a new vector rotation method. It also reduces the computations for the illumination calculation by using the law of cosines. Compared with the previous approaches, the proposed architecture reduces multiplication operations by up to 78%.

The FFT on a GPU (The Eurographics Association, 2003)
Moreland, Kenneth; Angel, Edward; M. Doggett and W. Heidrich and W. Mark and A. Schilling
The Fourier transform is a well known and widely used tool in many scientific and engineering fields. The Fourier transform is essential for many image processing techniques, including filtering, manipulation, correction, and compression. As such, the computer graphics community could benefit greatly from such a tool if it were part of the graphics pipeline. As of late, computer graphics hardware has become amazingly cheap, powerful, and flexible. This paper describes how to utilize the current generation of cards to perform the fast Fourier transform (FFT) directly on the cards.
We demonstrate a system that can synthesize an image by conventional means, perform the FFT, filter the image, and finally apply the inverse FFT in well under 1 second for a 512 by 512 image. This work paves the way for performing complicated, real-time image processing as part of the rendering pipeline.

GPU Algorithms for Radiosity and Subsurface Scattering (The Eurographics Association, 2003)
Carr, Nathan A.; Hall, Jesse D.; Hart, John C.; M. Doggett and W. Heidrich and W. Mark and A. Schilling
We capitalize on recent advances in modern programmable graphics hardware, originally designed to support advanced local illumination models for shading, to instead perform two different kinds of global illumination models for light transport. We first use the new floating-point texture map formats to find matrix radiosity solutions for light transport in a diffuse environment, and use this example to investigate the differences between GPU and CPU performance on matrix operations. We then examine multiple-scattering subsurface light transport, which can be modeled to resemble a single radiosity gathering step. We use a multiresolution meshed atlas to organize a hierarchy of precomputed subsurface links, and devise a three-pass GPU algorithm to render in real time the subsurface-scattered illumination of an object, with dynamic lighting and viewing.

Mesh Mutation in Programmable Graphics Hardware (The Eurographics Association, 2003)
Shiue, Le-Jeng; Goel, Vineet; Peters, Jorg; M. Doggett and W. Heidrich and W. Mark and A. Schilling
We show how a future graphics processor unit (GPU), enhanced with random read and write to video memory, can represent, refine and adjust complex meshes arising in modeling, simulation and animation.
To leverage SIMD parallelism, a general model based on the mesh atlas is developed and a particular implementation without adjacency pointers is proposed in which primal, binary refinement of, possibly mixed, quadrilateral and triangular meshes of arbitrary topological genus, as well as their traversal is supported by user-transparent programmable graphics hardware. Adjustment, such as subdivision smoothing rules, is realized as user-programmable mesh shader routines. Attributes are generic and can be defined in the graphics application by binding them to one of several general addressing mechanisms.

A Multigrid Solver for Boundary Value Problems Using Programmable Graphics Hardware (The Eurographics Association, 2003)
Goodnight, Nolan; Woolley, Cliff; Lewin, Gregory; Luebke, David; Humphreys, Greg; M. Doggett and W. Heidrich and W. Mark and A. Schilling
We present a case study in the application of graphics hardware to general-purpose numeric computing. Specifically, we describe a system, built on programmable graphics hardware, able to solve a variety of partial differential equations with complex boundary conditions. Many areas of graphics, simulation, and computational science require efficient techniques for solving such equations. Our system implements the multigrid method, a fast and popular approach to solving large boundary value problems. We demonstrate the viability of this technique by using it to accelerate three applications: simulation of heat transfer, modeling of fluid mechanics, and tone mapping of high dynamic range images. We analyze the performance of our solver and discuss several issues, including techniques for improving the computational efficiency of iterative grid-based computations for the GPU.

An Optimized Soft Shadow Volume Algorithm with Real-Time Performance (The Eurographics Association, 2003)
Assarsson, Ulf; Dougherty, Michael; Mounier, Michael; Akenine-Möller, Tomas; M. Doggett and W. Heidrich and W. Mark and A. Schilling
In this paper, we present several optimizations to our previously presented soft shadow volume algorithm. Our optimizations include tighter wedges, heavily optimized pixel shader code for both rectangular and spherical light sources, a frame buffer blending technique to overcome the limitation of 8-bit frame buffers, and a simple culling algorithm. These together give real-time performance, and for simple models we get frame rates of over 150 fps. For more complex models 50 fps is normal. In addition to optimizations, two simple techniques for improving the visual quality are also presented.

Photon Mapping on Programmable Graphics Hardware (The Eurographics Association, 2003)
Purcell, Timothy J.; Donner, Craig; Cammarano, Mike; Jensen, Henrik Wann; Hanrahan, Pat; M. Doggett and W. Heidrich and W. Mark and A. Schilling
We present a modified photon mapping algorithm capable of running entirely on GPUs. Our implementation uses breadth-first photon tracing to distribute photons using the GPU. The photons are stored in a grid-based photon map that is constructed directly on the graphics hardware using one of two methods: the first method is a multipass technique that uses fragment programs to directly sort the photons into a compact grid. The second method uses a single rendering pass combining a vertex program and the stencil buffer to route photons to their respective grid cells, producing an approximate photon map. We also present an efficient method for locating the nearest photons in the grid, which makes it possible to compute an estimate of the radiance at any surface location in the scene. Finally, we describe a breadth-first stochastic ray tracer that uses the photon map to simulate full global illumination directly on the graphics hardware.
Our implementation demonstrates that current graphics hardware is capable of fully simulating global illumination with progressive, interactive feedback to the user.

Simulation of Cloud Dynamics on Graphics Hardware (The Eurographics Association, 2003)
Harris, Mark J.; Baxter III, William V.; Scheuermann, Thorsten; Lastra, Anselmo; M. Doggett and W. Heidrich and W. Mark and A. Schilling
This paper presents a physically-based, visually-realistic interactive cloud simulation. Clouds in our system are modeled using partial differential equations describing fluid motion, thermodynamic processes, buoyant forces, and water phase transitions. We also simulate the interaction of clouds with light, including self-shadowing and light scattering. We implement both simulations - dynamic and radiometric - entirely on programmable floating-point graphics hardware. We use "flat 3D textures" - 3D data laid out as slices tiled in a 2D texture - to implement 3D simulations on the GPU. This has scalability advantages over the use of traditional 3D textures. We exploit the relatively slow evolution of clouds in calm skies to enable interactive visualization of the simulation. The work required to simulate a single time step is automatically spread over many frames while the user views the results of the previous time step. This technique enables the incorporation of our simulation into real applications without sacrificing interactivity.

Texture Compression using Low-Frequency Signal Modulation (The Eurographics Association, 2003)
Fenney, Simon; M. Doggett and W. Heidrich and W. Mark and A. Schilling
This paper presents a new, lossy texture compression technique that is suited to implementation on low-cost, low-bandwidth devices as well as more powerful rendering systems. It uses a representation that is based on the blending of two (or more) 'low frequency' signals using a high frequency but low precision modulation signal.
Continuity of the low frequency signals helps to avoid block artefacts. Decompression costs are kept low through use of fixed-rate encoding and by eliminating indirect data access, as needed with Vector Quantisation schemes. Good quality reproduction of (A)RGB textures is achieved with a choice of 4bpp or 2bpp representations.

VoxelCache: A Cache-Based Memory Architecture for Volume Graphics (The Eurographics Association, 2003)
Kanus, U.; Wetekam, G.; Hirche, J.; M. Doggett and W. Heidrich and W. Mark and A. Schilling
This paper presents a cache-based memory architecture for volume graphics. We describe the memory organization and cache logic to implement a voxel cache based on 4³ voxel blocks. We show an efficient prefetching scheme that increases the cache hit ratio to more than 98% in most cases. The performance of the memory system with different types of external memory is demonstrated by a cycle-accurate C++ simulation. The VoxelCache memory architecture is designed to be easily adapted to different memory technologies, because all volume graphics specific parts of the memory system are encapsulated inside the on-chip cache. The design is targeted at implementation on off-the-shelf reconfigurable hardware.
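
As an illustration of the block-based organization mentioned in the VoxelCache abstract, the following minimal Python sketch maps a voxel coordinate to a block index and an offset inside that block. It assumes cubic blocks of 4 x 4 x 4 voxels and a simple x-fastest ordering of both blocks and voxels; the function name `voxel_to_block_address`, the constant `BLOCK`, and the ordering are hypothetical choices for this example and are not taken from the paper.

```python
BLOCK = 4  # assumed block edge length: 4 * 4 * 4 = 64 voxels per block


def voxel_to_block_address(x, y, z, dims):
    """Map voxel (x, y, z) to (block_index, offset_in_block).

    dims is the volume size (nx, ny, nz) in voxels. Blocks and the voxels
    inside a block are both numbered x-fastest, then y, then z. This layout
    is an assumption made for illustration only.
    """
    nx, ny, nz = dims
    if not (0 <= x < nx and 0 <= y < ny and 0 <= z < nz):
        raise IndexError("voxel coordinate outside the volume")

    # Number of blocks along x and y, rounding up for partial blocks.
    blocks_x = (nx + BLOCK - 1) // BLOCK
    blocks_y = (ny + BLOCK - 1) // BLOCK

    # Which block the voxel falls in, and where it sits inside that block.
    bx, by, bz = x // BLOCK, y // BLOCK, z // BLOCK
    lx, ly, lz = x % BLOCK, y % BLOCK, z % BLOCK

    block_index = (bz * blocks_y + by) * blocks_x + bx
    offset = (lz * BLOCK + ly) * BLOCK + lx
    return block_index, offset
```

With a layout of this kind, all 64 voxels of a block occupy one contiguous range of addresses, so a cache line sized to a whole block can be fetched from external memory in a single burst. Whether the actual design uses this ordering is not stated in the abstract.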