High-Performance Graphics | EGGH: SIGGRAPH/Eurographics Workshop on Graphics Hardware

Permanent URI for this community

https://diglib.eg.org/handle/10.2312/322

Browse

Now showing 1 - 12 of 12

Antialiased Parameterized Solid Texturing Simplified for Consumer- Level Hardware Implementation
(The Eurographics Association, 1999) Hart, John C.; Carr, Nate; Karneya, Masaki; Tibbitts, Stephen A.; Coleman, Terrance J.; A. Kaufmann and W. Strasser and S. Molnar and B.- O. Schneider
Procedural solid texturing was introduced fourteen years ago, but has yet to find its way into consumer level graphics hardware for teal-time operation. To this end, a new model is introduced that yields a parameterized function capable of synthesizing the most common procedural solid textures, specifically wood, marble, clouds and fire. This model is simple enough to be implemented in hardware, and can be realized in VLSI with as little as 100,000 gates. The new model also yields a new method for antialiasing synthesized textures. An expression for the necessary box filter width is derived as a function of the texturing parameters, the texture coordinates and the rasterization variables. Given this filter width, a technique for efficiently box filtering the synthesized texture by either mip mapping the color table or using a summed area color table are presented. Examples of the antialiased results are shown.
Design of a Fast Voxel Processor for Parallel Volume Visualization
(The Eurographics Association, 1995) Lichtennann, Jan; W. Strasser
The basics of a parallel real-time volume visualization architecture are introduced. Volume data is divided into subcubes that are dis tributed among multiple image processors and stored in their pri vate voxel memories. Rays fall into ray segments at the subcube borders. Each image processor is responsible for the ray segments within its assigned subcubes. Results of the ray segments are passed to the image processor where the ray continues. The enu meration of resampling points on the ray segments and the interpo lation at resampling points is accelerated by the voxel processor. The voxel processor can additionally compute a normalized gradi ent vector at a resampling point used as a surface normal estima tion for shading calculations. In the paper the focus is on operation and hardware implementation of this pipeline processor and the organization of voxel memory. The instruction set of the voxel pro cessor is explained. A performance of 20 images per second for a 2563 voxel volume and 16 image processors can be achieved.
EM-Cube: An Architecture for Low-Cost Real-Time Volume Rendering
(The Eurographics Association, 1997) Osborne, Rändy; Pfister, Hanspeter; Lauer, Hugh; McKenzie, Neil; Gibson, Sarah; Hiatt, Wally; Ohkarni, TakaHide; A. Kaufmann and W. Strasser and S. Molnar and B.-O. Schneider
EM-Cube is a VLSI architecture for low-cost, high quality volume rendering at full video frame rates. Derived from the Cube4 architecture developed at SUNY at Stony Brook, EM-Cube computes sample points and gradients on-the-fly to project 3-dimensional volume data onto 2-dimensional images with realistic lighting and shading. A modest rendering system based on EM-Cube consists of a PC1 card with four rendering chips (ASICs), four 64Mbit SDRAMs to hold the volume data, and four SRAMs to capture the rendered image. The performance target for this configuration is to render images from a 2563 x 16 bit data set at 30 frames/sec. The EM-Cube architecture can be scaled to larger volume data-sets and/or higher frame rates by adding additional ASKS, SDRAMs, and SRAMs. This paper addresses three major challenges encountered developing EM-Cube into a practical product: exploiting the bandwidth inherent in the SDRAMs containing the volume data, keeping the pin-count between adjacent ASICs at a tractable level, and reducing the on-chip storage required to hold the intermediate results of rendering.
High-Quality Volume Rendering Using Texture Mapping Hardware
(The Eurographics Association, 1998) Dachille, Frank; Kreeger, Kevin; Chen, Baoquan; Bitter, Ingmar; Kaufman, Arie; S. N. Spencer
We present a method Jor volume rendering of regular grids which takes advantage of 3D texture mapping hardware currently, available on graphics workstations. Our method products accurate shading for arbitrary and dynamically changing directional lights, viewing parameters, and transfer functions. This is achieved by hardware interpolating the data values and gradients before software classification and shading. The method works equally well for parallel and perspective projections. We present two approaches for OUT method: one which takes advantage of software ray casting optimizations and another which takes advantage of hardware blending acceleration.
Hybrid Volume and Polygon Rendering with Cube Hardware
(The Eurographics Association, 1999) Kreeger, Kevin; Kaufman, Arie; A. Kaufmann and W. Strasser and S. Molnar and B.- O. Schneider
We present two methods which connect today s polygon graphics hardware accelerators to Cube-5 volume rendering hardware, the successor to Cube4 The proposed methods allow mixing of both opaque and translucent polygons with volumes on PC class machines, while ensuring the correct compositing order of all objects. Both implementations connect the two hardware acceleration subsystems at the frame buffer. One shares a common DRAM buffer and one run-length encodes images of thin slabs of polygonal data and then combines them in the Cube composite buffer In both realizations, we take advantage of the predictable ordered access to frame buffer storage that is utilized by Cube-5 and the rest of the family of volume rendering accelerators based on the Cube design.
Memory Access Patterns of Occlusion-Compatible 3D Image Warping
(The Eurographics Association, 1997) Murk, William R.; Bishop, Gary; A. Kaufmann and W. Strasser and S. Molnar and B.-O. Schneider
McMillan and Bishop s 3D image warp can be efficiently implemented by exploiting the coherency of its memory accesses. We analyze this coherency, and present algorithms that take advantage of it. These algorithms traverse the reference image in an occlusion-compatible order, which is an order that can resolve visibility using a painter s algorithm. Required cache sizes are calculated for several one-pass 3D warp algorithms, and we develop a two-pass algorithm which requires a smaller cache size than any of the practical one-pass algorithms. We also show that reference image traversal orders that are occlusion-compatible for continuous images are not always occlusion-compatible when applied to the discrete images used in practice.
Neon: A Single-Chip 3D Workstation Graphics Accelerator
(The Eurographics Association, 1998) McCormack, Joel; McNamara, Robert; Gianos, Christopher; Seiler, Larry; Jouppi, Norman P.; Correll, Ken; S. N. Spencer
High-performance 3D graphics accelerators traditionally require multiple chips on multiple boards, including geometry, rasterizing, pixel processing, and texture mapping chips. These designs are often scalable: they can increase performance by using more chips. Scalability has obvious costs: a minimal configuration needs several chips, and some configurations must replicate texture maps. A less obvious cost is the almost irresistible temptation to replicate chips to increase performance, rather than to design individual chips for higher performance in the first place. In contrast, Neon is a single chip that performs like a multichip design. Neon accelerates OpenGL [19] 3D rendering, as well as X11 [20] and Windows/NT 2D rendering. Since our pin budget limited peak memory bandwidth, we designed Neon from the memory system upward in order to reduce bandwidth requirements. Neon has no special-purpose memories; its eight independent 32-bit memory controllers can access color buffers, 1. depth buffers, stencil buffers, and texture data. To fit our gate budget, we shared logic among different operations with similar implementation requirements, and left floating point calculations to Digital s Alpha CPUs. Neon s performance is between HP s Visualize fx4 and fx6, and is well above SGI s MXE for most operations. Neon-based boards cost much less than these competitors, due to a small part count and use of commodity SDRAMs.
Optimal Depth Buffer for Low-Cost Graphics Hardware
(The Eurographics Association, 1999) Lapidous, Eugene; Jiao, Guofang; A. Kaufmann and W. Strasser and S. Molnar and B.- O. Schneider
3D applications using hardware depth buffers for visibility testing are confronted with multiple choices of buffer types, sizes and formats. Some of the options are not exposed through 3D API or may be used by the driver without application s knowledge. As a result, it becomes increasingly difficult to select depth buffer optimal for desired balance between performance and precision. In this paper we provide comparative evaluation of depth precision for main depth buffer types with different size and format combinations. Results indicate that integer storage is preferred for some buffer types, while others achieve maximal depth resolution with floating-point format optimized for known scene parameters. We propose to give 3D applications full control of the depth buffer optimization by supporting multiple storage formats with the same buffer size and exposing them in 3D API. In the search for a unified depth buffer solution, we describe new type of the depth buffer and compare it with other options. Complementary floating-point Z buffer is a combination of a reversed-direction Z buffer and an optimal floating-point storage format. Non-linear mapping and storage format compensate each other s effect on the depth precision; as a result, depth errors become significantly less dependent on the eye-space distance, improving depth resolution by the orders of magnitude in comparison with standard Z buffer. Results show that complementary Z buffer is also superior to inverse W buffer at any storage size. At 16 and 24 bits/pixel, average depth errors of complementary Z buffer remain 2 times larger than for true W buffer utilizing expensive high-precision per-pixel division. However, it provides absolutely best precision at 32 bits/pixel, when errors are limited by floating-point per-vertex input. Results suggest that complementary floating-point Z buffer can be considered as a candidate for replacement of both screen Z and inverse W buffers, at the same time making hardware investment in the true W buffer support less attractive.
Parallel Texture Caching
(The Eurographics Association, 1999) lgehy, Homan; Eldridge, Matthew; Hanrahan, Pat; A. Kaufmann and W. Strasser and S. Molnar and B.- O. Schneider
The creation of high-quality images requires new functionality and higher performance in real-time graphics architectures. In terms of functionality, texture mapping has become an integral component of graphics systems, and in terms of performance, parallel techniques are used at all stages of the graphics pipeline. In rasterization, texture caching has become prevalent for reducing texture bandwidth requirements. However, parallel rasterization architectures divide work across multiple functional units, thus potentially decreasing the locality of texture references. For such architectures to scale well, it is necessary to develop efficient parallel texture caching subsystems. We quantify the effects of parallel rasterization on texture locality for a number of rasterization architectures, representing both current commercial products and proposed future architectures. A cycle-accurate simulation of the rasterization system demonstrates the parallel speedup obtained by these systems and quantities inefficiencies due to redundant work, inherent parallel load imbalance, insufftcient memory bandwidth, and resource contention. We find that parallel texture caching works well, and is general enough to work with a wide variety of rasterization architectures.
PixelFlow: The Realization
(The Eurographics Association, 1997) Eyles, John; Molnar, Steven; Poulton, John; Greer, Trey; Lastra, Anselmo; England, Nick; Westover, Lee; A. Kaufmann and W. Strasser and S. Molnar and B.-O. Schneider
PixelFlow is an architecture for high-speed, highly realistic image generation, based on the techniques of object-parallelism and image composition. Its initial architecture was described in [MOLN92]. After development by the original team of researchers at the University of North Carolina, and codevelopment with industry partners, Division Ltd. and Hewlett- Packard, PixelFlow now is a much more capable system than initially conceived and its hardware and software systems have evolved considerably. This paper describes the final realization of PixelFlow, along with hardware and software enhancements heretofore unpublished.
Texture Shaders
(The Eurographics Association, 1999) McCool, Michael D.; Heidrich, Wolfgang; A. Kaufmann and W. Strasser and S. Molnar and B.- O. Schneider
Extensions to the texture-mapping support of the abstract graphics hardware pipeline and the OpenGL API are proposed to better support programmable shading, with a unified interface, on a variety of future graphics accelerator architectures. Our main proposals include better support for texture map coordinate generation and an abstract, programmable model for multitexturing. As motivation, we survey several interactive rendering algorithms that target important visual phenomena. With hardware implementation of programmable multitexturing support, implementations of these effects that currently take multiple passes can be rendered in one pass. The generality of our proposed extensions enable efficient implementation of a wide range of other interactive rendering algorithms. The intermediate level of abstraction of our API proposal enables high-level shader metaprogramming toolkits and relatively straightforward implementations, while hiding the details of multitexturing support that are currently fragmenting OpenGL into incompatible dialects.
Triangle Scan Conversion using 2D Homogeneous Coordinates
(The Eurographics Association, 1997) Olano, Marc; Greer, Trey; A. Kaufmann and W. Strasser and S. Molnar and B.-O. Schneider
We present a new triangle scan conversion algorithm that works entirely in homogeneous coordinates. By using homogeneous coordinates, the algorithm avoids costly clipping tests which make pipelining or hardware implementations of previous scan conversion algorithms difficult. The algorithm handles clipping by the addition of clip edges, without the need to actually split the clipped triangle. Furthermore, the algorithm can render true homogeneous triangles, including external triangles that should pass through infinity with two visible sections. An implementation of the algorithm on Pixel-Planes 5 runs about 33% faster than a similar implementation of the previous algorithm.

Browse

Browsing High-Performance Graphics | EGGH: SIGGRAPH/Eurographics Workshop on Graphics Hardware by Subject "1.3.7 [Computer Graphics]"

Results Per Page

Sort Options