3D Object Oriented Graphics

None of the many standard 3D graphics libraries (like GL or PHIGS) considered for this project proved usable. None of them, for example, can handle text objects in a meaningful way: they require a separate system to render text and pictures into a bitmap, which can then be used as a texture on a 3D model. This is hardly practical in an interactive application. Performance turned out to be another serious limitation. With fast enough CPUs or dedicated graphics accelerators, most of these libraries can even work in real time. But none of them are scalable - able both to run on low end machines and to make full use of future computers.

So a new technology had to be developed for Merlin: Adaptive Rendering. It is described below and generates real-time images of the scene's objects.

The heart of the Adaptive Rendering technology is the "Reverse Painter's Algorithm". In the normal painter's algorithm, all objects are transformed into screen coordinates and then sorted by the z dimension (the distance to the viewer). The furthest objects are painted first, in order, until the closest is painted last, on top of all the previous ones. The name alludes to the fact that a painter might start with the sky, then do the mountains, follow that with the closer trees and add the people last of all. Any given area of the canvas might get painted several times, but it doesn't matter, for the final result looks right. A popular alternative is the z-buffer algorithm, where the distance of the closest object painted so far at each pixel is stored in a memory area called the "z-buffer". Each time a pixel is about to be painted, its distance is compared with the one stored in the buffer, and only if it is smaller are the screen and the z-buffer changed. Objects can then be painted in any order and the results will be the same, at the cost of extra memory and per-pixel comparisons.
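The two classic approaches can be contrasted in a few lines. The following Python sketch uses a one-dimensional "screen" and treats a polygon as a (depth, span, color) triple; all names and data are illustrative, not part of Merlin:

```python
WIDTH = 8

def painters(polys):
    """Painter's algorithm: sort back to front, paint everything."""
    screen = [None] * WIDTH
    for depth, (x0, x1), color in sorted(polys, key=lambda p: -p[0]):
        for x in range(x0, x1):
            screen[x] = color          # closer polygons overwrite later
    return screen

def zbuffer(polys):
    """Z-buffer: any order, keep the closest depth seen at each pixel."""
    screen = [None] * WIDTH
    zbuf = [float("inf")] * WIDTH
    for depth, (x0, x1), color in polys:
        for x in range(x0, x1):
            if depth < zbuf[x]:        # only paint if closer than before
                zbuf[x] = depth
                screen[x] = color
    return screen
```

Both produce the same image; the painter's algorithm pays with overdraw, the z-buffer with extra memory and a comparison per pixel.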

In the reverse painter's algorithm, objects are painted from the closest to the furthest, but each pixel must be checked to see that it hasn't already been painted before it is allowed to change. This seems to combine the worst features of the painter's algorithm with those of the z-buffer algorithm (which might explain why it has never been very popular), but that is only true of a naive implementation in which each frame of an animation (or interactive application) is rendered separately.

A more practical implementation is based on the notion of "damage regions", which is widely used in 2D GUIs. A region can be represented by a bitmap, a set of rectangles or, in this case, by a list of trapezoids. Each trapezoid is horizontally aligned and can be described by six numbers: y1, x1start, x1stop, y2, x2start and x2stop. It is possible to add two regions to each other, subtract one from the other or intersect two regions to obtain a new one. In the general case, this "region math" is rather costly.
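To make the region math concrete, here is a hedged sketch in Python using axis-aligned rectangles, which are the degenerate case of the trapezoids described above (edges with zero slope). The function names and the (y1, y2, x1, x2) layout are illustrative, not Merlin's actual API:

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]        # (y1, y2, x1, x2), half-open

def intersect(a: List[Rect], b: List[Rect]) -> List[Rect]:
    """Pairwise clip: a region is just a list of disjoint rectangles."""
    out = []
    for ay1, ay2, ax1, ax2 in a:
        for by1, by2, bx1, bx2 in b:
            y1, y2 = max(ay1, by1), min(ay2, by2)
            x1, x2 = max(ax1, bx1), min(ax2, bx2)
            if y1 < y2 and x1 < x2:
                out.append((y1, y2, x1, x2))
    return out

def subtract(a: List[Rect], b: List[Rect]) -> List[Rect]:
    """Remove b from a by splitting each rectangle into up to 4 pieces."""
    for by1, by2, bx1, bx2 in b:
        next_a = []
        for y1, y2, x1, x2 in a:
            if y2 <= by1 or y1 >= by2 or x2 <= bx1 or x1 >= bx2:
                next_a.append((y1, y2, x1, x2))   # no overlap: keep whole
                continue
            if y1 < by1: next_a.append((y1, by1, x1, x2))   # piece above
            if by2 < y2: next_a.append((by2, y2, x1, x2))   # piece below
            my1, my2 = max(y1, by1), min(y2, by2)
            if x1 < bx1: next_a.append((my1, my2, x1, bx1)) # piece left
            if bx2 < x2: next_a.append((my1, my2, bx2, x2)) # piece right
        a = next_a
    return a
```

The quadratic pairwise loops are what makes general region math "rather costly"; a production implementation would keep the trapezoid lists sorted by y to do better.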

The reverse painter's algorithm with regions goes something like this:

  r: screen asRegion.
  p: scene allPolygonsSortedFrontToBack.
  [r isEmpty or: [p atEnd]] whileFalse: [
     poly: p next.
     toDraw: r intersect: poly asRegion.
     screen paint: poly In: toDraw.
     r: r - toDraw.
  ].
  r isEmpty ifFalse: [screen paint: background In: r].

Here the check for what has been painted before is done on a region basis, not pixel by pixel. Even so, a paint operation involves both a setup time and an execution time, so replacing the one big 'paint:In:' used in the z-buffer or normal painter's algorithm with several smaller ones might actually slow rendering down, even if the total number of pixels painted is greatly reduced. And there is the extra cost of region math, already mentioned above. But things change significantly when we consider rendering a second frame of the same scene. Instead of initializing 'r' to include the whole screen, we can start with a region that is the union of the positions occupied in the previous frame by all polygons that have changed (moved, changed color, etc.) plus the positions of these same polygons in the current frame. In the typical case (for a 3D GUI, not something like a flight simulator) this second frame (and all the following ones, for some time) will be rendered in a fraction of the time needed for the first one, outperforming the other algorithms by a significant margin.
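The loop and the incremental damage seeding can be transliterated into Python. This sketch deliberately represents a region as a naive set of pixel coordinates so that the control flow, not the region math, is what it demonstrates; all field names are illustrative stand-ins for Merlin's objects:

```python
def render(screen, polys_front_to_back, damage):
    """Reverse painter's algorithm over a damage region."""
    r = set(damage)                          # pixels still needing paint
    for poly in polys_front_to_back:         # closest polygon first
        if not r:
            break
        to_draw = r & poly["pixels"]         # r intersect: poly asRegion
        for p in to_draw:
            screen[p] = poly["color"]        # screen paint: poly In: toDraw
        r -= to_draw                         # r: r - toDraw
    for p in r:                              # leftover damage: background
        screen[p] = "background"

def damage_between_frames(changed_polys, previous_pixels):
    """Seed the next frame's damage region: the union of each changed
    polygon's old position and its new one."""
    d = set()
    for poly in changed_polys:
        d |= previous_pixels[poly["id"]]     # where it was last frame
        d |= poly["pixels"]                  # where it is now
    return d
```

For the first frame the damage region is the whole screen; from then on only the areas returned by `damage_between_frames` need to pass through the loop, which is where the speedup over the other algorithms comes from.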

The Warping Renderer

With Adaptive Rendering, the above process is actually repeated several times with different implementations of the 'paint:In:' method. While the loop shown above is chipping away at the damage region 'r', the user might be interacting with objects and causing new areas to be added to it (when this happens, the polygon list 'p' must be reset to the beginning). This interplay between the renderer consuming the damage region and the GUI adding to it means that 'r' grows or shrinks depending on the CPU's processing power, the current workload and whether the objects the user sees are changing or not.

The first renderer in the system is called the "warping renderer" because it paints its polygons as an affine transformation of the pixels present in the previous frame. If the polygon contains new pixels (it was previously covered by another, for example) then that area of the screen is simply painted with a solid color. This tends to "break up" the scene by accumulating two kinds of errors in the rendered images as time goes on, but since people are not good at seeing details in rapidly changing views it isn't too much of a problem (and this allows even a 386 machine with a dumb frame buffer and a loaded CPU to get real time results). Whenever the warping renderer actually gets to the end of its loop, completely emptying its damage region, it calls the next renderer (called the Simple Scanline Renderer) with a damage region that is the union of all the areas it redrew since the last time it called it. The damage regions for all subsequent renderers have each area tagged with the polygon that was drawn in it, so only the first renderer must actually loop through the sorted polygon list.
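The warping idea is to re-use last frame's pixels through an affine map instead of re-rendering them. This minimal sketch warps a whole frame with a single inverse transform and fills holes with a solid color; a real warping renderer would apply a separate transform per polygon, and all names here are assumptions for illustration:

```python
def warp(prev, width, height, affine, fill):
    """For each destination pixel, sample the previous frame at the
    inverse-mapped source position; fall back to a solid fill color
    where no source pixel exists (newly exposed areas)."""
    a, b, tx, c, d, ty = affine             # 2x3 inverse affine matrix
    out = {}
    for y in range(height):
        for x in range(width):
            sx = int(a * x + b * y + tx)    # source pixel this one came from
            sy = int(c * x + d * y + ty)
            out[(x, y)] = prev.get((sx, sy), fill)
    return out
```

Since the sampling is nearest-neighbor and the solid fill is only an approximation of the newly exposed geometry, errors of both kinds accumulate frame after frame, which is exactly the "breaking up" described above.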

The Simple Scanline Renderer

The second renderer in the adaptive chain is the traditional backend of the 3D pipeline. It breaks the trapezoids to be painted into a set of scanlines and then "walks" each one after transforming it into texture coordinates. Finding the color for the pixels is a simple table lookup in some cases, but most of the time it is much more involved, as an arbitrary set of Morphs can be used as the polygon's (actually the bodyMorph's) "texture". Once the morph corresponding to the pixel is found, that morph's shader is queried for a simple color. So the scanline renderer effectively generates an accurate, but flat shaded, scene.
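Walking one trapezoid can be sketched in a few lines, using the six-number form given earlier (y1, x1start, x1stop, y2, x2start, x2stop). The edge interpolation below is linear in screen space with no perspective correction, and the "texture" is a one-dimensional lookup table standing in for the much richer Morph lookup; both simplifications are assumptions made for illustration:

```python
def scan_trapezoid(y1, x1start, x1stop, y2, x2start, x2stop, texture, put_pixel):
    """Break a horizontally aligned trapezoid into scanlines and walk each
    one, looking the color up in a simple 1-D texture table."""
    height = y2 - y1
    for y in range(y1, y2):
        t = (y - y1) / height                     # 0 at top edge, ->1 at bottom
        left = x1start + t * (x2start - x1start)  # interpolate both side edges
        right = x1stop + t * (x2stop - x1stop)
        for x in range(round(left), round(right)):
            u = (x - left) / max(right - left, 1e-6)        # texture coordinate
            texel = texture[min(int(u * len(texture)), len(texture) - 1)]
            put_pixel(x, y, texel)
```

In the real renderer the lookup step is where a morph is found for the pixel and its shader queried for a color, but the scanline walking around it has this shape.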

The simplest shader is just the paint object from Self 4.0, and most objects will use that one. If a more sophisticated shader is found when painting a polygon, that region is added to the damage region for the next renderer (possibly even the next two renderers) in the chain.

The Shader Renderer

The next stop is the Shader Renderer. It works exactly like the previous ones except that when it finds a shader it gives it all the information it needs to calculate illumination models, bump mapping, etc. Note that this might take a while, but this renderer is never called for the full screen, only for special objects.

The Ray Tracer

The fourth renderer is simply an implementation of the ray tracing algorithm. Like the previous renderer, it is only invoked for objects that really need it (those with transparency or reflections). Unlike in a normal ray tracer, where the first step is finding out which object intersects the ray corresponding to each pixel, that object (really a polygon) is already known thanks to the work of the scanline renderer.
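The shortcut can be sketched as follows: shading starts directly at the known hit polygon, and only secondary (reflected) rays pay for any lookup. The dictionary "scene", the field names, and the 50% color mix are toy stand-ins for a real ray intersection and illumination model, not Merlin's actual data structures:

```python
def mix(c1, c2):
    """Average two RGB colors (a crude 50% reflection)."""
    return tuple((a + b) // 2 for a, b in zip(c1, c2))

def trace_pixel(hit, scene, depth=0):
    """Shade a pixel whose primary hit polygon is already known, so no
    intersection search is needed for the primary ray."""
    color = hit["color"]
    target = hit.get("reflects")
    if target is not None and depth < 2:
        # Only the secondary ray needs a lookup; a real tracer would
        # intersect the reflected ray against the scene here.
        bounce = scene.get(target)
        if bounce is not None:
            color = mix(color, trace_pixel(bounce, scene, depth + 1))
    return color
```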

Scalable Performance

On a very low end machine, the GUI will still feel very snappy even though it will be "messed up" when the user is changing things. When the scene stops changing, the ugly areas will slowly start being cleaned up. If the user is patient enough, even the ray tracer will have a chance to kick in and create an image that is a joy to behold. The exact same software would have a very different effect on a future high end machine (a quad CPU 64 bit machine, for example) - the warping renderer would always win its tug-of-war with the mouse and call the scanline renderer for every single frame. On a fast enough machine, even the shader renderer might start working full time. Someday real time ray tracing might become a possibility - this graphics library is ready for that.

A previous design of the graphics model was very similar to a software-only implementation of Microsoft's proposed Talisman. It needed separate rendering buffers for each object and then would compose these on the screen as a separate step. This would waste too much memory on low end systems, so the current design limits itself to the frame buffer memory only. It does not even suppose that there is enough memory on the graphics card for double buffering, even though that would greatly simplify the warping renderer.

please send comments to jecel@lsi.usp.br (Jecel Mattos de Assumpcao Jr)