Renderer design

I wasn’t sure where to post this, so forgive me if this is the wrong place.

I have completed my first pass at a renderer traversing a standard Java3d scenegraph. Not all nodes are implemented, but most are. This toy renderer traverses the scene and makes LWJGL OpenGL calls in place. Simple scenes are very fast, but as expected this drops off quickly as the complexity increases.

Now that things seem to be working smoothly I am designing a better rendering pipeline and wanted to see if it seems to make sense.

One of the things you gain when traversing a scenegraph is that you can use the OpenGL matrix stack to handle your transformations. The problem is that the scenegraph is really there to support the logical view of the data and is not optimized for rendering. I need to re-order the way that geometry is sent to the card so that I can minimize state changes.

So my basic idea is to do the following:

  1. Traverse the scene and, for transform groups, calculate the local-to-vworld transform, multiplying them down the tree. I would only recalculate this if the transform or a parent transform has been marked dirty by a change since the last render (a sketch of steps 1 and 2 appears after this list).

  2. As I get to nodes I would translate their bounding spheres using the node's local-to-vworld transform. This shifts the node bounds in the SphereTree (which is extremely efficient for incremental changes).

  3. Traverse the sphere tree and perform frustum culling, skipping any nodes which are outside the frustum planes. Keep track of the current states via stacks. So if I hit a fog node that applies from that location down, I would push the fog node on the fog stack. When I come back up through the node I would pop it off.

  4. When I get to a node that is to be rendered I would package it up with its shaders (current state of appearance + fog + lights, etc.). This pushes a package onto the render bin which has everything needed to render.

Some nodes can also be view specific geometry generators, including BSP generators which can produce their own packages for rendering. These would be triggered and they would produce shader+geometry packages into the render bin.

  5. Special nodes called shadow occluders can be added by users into a branch group to represent the occluder for the “model” which exists below it in the scenegraph. These are pushed into a special queue in the renderer. The geometry for the occluder can be supplied, or it can be built on the fly (and cached) from all the geometry in the descendant nodes.

  6. Some of the packages which are placed into the render bin can contain other packages. Examples of this are decal groups and ordered groups.

  7. The render bin is sorted by shader composition to minimize state changes. Transparent shaders and their nodes are placed into a second bin for a later pass.

  8. Render bins are submitted to the geometry pipeline, which: 1) steps through each package and changes only the delta state (including doing a glLoadMatrix for the transform if there is a change); 2) manages vertex arrays + texture bindings; 3) renders geometry.

  9. The first pass would be the opaque render bin; then each package container (ordered groups, decal groups) would be sorted and sent to the geometry pipeline.

  10. Shadows would be rendered using z-buffer shadow volumes (Carmack's reverse z-pass/z-fail).

  11. The transparent package bin is then processed. Specially flagged packages would be combined and broken up by polygons into mini-chunks. Regular packages will be added in their larger chunks. This bin is then sorted by distance to the view and submitted to the geometry pipeline.

  12. The last step would be to render overlays.
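
To make steps 1 and 2 concrete, here is a rough sketch of the dirty-transform traversal. It is only illustrative; the type and method names (Node, TransformGroup, BoundingSphere, SphereTree and friends) are assumptions, not the actual implementation:

void updateTransforms(Node node, Matrix4f parentToVworld, boolean parentDirty) {
    boolean dirty = parentDirty || node.isTransformDirty();
    Matrix4f localToVworld = parentToVworld;
    if (node instanceof TransformGroup) {
        TransformGroup tg = (TransformGroup) node;
        if (dirty) {
            // localToVworld = parentToVworld * localTransform, multiplied down the tree
            tg.getLocalToVworld().mul(parentToVworld, tg.getTransform());
            tg.clearTransformDirty();
        }
        localToVworld = tg.getLocalToVworld();
    }
    if (dirty && node.hasBounds()) {
        // step 2: shift the node's bounding sphere into vworld coordinates
        BoundingSphere world = node.getWorldBounds();
        world.set(node.getLocalBounds());
        world.transform(localToVworld);
        sphereTree.refit(node, world);  // cheap incremental update of the SphereTree
    }
    for (Node child : node.getChildren()) {
        updateTransforms(child, localToVworld, dirty);
    }
}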

I also plan to add more advanced shaders than just the standard appearance. Packages flagged for dynamic per-pixel lighting can have their vertex colors encoded on the fly with the light vector, so we can apply 3D lightmaps + bump mapping + cube-map normalization, etc.

I am designing this to be as generic as possible, with plugins into the rendering pipeline and graphics pipeline to allow different techniques to be used. It should also be possible to add an occlusion culling plugin, so that different techniques can be used for late-pipeline checking (e.g. Zhang's hierarchical z-pyramid) or even an early-pipeline PVS check.
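
As a sketch of what such a plugin point might look like (the interface and method names here are hypothetical, not the real API):

public interface OcclusionCuller {
    // early-pipeline hook, e.g. a PVS lookup from the current cell
    void beginFrame(View view);

    // late-pipeline hook; return false to drop a node/package before submission
    boolean isPotentiallyVisible(BoundingSphere worldBounds);
}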

Are there any glaring holes in the basic outline I have described?

Once this engine is done (good enough to run Magicosm on) I will release it for any use and eventually open it up for other developers to contribute. At worst it should be an example others can learn good and bad things from.

From the little I know, I read through and nothing jumped out.

Just like to say thanks for going to this trouble, and better still keeping the “community” involved. Really looking forward to seeing the new screenshots of COSM based on your new renderer.

Kev

All waaaay over my head…

Cas :slight_smile:

I have just pressed reset during typing this and I had to retype… grrrrr.

Here is my render-order list:



1. Init frame time/number
2. Clear buffers
3. Run priority behaviors (input/camera), update camera and frustum
4. Run secondary behaviors (fill dirty transform/bound nodes), perform auto-bounds geometry updates, schedule manual-bounds geometry updates
5. Recompute dirty transforms and bounds (hierarchy traversal)
6. Cull nodes and fill to-draw list (hierarchy traversal)
7. Sort to-draw list: opaque, transparent-order-independent, transparent-order-dependent (list traversal)
8. Perform manual-bounds geometry updates and draw sorted to-draw list in parallel (list traversal)

I have not looked at shadows yet, so it is not included. Generally, the order is similar - there are only a few ways to do a scenegraph renderer. From what I see, the main thing missing from your list is behavior/geometry update handling. On one hand, you need to process geometry updates early to be able to use the new bounds for culling; on the other hand, you want to parallelize them with rendering so as not to starve the GPU.

I’m thinking about two tricks here:

  1. Split priority and secondary behaviours. Only priority behaviours can modify the view/camera - but they are always run, regardless of distance/frustum. On the other hand, secondary behaviours can be distance/frustum dependent (and thus culled) - but by then the camera is already set in stone (see the sketch after this list).

  2. Split manual-bounds-compute (mbc) and auto-bounds-compute (abc) geometry updates. With mbc I will require code to set the possibly updated bounds in the behavior, well before the geometry update itself is executed. In most cases, bounds will be set big enough at the start and not modified later, I suppose. This will allow geometry updates to be performed in parallel with rendering, using some kind of fences (if they are needed at all). On the other hand, abc updates need to happen before culling, because the new bounds need to be calculated and used there (to avoid culling objects that have grown during the update and should be visible). Anyway, I do not expect anybody (this means me) to use dynamic geometry with auto-bounds-compute :slight_smile:
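
To illustrate the priority/secondary split from point 1, the frame could start something like this (all the names here are invented for the example):

// priority behaviours: always run, may move the camera
for (Behavior b : priorityBehaviors) {
    b.process(frameTime);
}
view.updateFrustum();   // camera is now fixed for this frame

// secondary behaviours: the camera is set in stone, so these can be
// scheduled (culled) by distance/frustum
for (Behavior b : secondaryBehaviors) {
    if (view.intersects(b.getSchedulingBounds())) {
        b.process(frameTime);
    }
}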

The biggest problem is state sorting. Identical appearances can be put together - but what about ones which are a bit different? Texture changes are probably the most expensive - but what about geometry changes? Reusing an already loaded vertex buffer can also be a big gain. If we have the same geometry/appearance, in what order should the rest of the changes be put? Is it really worth sorting in a complicated way for anything except textures/geometry? Texture sorting is also non-trivial - it would be great to detect that only 1 texture unit has changed while the rest stay the same, and cluster such objects, as opposed to looking at all textures at once.

I’m considering the following trick - maybe you will find it useful. Each appearance component will implement the following interface:


public interface StateAttributes
{
    public void makeCurrent(RenderingContext ctx, StateAttributes previous);
} 

On a state change, the renderer will call makeCurrent on any appearance component which has changed compared to the last state. Every component will also have a 'default' object. So, for example, turning on custom PointAttributes will look like this:

customPA.makeCurrent(ctx,defaultPA);

changing them to different ones

otherPA.makeCurrent(ctx,ctx.getState().getPointAttributes());

and going back to default

defaultPA.makeCurrent(ctx,ctx.getState().getPointAttributes());

Inside PointAttributes, the code can look like this (I'm typing this from memory, so there may be some errors):


public void makeCurrent(RenderingContext ctx, StateAttributes previous)
{
  GL gl = ctx.getGl();
  PointAttributes old = (PointAttributes) previous;
  if ( old.getSize() != getSize() ) {
    gl.glPointSize(getSize());
  }
  if ( old.isAntialiased() != isAntialiased() ) {
    if ( isAntialiased() ) {
      gl.glEnable(GL.GL_POINT_SMOOTH);
    } else {
      gl.glDisable(GL.GL_POINT_SMOOTH);
    }
  }
  // ... same for attenuation/fade/minsize/maxsize
}

There will also be methods for comparing states inside StateAttributes, but I have no idea at the moment how exactly they should work. I'm thinking about returning a signed weight of the required changes, which would later be multiplied by a constant for the given StateAttribute and used directly for sorting - but it is not trivial to define such a function so that it stays bidirectional.
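
One possible shape for that comparison method - just a sketch, the method name and weights are made up:

public interface StateAttributes
{
    public void makeCurrent(RenderingContext ctx, StateAttributes previous);

    // estimated cost of switching from 'other' to this state; 0 = identical.
    // For sorting we would want changeCost(a,b) == changeCost(b,a),
    // which is exactly the bidirectionality that is hard to guarantee.
    public int changeCost(StateAttributes other);
}

The sort key for a whole appearance would then be the sum, over its components, of a per-attribute weight constant times changeCost.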

I would just like to add that I think it's great that you're doing this, David, and I'm looking forward to the public release. This provides a much-needed direction in the vacuum that has been created by the freeze of Java3D and the uncertainty of what might come on top of JOGL.

Thanks for everyone's feedback! Over the long weekend I wrote the skeleton based on the new design. One improvement over what is above is that I separated the renderer layer into two pieces, with the bottommost being 3 interfaces: RenderPeer, ShaderPeer and RenderAtomPeer. Implementations of these interfaces should be the only places where pure 3D API calls are actually made. This "should" allow the bottom layer to be LWJGL, JOGL or even Java3D (immediate mode). You can create a new Shader by implementing the Shader interface and the ShaderPeer interface and then registering the two in the renderer.
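
Just to give a feel for the split, the peer layer might be shaped roughly like this - these are not the final signatures, only an illustration:

public interface RenderPeer {
    void beginFrame(View view);
    void endFrame();
}

public interface ShaderPeer {
    // apply only the delta between the previous shader state and this one
    void makeCurrent(Shader shader, RenderContext ctx);
}

public interface RenderAtomPeer {
    // the only place where actual vertex-array/draw calls are issued
    void render(RenderAtom atom, RenderContext ctx);
}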

Another optimization I have worked in is something called a StateMap. StateMaps can be used by shaders and other discrete renderable items to create a unique map of instances. For example, when a RenderAtom comes in, and before it is sorted into the RenderBin, it is processed with StateMappers. So when the AppearanceShader is processed, it in turn processes its NodeComponents (material, rendering attributes, etc.) by passing them to the corresponding StateMap.

The StateMap holds a balanced tree where the key is the actual StateTrackable item (which is comparable). The first time we encounter a new StateTrackable, a StateNode is placed into the tree. The StateNode contains a unique id, a reference count and a pointer to a copy of the StateTrackable item. The StateTrackable item is then flagged with the StateNode so that the next time it is seen (if it is not dirty) there is no reason to look it up. This means that from that point forward sorting is done on the state id, not on the state contents. This is also nice for the final rendering stages, since the renderer can track the current context using only the ids and change state only when the ids change.

When an identical material (but a different instance) is checked against the tree it will find a match and then be flagged with that StateNode. The next time it comes in to be processed, if it is not marked as dirty it will retain the previously assigned StateNode and therefore keep the same id. So 99 percent of the time the state ids will be the same, and for that matter many entire render bins will remain the same, allowing us to exploit frame-to-frame coherency.
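
In code, the lookup might look something like this - a sketch only, based on the description above; the real classes may well differ:

import java.util.TreeMap;

public class StateMap {
    // balanced tree keyed by the comparable StateTrackable contents
    private final TreeMap<StateTrackable, StateNode> tree = new TreeMap<StateTrackable, StateNode>();
    private int nextId = 0;

    public StateNode map(StateTrackable item) {
        StateNode cached = item.getStateNode();
        if (cached != null && !item.isDirty()) {
            return cached;                     // the 99 percent case: no lookup at all
        }
        StateNode node = tree.get(item);       // search by contents
        if (node == null) {
            StateTrackable copy = item.copy(); // frozen snapshot used as the key
            node = new StateNode(nextId++, copy);
            tree.put(copy, node);
        }
        node.addRef();
        item.setStateNode(node);               // flag the instance for next time
        item.clearDirty();
        return node;                           // sorting now compares node ids only
    }
}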

Anyway, things are cruising along; no major problems so far.

Do you retain some Java3D compatibility?

One thing we want to make clear is that we are building a gaming version of the Java3d API, not a pure port. Java3d is a wonderfully rich and powerful scenegraph which can meet many different needs, from scientific visualization to CAD/CAM, etc. With this generality comes a cost, however.

So we have quite a lot of compatibility with the Java3d API. (BTW, is it legally allowed to use the same method signatures, etc., or do they represent protected information?) We are trying to minimize the work in porting from Java3d to our engine. So all the familiar scene graph nodes are there, etc.

But we are concentrating on rendering in particular. This first version will have no behavior support for example.

The biggest change architecturally is at the top of the scene. We are doing away with view platform and attaching the view transformation right to the view.

We are also, for the sake of performance, making the scenegraph non-reentrant (non thread safe).

So the way it works is:

  1. Read Inputs
  2. Update scenegraph
  3. Render scene
  4. Goto step 1.

Now the “render scene” part does not include waiting for OpenGL to finish; it means that all the RenderAtoms have been sent to the OpenGL driver through the low-level API.
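
In code the loop is nothing more than this (a sketch; the method names are illustrative):

while (running) {
    input.poll();            // 1. read inputs
    updateSceneGraph();      // 2. the only place the graph is mutated
    renderer.renderScene();  // 3. returns once every RenderAtom has been
                             //    issued to the driver; no wait for the GPU
}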

By using this approach we do not need the famous “retained” copy stored alongside the scene. It also allows geometry generators (like BSP) to generate in-line rather than building a geometry array and sending it off to be processed.

For our purposes we will not be using AWT or Swing at all, giving us the fastest flat out speed we can do (hopefully).

I’m really glad you are doing this- from what I can tell you probably know a good deal more about using Java3D for games than anyone. As soon as it gets down to the level where I have some chance of being useful I’ll be more than happy to offer any assistance I can :slight_smile:

The renderer design looks OK for us, we only use one single behavior now to drive the whole game. This could easily be achieved with a gameloop as well.

LWJGL would be of little interest due to its restrictions; JOGL would be the choice.

But more important: we should try to motivate the authors of the loaders to port their stuff. Without loaders (in our case John Wright's 3DS loader), the gap between artwork and game is still quite big.

BTW, David, I’m deeply impressed by the speed you make things happen and how people are willing to follow.

I'm trying a bit different route than David, but maybe it is worth considering as far as loaders are concerned. I have defined umpteen interfaces for scenegraph objects, plus a few factories for creating them. All the rest of the internals are hidden inside implementation classes. In theory, this should allow writing a generic loader which would work with any renderer implementing these interfaces. I'm trying to design them in a way that they are not really dependent on the underlying structures.
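
For example, the loader-facing factory might look something like this (illustrative names only, not my actual interfaces):

public interface SceneGraphFactory {
    Group createGroup();
    TransformGroup createTransformGroup();
    Shape3D createShape(Geometry geometry, Appearance appearance);
    Appearance createAppearance();
    // ...one factory method per scenegraph object type
}

A loader written only against such interfaces could then run unchanged on any renderer that supplies a factory.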

Do you think it would be possible to agree on some kind of interfaces which would abstract specific renderer details? I'll try to post my proposal later today, to have some base to comment on.

Hm, renderer and loaders should have no contact anyway. There is the engine (scenegraph) between them.

Great progress, sorry I’ve been out, was away for a week on vacation! :slight_smile:

We are still doing similar development, and similarly plan to release something working when we have a decent demo.
After that we will probably rework what we have and perhaps can come to a more common approach for the scene graph renderer with the community.

So far, your description sounds quite good, if not overbuilt for how we would like to see a renderer.
We are going for a more modular approach: for example, a simple drawlist system that supports adds/caches/inserts based on a render attribute hash or similar, and then a simple loop to render single-transform objects, which allows for shader execution as well as various other “pieces”.
The traverser walks the graph, flattening transform stacks and doing sphere-bounds culling as well, but it does not use OGL pushes and pops, since its output is flattened transforms and geometry data (a classic graphics object) placed into the draw list.
The drawlist renderer will push and pop any object that has a transform but never works on a hierarchy. This avoids any issues with the 32-entry transform stack limit in OGL. It also allows any other drawlist generation system to be used instead of a graph traversal, etc. A rough sketch of such a drawlist follows.
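
Here is roughly the shape of it - all the names below are made up for illustration, not our actual code:

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DrawList {
    // entries bucketed by a render-attribute hash so equal state draws back-to-back
    private final Map<Integer, List<DrawEntry>> buckets = new HashMap<Integer, List<DrawEntry>>();

    public void add(Matrix4f flatTransform, Geometry geom, RenderAttributes attrs) {
        Integer key = Integer.valueOf(attrs.stateHash());
        List<DrawEntry> bucket = buckets.get(key);
        if (bucket == null) {
            bucket = new ArrayList<DrawEntry>();
            buckets.put(key, bucket);
        }
        bucket.add(new DrawEntry(flatTransform, geom, attrs));
    }

    public void render(GL gl) {
        for (List<DrawEntry> bucket : buckets.values()) {
            bucket.get(0).attrs.apply(gl);       // one state setup per bucket
            for (DrawEntry e : bucket) {
                gl.glPushMatrix();               // transforms are pre-flattened,
                gl.glMultMatrixf(e.transform.toArray(), 0);
                e.geom.draw(gl);
                gl.glPopMatrix();                // so stack depth never exceeds 1
            }
        }
    }
}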

We want to create a render toolkit as opposed to a render “engine” so it can be assembled rather than inherited or rewritten to do app specific render execution.

Still, the discussion is great!

Keep it up, David and Shawn and abies! Although, personally, I am of two minds on this discussion:

Mind #1: I am really happy that there’s several groups who are all working very hard on filling the void left by the Java3D “stall”.

Mind #2: I am disappointed that there’s now going to be at least 4 (at last count) different scene graph APIs that are all somewhat incompatible with each other and with Java3D.

One of the big reasons our team went with Java3D in the first place was because it was a standardized API that people could add content-specific modules to (e.g. Loaders), and I didn’t have to worry about how compatible they were, and how much work I would have to throw away. Plus I knew it was supported by Sun. Now it looks like a bunch of re-work might be coming in the near future.

Oh, well… :-/

Currently, most of our 3D programming work is on the back burner while we wait for the 3D scenegraph API shuffle to settle out. We’re mostly working on the AI system and the 2D stuff in the meantime, which for our game is just as important as the 3D stuff.

Question for Dave:
Are you using joal and jinput, or are you using a different set of packages for audio and input? Or is that something that is up to the user to hook in? thx!

[quote]Keep it up, David and Shawn and abies! Although, personally, I am of two minds on this discussion:

Mind #1: I am really happy that there’s several groups who are all working very hard on filling the void left by the Java3D “stall”.

Mind #2: I am disappointed that there’s now going to be at least 4 (at last count) different scene graph APIs that are all somewhat incompatible with each other and with Java3D.

One of the big reasons our team went with Java3D in the first place was because it was a standardized API that people could add content-specific modules to (e.g. Loaders), and I didn’t have to worry about how compatible they were, and how much work I would have to throw away. Plus I knew it was supported by Sun. Now it looks like a bunch of re-work might be coming in the near future.

Oh, well… :-/
[/quote]
I believe that the developers of the J3D-like renderers/systems understand this completely.
We are trying to make sure that the renderers will be able to operate on a J3D-compatible graph, so that as many existing loaders etc. as possible will still be usable, and that replacement/integration with existing game apps will be as smooth as possible.
It is true that if you use A LOT of J3D's other functionality (picking, collision, behaviors, sound) then you will probably not be able to switch anytime soon.
However, as more projects face this, I imagine (hope!) that those pieces may start to appear as well.

Personally, we only used the renderer of J3D in all our projects, and less and less of the other features in each later project. Most people wrote their own collision support, and the behavior model was always a problem; most only used WakeupOnElapsedFrames(0), which will port easily to our renderers.

But keep on voicing those opinions to keep us all on track!

We are using picking and lots of behaviors, which would all have to be merged into one big uber-behavior if we were to try to port now. We're also using Java3D sound, but have had so many problems with it that it's turned off right now.

Our problem is that our game is in many ways a data visualization game (read: simulation/strategy), which is what Java3D is really well suited for. So it’s made sense for us (so far) to use it as it was originally intended, which means using behaviors extensively, especially DistanceLOD and various Interpolators, as well as our own custom behaviors.

I can’t imagine that I’m the only one in this boat – there must be other people using behaviors ???

Oh, I will! ;D

I will comment on the use of LOD behaviors in Java3D…

It is a nice convenience that J3D has LOD behaviors hanging in the scene graph. It makes much of the operation transparent, as a loader and model can be made to set up the LODs and off they go.

However, this localized approach has many cons, and in most games is often overwritten to be controlled at a higher level by some sort of game manager.

For example, what if you decide you would like to modify all LODs to switch at a longer distance for lower-requirement hardware?
Or perhaps you have a terrain system that knows about the view changes at a higher level, so there is no need to recompute each LOD distance every frame, but only when the viewer moves across a tile barrier (LOD checks can add up).
Or what if you wish to use a cell-and-portal style viewing system? That type can cut off completely large portions of a scene graph, but the behaviors have to be managed as well, and if you are cutting them on and off from large object lists, or traversing the graph to find them, you might as well control them directly IMHO.

These are the kind of things that are difficult at best to do with the localized behavior system in J3D (or any behavior object supporting scene graph).

On picking…

I meant to say this earlier, but we do not use the picking utils except in tools. In profiling they are shown to be fairly inefficient, but most importantly they generate a lot of garbage objects. They are more of an example of how to do picking, although they are generally complete.

In any case, they are open source and could be recompiled/ported to work on the new graphs as is (provided the graphs reach that depth of compatibility), so I imagine you will be OK there.

Our viewer uses them, so our graph will have to work with them. In fact, I think that is a good test of compatibility.

Yes, some of them are :slight_smile:
I have used behaviours in the NWN model loader for implementing particles/emitters and animation. For me, it is an example that even a loader (not the core game itself) needs some kind of support for behaviours - after all, there is only a limited set of things you can do with static data.

Well, we’re not doing any of those things… :wink:

Although it’s probable that we would change LOD distances based on hardware or user preference, so you have a good point there…

For terrain we have about 1500 static shapes, each of which has an associated DistanceLOD, and Java3D seems to handle it perfectly well.

We also have a lot of LODs for the units, however (potentially several thousand), so maybe that is pretty inefficient and could (will) be redesigned. But we don't move by tile boundaries, and we have vertical altitude as well as changing horizons to deal with (because we're dealing with spherical planets), so the distance calculation from the view would still need to be made for each unit inside the visible distance, although I guess if we wrote our own LOD manager we could use squared distances instead of actual distances to avoid the square root, as sketched below.
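
For example, the per-unit check could be something like this (sketch; names invented):

// lodSwitchDistanceSquared = lodSwitchDistance * lodSwitchDistance, precomputed once
float dx = unit.x - eye.x;
float dy = unit.y - eye.y;
float dz = unit.z - eye.z;
boolean useHighDetail = (dx * dx + dy * dy + dz * dz) < lodSwitchDistanceSquared;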

As far as picking, the garbage that the Java 3D picking utilities generate is not significant for our game, so it's not been crucial for us to roll our own version (yet). And we have lots of natural pauses built into our gameplay, so garbage collection hasn't been an issue for us. At least not so far! :wink: And we're not using picking for collision detection at all.

So, in general, the current Java3D stuff has met our needs fairly well (except for the lack of high-level animation infrastructure, and sound, and some bugs). Which is why we're going to keep using it until something nearly as good emerges, and is supported and embraced by enough developers to ensure that content tools like loaders are available.

Which would have been Java 3D 1.4 if things were different… :’(

But I’ll be keeping an eye on the scenegraph APIs being developed now by you and others – and I’ll contribute to them when the time comes!