I wasn’t sure where to post this, so forgive me if this is the wrong place.
I have completed my first pass at a renderer that traverses a standard Java3D scenegraph. Not all nodes are implemented, but most are. This toy renderer traverses the scene and issues LWJGL OpenGL calls in place. Simple scenes are very fast and, as expected, performance drops off quickly as the complexity increases.
Now that things seem to be working smoothly I am designing a better rendering pipeline and wanted to see if it seems to make sense.
One of the things you gain when traversing a scenegraph is that you can use the OpenGL matrix stack to handle your transformations. The problem is that the scenegraph is really there to support the logical view of the data and is not optimized for rendering. I need to re-order the way that geometry is sent to the card so that I can minimize state changes.
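To make the cost of rendering in scenegraph order concrete, here is a minimal sketch that counts state changes for the same draws in traversal order and after sorting by state. `DrawCall`, `stateKey` and `StateChangeDemo` are made-up names for illustration, not Java3D or LWJGL API, and a single integer key stands in for the full appearance state.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for one piece of geometry plus the state it needs.
final class DrawCall {
    final String name;
    final int stateKey; // assumption: one int summarizes texture/material/shader state

    DrawCall(String name, int stateKey) {
        this.name = name;
        this.stateKey = stateKey;
    }
}

final class StateChangeDemo {
    // Counts how often the state key differs from the previous draw.
    // The first draw always counts as one change.
    static int countStateChanges(List<DrawCall> draws) {
        int changes = 0;
        boolean first = true;
        int current = 0;
        for (DrawCall d : draws) {
            if (first || d.stateKey != current) {
                changes++;
                current = d.stateKey;
                first = false;
            }
        }
        return changes;
    }

    // Returns a copy ordered by state key; List.sort is stable, so draws
    // sharing a state keep their original relative order.
    static List<DrawCall> sortedByState(List<DrawCall> draws) {
        List<DrawCall> copy = new ArrayList<>(draws);
        copy.sort(Comparator.comparingInt((DrawCall d) -> d.stateKey));
        return copy;
    }

    public static void main(String[] args) {
        List<DrawCall> sceneOrder = Arrays.asList(
                new DrawCall("wall", 1), new DrawCall("tree", 2),
                new DrawCall("floor", 1), new DrawCall("bush", 2),
                new DrawCall("roof", 1));
        System.out.println("scene order: " + countStateChanges(sceneOrder));
        System.out.println("state order: " + countStateChanges(sortedByState(sceneOrder)));
    }
}
```

With the five draws above, traversal order alternates between the two states on every draw, while the sorted order switches state only once after the initial setup.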
So my basic idea is to do the following:
-
Traverse the scene and, for transform groups, calculate the local-to-vworld transform by multiplying the transforms down the tree. I would only recalculate this if the transform or a parent transform has been marked dirty by a change since the last render.
-
As I got to nodes I would transform their bounding spheres using the node's local-to-vworld transform. This would shift the node bounds in the SphereTree (which handles changes extremely efficiently).
-
Traverse the sphere tree and perform frustum culling, skipping any nodes which are outside the frustum planes. Keep track of the current state via stacks. So if I hit a fog node that applies from that location down, I would push the fog node on the fog stack. When I come back up through the node I would pop it off.
-
When I got to a node that is to be rendered I would package it up with its shaders (current state of appearance + fog + lights, etc.). This would push a package onto the render bin containing everything needed to render it.
Some nodes can also be view-specific geometry generators, including BSP generators, which can produce their own packages for rendering. These would be triggered and would put their shader+geometry packages into the render bin.
-
Special nodes called shadow occluders can be added by users into a branch group to represent the occluder for the “model” which exists below it in the scenegraph. These are pushed into a special queue in the renderer. The geometry can be supplied for the occluder, or it can be built on the fly (and cached) from all the geometry which exists in descendant nodes.
-
Some of the packages which are placed into the render bin can contain other packages. Examples of this are decal groups and ordered groups.
-
The render bin is sorted by shader composition to minimize state changes. Transparent shaders and their nodes are placed into a second bin for a later pass.
-
Render bins are submitted to the geometry pipeline, which: 1) steps through each package and changes only the delta state (including doing a glLoadMatrix for the transform if it has changed), 2) manages vertex arrays + texture bindings, 3) renders the geometry.
-
The first pass would be the opaque render bin, then each package container (ordered groups, decal groups) would be sorted and sent to the geometry pipeline.
-
Shadows would be rendered using stencil shadow volumes (z-pass, or z-fail, a.k.a. Carmack's reverse).
-
The transparent package bin is then processed. Specially flagged packages would be combined and broken up by polygons into mini-chunks. Regular packages will be added in their larger chunks. This bin is then sorted by distance to the view and submitted to the geometry pipeline.
-
Last step would be to render overlays.
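As a rough illustration of the culling step in the list above, here is a minimal sketch of a sphere-tree traversal that skips subtrees outside the frustum and keeps a fog stack while descending. All class and field names (`Plane`, `SphereNode`, `CullTraversal`) are hypothetical, each parent sphere is assumed to enclose its children, and plane normals are assumed to be unit length and to point into the frustum.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Plane with nx*x + ny*y + nz*z + d >= 0 on the inside of the frustum.
final class Plane {
    final double nx, ny, nz, d;

    Plane(double nx, double ny, double nz, double d) {
        this.nx = nx; this.ny = ny; this.nz = nz; this.d = d;
    }

    double signedDistance(double x, double y, double z) {
        return nx * x + ny * y + nz * z + d;
    }
}

// Hypothetical sphere-tree node; bounds are already in vworld coordinates.
final class SphereNode {
    final String name;
    final double cx, cy, cz, radius;
    final String fog;          // null when this node defines no fog
    final boolean renderable;
    final List<SphereNode> children = new ArrayList<>();

    SphereNode(String name, double cx, double cy, double cz, double radius,
               String fog, boolean renderable) {
        this.name = name; this.cx = cx; this.cy = cy; this.cz = cz;
        this.radius = radius; this.fog = fog; this.renderable = renderable;
    }
}

final class CullTraversal {
    // A sphere is fully outside if it lies entirely behind any one plane.
    static boolean outside(SphereNode n, Plane[] frustum) {
        for (Plane p : frustum) {
            if (p.signedDistance(n.cx, n.cy, n.cz) < -n.radius) {
                return true;
            }
        }
        return false;
    }

    // Depth-first walk: push fog on the way down, pop on the way back up.
    static void collect(SphereNode n, Plane[] frustum,
                        Deque<String> fogStack, List<String> renderBin) {
        if (outside(n, frustum)) {
            return; // the whole subtree is skipped
        }
        boolean pushed = n.fog != null;
        if (pushed) {
            fogStack.push(n.fog);
        }
        if (n.renderable) {
            // A real package would capture appearance, lights, etc. as well.
            renderBin.add(n.name + " fog=" + fogStack.peek());
        }
        for (SphereNode child : n.children) {
            collect(child, frustum, fogStack, renderBin);
        }
        if (pushed) {
            fogStack.pop();
        }
    }

    static List<String> cull(SphereNode root, Plane[] frustum) {
        List<String> renderBin = new ArrayList<>();
        collect(root, frustum, new ArrayDeque<String>(), renderBin);
        return renderBin;
    }
}
```

The test is conservative: a sphere that straddles a plane is kept, so some geometry just outside the view may still be submitted. Lights, clips and other scoped state could get their own stacks in the same way.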
I also plan to add more advanced shaders than just the standard appearance. Packages flagged for dynamic per-pixel lighting can have their vertex colors encoded on the fly with the light vector so we apply 3d lightmaps + bump + cubemap normalizer, etc.
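For the vertex color encoding, a minimal sketch of one common approach is to normalize the vertex-to-light vector and range-compress each component from [-1, 1] into [0, 1] so it fits in an RGB color. The class and method names are made up for illustration, and the sketch assumes the light position is already in the same space as the vertex; a real bump-mapping setup would usually rotate the vector into per-vertex tangent space first.

```java
final class LightVectorEncoding {
    // Returns {r, g, b} in [0, 1] encoding the unit vector from vertex to light.
    // Assumption: both positions are given in the same coordinate space.
    static float[] encode(float[] vertex, float[] light) {
        float dx = light[0] - vertex[0];
        float dy = light[1] - vertex[1];
        float dz = light[2] - vertex[2];
        float len = (float) Math.sqrt(dx * dx + dy * dy + dz * dz);
        if (len == 0f) {
            // Light sits exactly on the vertex: encode the zero vector.
            return new float[] { 0.5f, 0.5f, 0.5f };
        }
        // Range compression: component c in [-1, 1] maps to 0.5 * c + 0.5.
        return new float[] {
            0.5f * dx / len + 0.5f,
            0.5f * dy / len + 0.5f,
            0.5f * dz / len + 0.5f
        };
    }

    public static void main(String[] args) {
        float[] rgb = encode(new float[] { 0f, 0f, 0f }, new float[] { 0f, 0f, 5f });
        System.out.println(rgb[0] + ", " + rgb[1] + ", " + rgb[2]);
    }
}
```

The texture stage then expands the color back to a signed vector (2 * c - 1) before the dot product with the normal map.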
I am designing this to be as generic as possible, with plugins into the rendering pipeline and graphics pipeline to allow different techniques to be used. It should also be possible to add an occlusion culling plugin so that different techniques can be used (e.g. Zhang’s hierarchical z pyramid) for late-pipeline checking, or even an early-pipeline PVS check.
Are there any glaring holes in the basic outline I have described?
Once this engine is done (good enough to run Magicosm on) I will release it for any use and eventually open it up for other developers to contribute. At worst it should be an example others can learn good and bad things from.