I have finished writing a Quake3 map loader. I was able to load and display several maps last night without issue. I am working on a walkthrough demo with collision to be available hopefully this weekend for you to try. I will be checking this code in to the xith-toolkit project.
The first version of the Quake renderer ran at 0.5 FPS because I was not using the cluster visibility provided with the BSP. I have now integrated the VIS data, which brought it up to 24 FPS. After a bunch of profiling I found a few bottlenecks in Xith3D and removed them, and it now runs at 110-220 FPS depending on the location and view inside the level. I don’t know if you guys are familiar with the way FPS levels are stored, but the faces (which are really Shapes; a whole bezier patch is considered one face) are referenced by a leaf whenever they have a non-zero intersection with the leaf bounds, which means one face is often shared by multiple leaves. So when you do your VIS by cluster you need to assemble a duplicate-free list of the faces in the potentially visible clusters. This list is then culled per frame and rendered. So I created a huge switch with an entry for each face, and added some optimized code in View so that it can handle large switches more quickly (using BitSet.nextSetBit() - very cool feature). Now 99 percent of the time is spent switching textures and rendering all those itsy bitsy faces. The really weird thing is that by the time you do all this PVS and FOV culling you are left with very few triangles… sometimes 300-500. I would have expected the framerate to be much higher.
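To illustrate the duplicate-free face list described above, here is a minimal sketch. It assumes the standard java.util.BitSet and uses plain int arrays as a stand-in for the BSP leaf data; the class and method names are hypothetical and are not the loader's actual code.

```java
import java.util.BitSet;

// Hypothetical, simplified stand-in for BSP data; not the actual loader classes.
final class VisibleFaceCollector {

    /**
     * leafFaces[leaf] lists the face indices referenced by that leaf.
     * visibleLeaves lists the leaves that belong to the potentially visible clusters.
     * Returns one bit per face; a face shared by several leaves is marked only once.
     */
    static BitSet collect(int[][] leafFaces, int[] visibleLeaves, int faceCount) {
        BitSet visible = new BitSet(faceCount);
        for (int leaf : visibleLeaves) {
            for (int face : leafFaces[leaf]) {
                visible.set(face); // setting an already-set bit changes nothing, so duplicates vanish
            }
        }
        return visible;
    }

    /** Walks only the set bits, so the cost depends on visible faces, not on the total face count. */
    static int[] toFaceList(BitSet visible) {
        int[] faces = new int[visible.cardinality()];
        int n = 0;
        for (int i = visible.nextSetBit(0); i >= 0; i = visible.nextSetBit(i + 1)) {
            faces[n++] = i;
        }
        return faces;
    }
}
```

The same nextSetBit loop is what makes a very large switch cheap to traverse: entries whose bit is clear are never visited.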
There are a couple of inefficiencies in the current Xith3D. One is that multi-texture geometry is not getting sorted very well. In the case of Quake you have large lightmaps which are shared by many faces, so you really want to sort first by the second texture unit and then by the main texture unit. Discovering this automatically might be extremely difficult, so perhaps we should create a shape hint or sort policy which can guide the renderer on how to sort the shapes.
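As a rough sketch of the ordering suggested here (lightmap unit first, base texture second), assuming each face can be reduced to two texture ids. The Face class and the comparator are hypothetical and not part of Xith3D; a real sort policy would live inside the renderer's state sorter.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class LightmapFirstSort {

    // Hypothetical record of the two texture ids a face uses; not a Xith3D class.
    static final class Face {
        final int baseTexture; // texture unit 0
        final int lightmap;    // texture unit 1

        Face(int baseTexture, int lightmap) {
            this.baseTexture = baseTexture;
            this.lightmap = lightmap;
        }
    }

    // Primary key: lightmap (unit 1). Secondary key: base texture (unit 0).
    static final Comparator<Face> LIGHTMAP_THEN_BASE =
            Comparator.comparingInt((Face f) -> f.lightmap)
                      .thenComparingInt(f -> f.baseTexture);

    /** Returns a sorted copy and leaves the input list untouched. */
    static List<Face> sorted(List<Face> faces) {
        List<Face> copy = new ArrayList<>(faces);
        copy.sort(LIGHTMAP_THEN_BASE);
        return copy;
    }
}
```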
The next problem is that multi-textured geometry currently does not handle state changes as well as non-multitextured geometry does. For example, when rendering Quake faces the renderer keeps enabling unit 1, binding the lightmap, rendering and then disabling unit 1 for every face, which is incredibly expensive. The same is true of unit 0. So I will have to improve that radically. I think we should also add an optimization to the state sorter so it can basically say “all other things being equal, render all these geometries with the exact same state”, which is a bit like combining shapes into one shape dynamically, but at a lower level than the scenegraph.
Any other ideas or thoughts would be welcome.
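One way to picture the fix is a small cache that remembers what is bound on each texture unit and skips requests that would not change anything. This is only a sketch under that assumption; the class is hypothetical, it makes no OpenGL calls, and it merely counts the changes that would reach the driver.

```java
import java.util.Arrays;

// Hypothetical per-unit cache; not the Xith3D implementation.
final class TextureUnitStateCache {
    private final int[] bound; // texture id currently bound per unit, -1 = nothing bound yet
    private int stateChanges;  // changes that would actually be sent to the driver

    TextureUnitStateCache(int units) {
        bound = new int[units];
        Arrays.fill(bound, -1);
    }

    /** Requests a texture on a unit; only a differing texture counts as a state change. */
    void bind(int unit, int texture) {
        if (bound[unit] != texture) {
            bound[unit] = texture;
            stateChanges++; // a real renderer would select the unit and bind the texture here
        }
    }

    int stateChanges() {
        return stateChanges;
    }
}
```

With such a cache, consecutive faces that share a lightmap on unit 1 cost one bind in total instead of an enable/bind/disable cycle per face.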
Quite an important topic. I was waiting for the proper time to bring this up, and it looks like that time is now.
I have also been thinking about the efficiency and flexibility of the rendering process in Xith3D, and have several suggestions.
- Multipass rendering. You know this is needed for many different things. My idea is to allow the rendering pass (or even multiple passes) to be specified explicitly for every node in the scenegraph, and to let the user define the number of passes, their sequence and their sorting type. Sometimes distance sorting of opaque geometry also brings a benefit, so I was thinking about a completely flexible pass definition, which is rather simple: the user defines the number of rendering passes for the View (?), then sets a sorting policy for every pass (i.e. FrontToBack, BackToFront, None, State, custom, etc.), and then associates rendering pass bits with every shape node. That way we will achieve good flexibility.
This can also be a very elegant hint for the state sorter: you place the shapes you want rendered sequentially in the same pass and let the sorter sort the rest.
- State sorting. As for state sorting, we definitely have to make it more intelligent, and sorter hints will be very useful. But first we need some kind of benchmarking procedure, or better, a set of procedures. In my game I get ~65-75 FPS at 1600x1200 32bpp on a GeF 440 Go, but the scene is very specific.
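The pass definition suggested in the first point could look roughly like the sketch below. Everything in it is hypothetical: no such types exist in Xith3D, and the sketch only shows how per-pass policies and per-shape pass bits fit together, not how the sorting itself would work.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of per-pass sorting policies and per-shape pass bits.
final class RenderPassSketch {

    enum SortPolicy { NONE, STATE, FRONT_TO_BACK, BACK_TO_FRONT }

    static final class Shape {
        final String name;
        final int passBits; // bit i set = shape is rendered in pass i

        Shape(String name, int passBits) {
            this.name = name;
            this.passBits = passBits;
        }
    }

    private final List<SortPolicy> passes = new ArrayList<>();

    /** Registers a pass and returns the bit mask a shape uses to join it. */
    int addPass(SortPolicy policy) {
        passes.add(policy);
        return 1 << (passes.size() - 1);
    }

    SortPolicy policyOf(int passIndex) {
        return passes.get(passIndex);
    }

    /** Selects the shapes of one pass in insertion order; sorting by the pass policy would follow. */
    List<Shape> shapesInPass(List<Shape> all, int passIndex) {
        List<Shape> result = new ArrayList<>();
        for (Shape s : all) {
            if ((s.passBits & (1 << passIndex)) != 0) {
                result.add(s);
            }
        }
        return result;
    }
}
```

A shape that should be drawn in two passes simply ORs the two masks returned by addPass.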
Yuri
Some screenshots of a quake3 level being rendered through xith3d. I can walk through the level with full collision detection / sliding along surfaces with gravity. All running very smoothly.
Dave
very impressive!
Can’t wait for the walkthrough-demo.
(I need to impress some sceptical friends who still think a 3d Java engine wouldn’t be as fast as a C++ one…)
[quote]I integrated the VIS stuff now and brought the FPS up to 24 FPS. After doing a bunch of profiling I found a few bottlenecks in Xith3D and removed them and now am running at 110-220 FPS depending on the location and view inside the level.
[/quote]
I really want to test-drive this demo on my Celeron 500 MHz machine. Nobody would use such a machine for real gaming, but it is a very good FPS benchmarking platform.
[quote] I have finished writing a Quake3 map loader. I was able to load and display several maps last night without issue. I am working on a walkthrough demo with collision to be available hopefully this weekend for you to try. I will be checking this code in to the xith-toolkit project.
[/quote]
Hi
I recently found this impressive Xith project and am playing with its features. Since I checked out another Quake level viewer based on gl4java around Christmas, it would be very interesting for me to compare the two, especially since I had been thinking of porting it to JOGL, an effort that could be obsolete by now.
But I have not been able to find your code in the xith-toolkit project. Could you please give a hint when to expect it? Thanks.
@Qudus : interesting points here, have you worked yet on multi-texturing issues ? (Note : I’ve not read yet the “Shader order” topic).
Hi,
Note that this topic dates back to 2004 - that’s a while ago…
Yes, I was also working on multitexturing performance issues, and especially on state sorting.
Unfortunately, state sorting with respect to multitexturing is still not perfect, and is even far from perfect. On the other hand, as I mentioned earlier in that post, we still need a benchmark or example that shows that state sorting helps: a year ago I was surprised to find that with newer drivers, when the textures are already in VRAM, changes in state sorting do not give much of a performance boost… but I assume that was because the example was too simple.
State sorting is VERY important on low-end graphics cards (which were high-end at that time), or when textures have to be swapped out of VRAM.
Yuri
P.S. If you have specific questions on Xith3D state sorting, let me know - I developed part of it…
Myeah, do you plan to take it to the next level ? That would be more than appreciable
In short, we should first prove that proper (better, different) state sorting gives a performance boost.
After we locate the bottleneck I believe we can invent a more intelligent sorting algorithm. So if you can help locate the bottleneck, that will be of even more help than, say, me making some changes.
Yuri
So I could e.g. profile the Q3 test with YourKit and see where most of the CPU time goes ?
For Quake3 test I’ve disabled render atom sorting completely. It boosts the performance a lot (or decreases it when enabled).
Hi,
To make the test correct you also have to disable state change optimization - it is active even when state sorting is off.
As I already mentioned, on modern cards and drivers state sorting does not have as big an effect as on older cards/drivers, so turning it off may cause a performance boost, but all of this is really scene-dependent.
I suggest the following test: imagine a screen filled with non-intersecting shapes textured with two or three different textures (so neither the depth buffer nor culling will affect the test). We make many small, simple shapes (rectangles or even triangles). In this case the rendering result does not depend on the order in which the shapes are rendered. Next we profile the timing (FPS) when 1) all A-textured shapes are added first, then B-textured, then C-textured; 2) the shapes are added interleaved (one A-textured, one B-textured, one C-textured, then again one A-textured, one B-textured, one C-textured, etc.). All other appearance attributes are the same (later we can also experiment with other shaders, but this example is for the texture shader only). I also assume state sorting is turned off, so that we do the state sorting manually. All shapes go into the same group. We are talking about thousands of shapes of approx. 2x2 screen pixels each, so let’s make the screen context a fixed size (say, 1000x1000 screen pixels).
As I mentioned, the rendering result will be the same because the shape projections do not intersect in screen coordinate space, i.e. we have no overdraw. Case 1) gives a perfectly state-sorted result (with respect to pipeline state changes); case 2) gives the worst possible (counterproductive) result: a state change for every shape.
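The difference between the two cases can be counted without any rendering at all. The sketch below only models the order in which textures are requested and counts the switches; it does not measure FPS, and all names in it are hypothetical.

```java
// Hypothetical model of the proposed test: counts texture switches, does not render anything.
final class StateChangeCount {

    /** Builds the texture id per shape: grouped (AAA...BBB...CCC) or interleaved (ABCABC...). */
    static int[] order(int shapesPerTexture, int textures, boolean interleaved) {
        int[] seq = new int[shapesPerTexture * textures];
        for (int i = 0; i < seq.length; i++) {
            seq[i] = interleaved ? i % textures : i / shapesPerTexture;
        }
        return seq;
    }

    /** Counts how often the requested texture differs from the one used by the previous shape. */
    static int countBinds(int[] seq) {
        int binds = 0;
        int current = -1; // nothing bound before the first shape
        for (int texture : seq) {
            if (texture != current) {
                current = texture;
                binds++;
            }
        }
        return binds;
    }
}
```

For 1000 shapes per texture and three textures, the grouped order needs 3 binds and the interleaved order needs 3000. How much of that difference shows up in the frame rate is exactly what the real test would have to tell us.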
Let’s have somebody code the test so we can all compare results - no changes in the core are necessary for this.
I have a guess about the results and also an explanation for them, but I will keep it to myself until we have the test and the numbers.
Even more interesting would be a test that uses a more complicated shader (say, VP, FP, or a GLSL shader) instead of simple texturing - the cost of the context setup for it sounds much higher, but maybe I am wrong about that too.
I strongly believe that we should find out what is best for the hardware and the driver, and optimize the Xith3D core for this, along with providing Xith3D users with scenegraph creation guidelines, because, as you can probably tell from the test example above, the OpenGL command output can be quite different for scenegraphs that produce the same visual result.
Yuri
How do you do that, man? It would be ideal in the ColorCube example: it has lots of Shape3Ds, so disabling render atom sorting would boost it a lot, and it would also remove the artifacts where the transparency looks different whenever the shapes get re-ordered.
Hi,
Search in sources for TRANSPARENCY_SORT_NONE and OPAQUE_SORT_NONE.
Set them using Renderer.setOpaqueSortingPolicy(…) and Renderer.setTransparencySortingPolicy(…) and that will do it. This feature has existed for a long time; I even remember documenting it while adding the TRANSPARENCY_SORT_BOUNDING_SPHERE_AND_EYE_RAY_INTERSECTION sorting policy (OK, I was fighting a tricky transparency rendering problem with intersecting shapes at the time).
Yuri
env.getRenderer().setOpaqueSortingPolicy( Renderer.OPAQUE_SORT_NONE );
or without Xith3DEnvironment:
myUniverse.getRenderer().setOpaqueSortingPolicy( Renderer.OPAQUE_SORT_NONE );
Marvin
OK, thanks.