Performance & Benchmarks

That is a really simple solution, but it wouldn’t really work with terrain, unless one wants to go with a precomputed tile system. In any case, I was just saying that your point d) was too generic. In my system I am culling by tris and performance is not bad. But I agree with the rest of what you said.

Sometimes sorting front to back for opaque geometry is more important than texture sorting. If you have enough texture memory on the GPU, the fill-rate benefit of front-to-back sorting can be bigger than the penalty of texture thrashing.
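As a rough illustration of the front-to-back idea, here is a minimal Java sketch that orders opaque items nearest-first by squared distance to the camera. The `OpaqueItem` and `FrontToBackSorter` classes are hypothetical names made up for this example, not Xith3D types, and a real engine would sort whatever render atoms it already has.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical draw item: a label plus a world-space centre. Not a Xith3D type. */
final class OpaqueItem {
    final String name;
    final double x, y, z;

    OpaqueItem(String name, double x, double y, double z) {
        this.name = name;
        this.x = x;
        this.y = y;
        this.z = z;
    }

    /** Squared distance to the camera; enough for ordering and avoids a sqrt. */
    double distanceSquaredTo(double cx, double cy, double cz) {
        double dx = x - cx, dy = y - cy, dz = z - cz;
        return dx * dx + dy * dy + dz * dz;
    }
}

final class FrontToBackSorter {
    /** Returns a new list ordered nearest-first relative to the camera position. */
    static List<OpaqueItem> sort(List<OpaqueItem> items, double cx, double cy, double cz) {
        List<OpaqueItem> sorted = new ArrayList<>(items);
        sorted.sort(Comparator.comparingDouble(
                (OpaqueItem item) -> item.distanceSquaredTo(cx, cy, cz)));
        return sorted;
    }
}
```

Drawing in this order lets the z-test reject hidden pixels early on hardware that supports it, at the price of more texture switches than a pure texture sort.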

That is true, particularly if the distant objects have multi-texturing. The latest cards also have z-buffer optimisations that can speed the z-test up a lot. Older cards (early GeForces, ATI 7500 or earlier) won’t benefit much, as the z-test is concurrent with the texture fetch, and keeping multiple pixel pipelines in sync prevents the early-out. However, if you have this much overdraw you would be wise to look at techniques such as portals & basic occlusion tests to cull objects hidden behind large opaque objects.

The simplest occlusion method is to have several visibility spheres on an object: large ones encompassing the whole object for frustum checking, and smaller ‘inner’ spheres representing the opaque regions. If an object’s ‘outer’ sphere, as seen from the camera, lies entirely within a closer object’s ‘inner’ sphere, then it is occluded and you can cull it. You can use any shape, and there are articles on this (check Graphics Gems & Gamasutra). The most useful application for this type of system is rendering city scenes. In this case, buildings want an inner cuboid for occlusion, and cars/pedestrians check their cull-spheres against this. Very fast & saves you even sending the model data to the hardware. Spend a little effort early to save the card a lot of effort later.
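One way the inner/outer sphere check could be written is sketched below in Java. This is only an assumption about how such a test might look, not the code from any shipped engine: `Sphere` and `SphereOcclusion` are hypothetical names, and the test is deliberately conservative (it compares the angular cones of the two spheres as seen from the eye and answers "not occluded" whenever in doubt).

```java
/** Hypothetical bounding sphere in world space; not a Xith3D class. */
final class Sphere {
    final double x, y, z, radius;

    Sphere(double x, double y, double z, double radius) {
        this.x = x;
        this.y = y;
        this.z = z;
        this.radius = radius;
    }
}

final class SphereOcclusion {
    /**
     * Conservative test: returns true only if the occludee's outer sphere is
     * certainly hidden behind the occluder's opaque inner sphere as seen
     * from the eye position (ex, ey, ez).
     */
    static boolean isOccluded(double ex, double ey, double ez,
                              Sphere occluderInner, Sphere occludeeOuter) {
        // Vectors from the eye to both sphere centres.
        double ox = occluderInner.x - ex, oy = occluderInner.y - ey, oz = occluderInner.z - ez;
        double cx = occludeeOuter.x - ex, cy = occludeeOuter.y - ey, cz = occludeeOuter.z - ez;
        double dOccluder = Math.sqrt(ox * ox + oy * oy + oz * oz);
        double dOccludee = Math.sqrt(cx * cx + cy * cy + cz * cz);

        // Eye inside or touching either sphere: give up and draw the object.
        if (dOccluder <= occluderInner.radius || dOccludee <= occludeeOuter.radius) {
            return false;
        }
        // The whole occludee must be at least as far away as the occluder's centre.
        if (dOccludee - occludeeOuter.radius < dOccluder) {
            return false;
        }
        // Half-angles of the cones the two spheres subtend at the eye.
        double alpha = Math.asin(occluderInner.radius / dOccluder);
        double beta = Math.asin(occludeeOuter.radius / dOccludee);
        // Angle between the two view directions, clamped against rounding error.
        double cosTheta = (ox * cx + oy * cy + oz * cz) / (dOccluder * dOccludee);
        double theta = Math.acos(Math.max(-1.0, Math.min(1.0, cosTheta)));

        // Occluded when the occludee's cone fits entirely inside the occluder's cone.
        return theta + beta <= alpha;
    }
}
```

For the city case, each car or pedestrian would be tested against the inner volumes of the few nearest large buildings before any geometry is submitted; an inner cuboid needs a different containment test than the sphere one shown here.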

It all depends on your situation & what you are rendering, so unfortunately there is no universal method that is guaranteed to work in all cases, but I would say that texture batching is the most generally useful optimisation for PC cards.
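To make the batching point concrete, here is a small Java sketch that counts how many texture binds a draw list would need before and after sorting by texture. `DrawCall` and `TextureBatching` are hypothetical names invented for the example, and texture ids are assumed to be non-negative integers.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical draw call: a mesh name plus the id of the texture it needs. */
final class DrawCall {
    final String mesh;
    final int textureId;

    DrawCall(String mesh, int textureId) {
        this.mesh = mesh;
        this.textureId = textureId;
    }
}

final class TextureBatching {
    /** Counts how often the bound texture changes when drawing in list order. */
    static int countTextureBinds(List<DrawCall> calls) {
        int binds = 0;
        int bound = -1; // assumption: real texture ids are never negative
        for (DrawCall call : calls) {
            if (call.textureId != bound) {
                binds++;
                bound = call.textureId;
            }
        }
        return binds;
    }

    /** Returns a copy of the list grouped by texture id (stable sort). */
    static List<DrawCall> sortByTexture(List<DrawCall> calls) {
        List<DrawCall> sorted = new ArrayList<>(calls);
        sorted.sort(Comparator.comparingInt((DrawCall call) -> call.textureId));
        return sorted;
    }
}
```

With five calls using textures 2, 1, 2, 1, 3 in that order, drawing as-is needs five binds, while the sorted list needs three. Whether that saving outweighs front-to-back ordering depends on the scene, as discussed above.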

Oh - and if you can, pack multiple small (non-tiling) textures onto larger packed pages, as this saves you a large amount of batching. You can cut down from 200+ individual textures to 20-30 pages and save massively on batching overhead & state changes. (Figures taken from memory of a published game I worked on.)
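Packing textures onto a page means every mesh’s texture coordinates have to be remapped into the sub-rectangle its texture now occupies. A minimal Java sketch of that remapping follows; `AtlasRegion` is a hypothetical helper, and the square page size and pixel-based rectangle are assumptions made for the example.

```java
/** Hypothetical description of where one small texture sits on a square page. */
final class AtlasRegion {
    final int x, y;          // top-left corner on the page, in pixels
    final int width, height; // size of the packed texture, in pixels
    final int pageSize;      // edge length of the square page, in pixels

    AtlasRegion(int x, int y, int width, int height, int pageSize) {
        this.x = x;
        this.y = y;
        this.width = width;
        this.height = height;
        this.pageSize = pageSize;
    }

    /** Maps the original u in [0, 1] to the matching u on the page. */
    double pageU(double u) {
        return (x + u * width) / pageSize;
    }

    /** Maps the original v in [0, 1] to the matching v on the page. */
    double pageV(double v) {
        return (y + v * height) / pageSize;
    }
}
```

This also shows why only non-tiling textures qualify: a coordinate outside [0, 1] would land in a neighbouring texture on the page instead of repeating.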

  • Dom

[quote]If you have enough texture memory on the GPU, the fill-rate benefit of front-to-back sorting can be bigger than the penalty of texture thrashing.
[/quote]
That’s a pretty big ‘if’ though. Most games use plenty of textures, but whether they suffer from enough overdraw for front-to-back rendering to help is debatable.

Of course Xith has stencil shadows built in, doesn’t it? So before doing your proper texturing and lighting pass you need to create a perfect z-buffer anyway, which means you get to sort by material and still get the benefit of early z-fail :slight_smile:

While we’re on the topic of optimisation, has anyone thought of adding a HOM (hierarchical occlusion map) to Xith (perhaps adapted from http://www.jpct.net/jpct.htm ) to get accurate occlusion queries without clogging up the graphics card?

[quote]It’s the first time I’ve seen 162k polygons at 28 fps and high res screen with Java - out of a real world example where the whole universe has many many more objects and polygons.
Well, since Xith is the rendering engine it all has to do with Xith, doesn’t it? Since Xith uses OpenGL it has to do with Jogl, too.
[/quote]
Fallacious reasoning. If the scene never changes then all the work that is done by Xith OR JOGL happens before the first frame. Frame counting is thus meaningless.

I’m not surprised this is the first time you’ve seen this kind of rate, as video cards keep getting better. In the purest case it has nothing to do with Java, Xith or JOGL.

In reality, in a real app like Magicosm there ARE changes going on, so frame rate does have some meaning, but the polycount is still pretty much irrelevant. The number of state changes and the amount of texture info that has to be moved across the bus are both likely to be significant. The amount of work culled out is also likely to be significant. But the ‘total polys in the scene’ just isn’t a terribly important measure in and of itself, and tells you nothing beyond the ability of your graphics card.

[quote]Fallacious reasoning. If the scene never changes then all the work that is done by Xith OR JOGL happens before the first frame. Frame counting is thus meaningless.
[/quote]
I agree it depends on the movement of the scene. :slight_smile:

[quote]I’m not surprised this is the first time you’ve seen this kind of rate, as video cards keep getting better. In the purest case it has nothing to do with Java, Xith or JOGL.
[/quote]
Well, I meant I haven’t seen such a thing with a Java program. In Jediknights-III I like high FPS rates.

I’ve noticed that the FPS number of a test scene (200 k polys) halves when I switch the polygon mode from filled mode to line mode.
It makes no difference whether I do it manually or with Xith’s nice Renderoption.setOption(Option.ENABLE_WIREFRAME_MODE, true);

Since I don’t plan to go for wireframe in the end, I don’t mind too much. :wink:

However, maybe a HW/OpenGL expert would like to explain why wireframe is slower than filled mode? When I implemented a polygon raster fill routine on the good old Amstrad CPC (8 bit) some++ years ago, it was the other way round.

**shrug**

Then you haven’t seen much.

3 years ago at GDC on then-average hardware we showed Shawn’s Java3D FPS running at max monitor frame rate.

We showed the same demo at Quakecon 2 years ago and were told repeatedly that it compared favorably to Quake 2.

[quote]However, maybe a HW/OpenGL expert would like to explain why wireframe is slower than filled mode? When I implemented a polygon raster fill routine on the good old Amstrad CPC (8 bit) some++ years ago, it was the other way round.
[/quote]
I’m not an NVidia expert so this is a guess, but I wouldn’t be surprised to find out they had optimized the hardware for the usual case (textured) and not for the line case, which is a lot more unusual.

Is it doing aliased or anti-aliased lines? Properly anti-aliasing lines can be some work.

Still Jeff, what people are looking for when they come here is a sense of whether Xith3D is up to the challenge of rendering commercial quality scenes with typical polygon / texture / shading loads. Other than pointing them at current usages it is hard to quantify the performance.

The questions will never go away. They don’t really want to know how many polygons per second; they really want to know “can it do what I need it to do as fast as I need it to”. But this question is impossible to answer without a detailed spec defining what “they need”, and even then it would be difficult to give them the answer they seek. What they are hoping for is that there is some metric which can roughly approximate the answer and give them some confidence that the proposed solution is real and not imagined.

The other thing is that users of Xith3D and other engines don’t generally understand rendering designs and engines and don’t want to… and there is a suspicion that any open source engine would fall short of commercial strength. So they come here looking for information and they get anecdotal answers… no hard numbers. You can’t fault people for wanting hard numbers, even if they don’t know that the hard numbers don’t really answer the question they want to ask.

Sounds good.
Though I’ve been visiting java.sun.com for years (irregularly), I’ve unfortunately never seen impressive 3D stuff. Neither have any of my friends or colleagues.
I installed Java3D 3-4 years ago and played with the examples, and while they were nice I couldn’t impress anybody with them. Has Java3D since been optimized for game usage, new OpenGL extensions or the like? I don’t know. The last thing I read here some months ago was that Java3D had been frozen.

Now I found Xith3D, asked some questions here, got very positive answers from the developers, then fed Xith3D some 200 k poly test scenes and showed it to a few friends, and they said: That’s cool.

Usually, if you ask a 3D artist to do some 3D models for a 3D action game, the first thing he asks is: low or high poly models, how many polys are allowed, and so on.

I think the same.

[quote]Is it doing aliased or anti-aliased lines? Properly anti-aliasing lines can be some work.
[/quote]
It was normal line mode, no anti-aliasing.

[quote]Still Jeff, what people are looking for when they come here is a sense of whether Xith3D is up to the challenge of rendering commercial quality scenes with typical polygon / texture / shading loads. Other than pointing them at current usages it is hard to quantify the performance.
[/quote]
Said very sensibly.

[quote]they really want to know “can it do what I need it to do as fast as I need it to”. But this question is impossible to answer without a detailed spec defining what “they need”, and even then it would be difficult to give them the answer they seek. What they are hoping for is that there is some metric which can roughly approximate the answer and give them some confidence that the proposed solution is real and not imagined.
[/quote]
Couldn’t say it better.

[quote]and there is a suspicion that any open source engine would fall short of commercial strength.
[/quote]
That’s true, too. Still too many people think open source isn’t as good as commercial software. However, in many cases the opposite is true.
As a long-time open source user I know this well. (Couldn’t work anymore without Gawk, Jedit, MAME, Openoffice, …)

[quote]So they come here looking for information and they get anecdotal answers… no hard numbers.
[/quote]
Yes. :slight_smile:

Line mode on all PC cards is terrible, because the driver actually draws each triangle’s outline as 6 tris, making 3 rectangles 1 pixel wide each. It’s not 6 times slower, as you save on the fill rate & texture lookups, but it is pretty shocking nevertheless.
The same behaviour happens on ATI, NVidia, & Matrox cards last time I looked.

As an aside, I used to write commercial graphics engines, & a static scene of 200k tris is on the high side. For Xbox (GeForce 3/4 level), we aim for around 100k tris on screen at any time (50 fps), split 50/50: half for static background, the rest for dynamic lit objects. The major frame issues are due to skinned characters, morph targets, multi-textures, particle effects, & shadows. These can easily take over 1/2 a frame alone, so you need to get your scene rendering done in less than half a frame if you can. And don’t forget your UI - font rendering alone can be a pain, as it’s a large fill-rate cost (transparency) with very low poly counts.
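To put those targets into numbers, here is a tiny Java sketch of the arithmetic. `FrameBudget` is a hypothetical helper, and the only inputs are the figures quoted above (50 fps, 100k tris per frame, half a frame for the scene).

```java
/** Hypothetical helper for back-of-the-envelope frame budgets. */
final class FrameBudget {
    /** Milliseconds available per frame at the given frame rate. */
    static double frameMillis(double fps) {
        return 1000.0 / fps;
    }

    /** Triangle throughput needed to sustain trisPerFrame at the given frame rate. */
    static long trianglesPerSecond(long trisPerFrame, double fps) {
        return Math.round(trisPerFrame * fps);
    }
}
```

At 50 fps, `frameMillis(50)` gives 20 ms per frame, so a scene that has to fit into half a frame gets about 10 ms. `trianglesPerSecond(100000, 50)` gives 5,000,000 tris per second for the 100k target.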

  • Dom

[quote]Line mode on all PC cards is terrible, because (…)
[/quote]
Very interesting!

[quote]As an aside, I used to write commercial graphics engines, & a static scene of 200k tris is on the high side.
[/quote]
I see.

[quote]skinned characters, morph targets, multi-textures, particle effects, & shadows. These can easily take over 1/2 a frame alone, so you need to get your scene rendering in less than half a frame if you can.
[/quote]
Oh yes. It’s just the beginning. However since I use Xith3d it’s much more fun now compared to the direct OpenGL way. :slight_smile:

[quote]Still Jeff, what people are looking for when they come here is a sense of whether Xith3D is up to the challenge of rendering commercial quality scenes with typical polygon / texture / shading loads. Other than pointing them at current usages it is hard to quantify the performance.
[snip]
What they are hoping for is that there is some metric which can roughly approximate the answer and give them some confidence that the proposed solution is real and not imagined.
[/quote]
Agreed. My point is simply that polys rendered on screen and the number of frames per second for those polys is NOT a useful metric today. There was a time, when renderers were in software, when it was. But today it’s a totally pointless measure. It’s worth educating people on why.

“There are three kinds of lies. Lies, damn lies, and statistics.”
~Mark Twain~

Fact of the matter is that numbers are no “harder” than the anecdotes. One or two simple numbers just aren’t going to tell the story, as you pointed out as well.

So in fact, the BEST measure turns out to BE anecdotal, where those anecdotes are as close as possible to your intended application.

P.S. And remember, 50% of Americans graduated in the bottom half of their high-school class.

I personally think that for now the major goal is to add functionality to Xith3D, so it is at least at the same level [of functionality] as Java3D. Then we switch to aggressive optimizations.

BTW, some people already started working on performance enhancements, so we have progress also in this area.

Regarding the performance question in general, I see major speed-ups coming from strategic engine design, rather than from code fine-tuning [which does not mean we are allowed to write poor code].

Now on performance tests. I think we should have as many tests as possible, at least to figure out the problematic points in design and implementation. This will help to enhance the entire engine, as well as write the performance tips, how-to’s etc.

Yuri
[/quote]
(still resurrecting old threads).

Yuri, if you ever come here again, I’m not even asking you to do the optimizing yourself, but please tell us what can be done…