OpenGL Perf Tuning

erikd · April 25, 2006, 11:06pm

Why write fallback code for something that needs a fast GPU anyway? Wouldn’t the fallback mode just suck, performance-wise?
I guess if you do a really wicked effect on the GPU, you’d need to write something completely different as a fallback (‘cheat’ a similar-looking effect), or the fallback would be to just leave the effect out, no?

Riven · April 25, 2006, 11:10pm

If you have keyframe animation for your characters, you have to do it somewhere, either on the CPU or on the GPU.

On the CPU it is “kinda costly”, but very possible for low-poly models.
On the GPU it is “almost free”.

So yes, a fallback mode is - in this case - a justified feature.
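For reference, the CPU path boils down to a linear blend between two keyframes, once per vertex component. This is only a minimal sketch: the class name `KeyframeLerp` and the flat x,y,z array layout are assumptions for illustration, not anything from a particular engine.

```java
// Hypothetical helper: CPU-side keyframe interpolation for a fallback path.
public final class KeyframeLerp {
    private KeyframeLerp() {}

    /**
     * Blends two keyframes stored as flat x,y,z float arrays.
     * t = 0 returns a copy of frameA, t = 1 returns the values of frameB.
     */
    public static float[] lerp(float[] frameA, float[] frameB, float t) {
        if (frameA.length != frameB.length) {
            throw new IllegalArgumentException("keyframes must have the same length");
        }
        float[] out = new float[frameA.length];
        for (int i = 0; i < out.length; i++) {
            // One multiply and two adds per component: cheap per vertex,
            // but it is paid for every vertex on every interpolation step.
            out[i] = frameA[i] + (frameB[i] - frameA[i]) * t;
        }
        return out;
    }
}
```

The result would then be handed to the card as a vertex array each step, which is where the cost on low-end hardware comes from.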

darkprophet · April 26, 2006, 12:41am

You’re forgetting JNI overhead (which in 1.6 is around 1100ns), and the perf factor grows to around 4.3 with sqrt in the equation. The hardest part is actually getting the data to be nice and linear in the float buffers to avoid cache misses in the FIFO; this brings a whole new light to the word “complex”…

And how are you sending two vertex arrays and two normal arrays? Unless you’re filling the texture coordinate slots with data, I don’t see how you’re doing it. Of course, if you are filling the texture coordinates with data, then you’ve got a whole can of worms to deal with that makes it really, really restrictive for anything more useful than “grass” (which you can use pseudo-instantiation for anyway).

DP

kevglass · April 26, 2006, 3:10am

Now I might not know as much about GL performance as you chaps, but I tried this recently. First, my low-end card doesn’t support VBOs. So you end up pumping a VA of the full model to the card every interpolation step (most people seem to do this every frame for uber smoothness). Even if you have a VBO, if you’re using something like a Quake model then pretty much every vertex changes every step, so you’re not actually going to save much by updating partial VBOs (given you’ve got to work out what to change as well).

The problem with fallbacks is they have to at least be serviceable (i.e. look reasonable and run as fast as the “real” version). The right option to me is to choose who you’re trying to hit and go for their hardware.

Kev

PS. Got my pseudo-interpolation running quickly as a fallback by generating a lot of display lists at initialisation time (interpolating between the key frames read from disk) and then using these at runtime. It means you’ve only got a fixed set of animations, and you have to choose your interpolation interval up front. This costs lots of graphics memory, but that is something even low-end cards tend to have plenty of.
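Roughly, the baking idea looks like the sketch below. It is not the actual code: the class `BakedAnimation` and its methods are made-up names, and plain float arrays stand in for the display lists so the example runs without a GL context. In the real thing each baked frame would be compiled into its own display list at startup.

```java
// Hypothetical sketch of pre-baked interpolation frames.
public final class BakedAnimation {
    // One baked vertex array per step; a stand-in for one display list per step.
    private final float[][] frames;

    /** Bakes stepsPerInterval frames between each pair of adjacent keyframes. */
    public BakedAnimation(float[][] keyframes, int stepsPerInterval) {
        if (keyframes.length < 2 || stepsPerInterval < 1) {
            throw new IllegalArgumentException("need >= 2 keyframes and >= 1 step");
        }
        int intervals = keyframes.length - 1;
        frames = new float[intervals * stepsPerInterval + 1][];
        int n = 0;
        for (int k = 0; k < intervals; k++) {
            for (int s = 0; s < stepsPerInterval; s++) {
                float t = (float) s / stepsPerInterval;
                frames[n++] = blend(keyframes[k], keyframes[k + 1], t);
            }
        }
        // The final keyframe closes the sequence.
        frames[n] = keyframes[intervals].clone();
    }

    public int frameCount() {
        return frames.length;
    }

    /** Returns the baked frame nearest to an animation position in [0, 1]. */
    public float[] frameAt(float position) {
        float clamped = Math.max(0f, Math.min(1f, position));
        return frames[Math.round(clamped * (frames.length - 1))];
    }

    // Assumes both keyframes have the same number of components.
    private static float[] blend(float[] a, float[] b, float t) {
        float[] out = new float[a.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = a[i] + (b[i] - a[i]) * t;
        }
        return out;
    }
}
```

At runtime nothing is interpolated any more; you just pick the nearest baked frame. The trade-off is the one mentioned above: memory grows with the number of steps, and the step size is fixed once baking is done.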

Riven · April 26, 2006, 3:49am

Did you know you can embed VAs in your display lists, to take advantage of indexed geometry? (It saves quite some VRAM.)

It can be done like this:

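// set up the vertex array outside the list
glEnableClientState(GL_VERTEX_ARRAY);
glVertexPointer(...);

glNewList(list, GL_COMPILE);   // begin list ('list' is a placeholder id from glGenLists)
glDrawElements(...);           // array data is read when the list is compiled
glEndList();                   // end list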

Note that assigning the VA pointers must not be done inside the list, according to the OpenGL spec.

darkprophet · May 18, 2006, 11:07pm

Not really a performance related feature, but more of a trick…

If you’re making an FPS, disable writing to the depth buffer when drawing your gun, and draw the gun last. This saves you from the horrible effect of the gun poking into the stuff around it (walls, trees, etc.), and it looks reasonable. The only bad point is with shadows: you’ll see the gun’s shadow on the wall meeting the tip of the gun, giving the impression that the gun is right up against the wall. If the player keeps rotating, they would expect the gun to go into the wall, but it doesn’t.

If you keep the gameplay nice, the casual player wouldn’t even notice it in the slightest.

DP