glDrawElements or glInterleavedArrays

So I’ve kinda come to a dilemma in my coding. Does it make sense to use glInterleavedArrays, which would greatly reduce the JNI overhead since it requires a single function call to actually draw stuff, or do I use glDrawElements or similar, since I can lock those arrays, which should give me faster drawing speed?

What are you guys doing? What has been the experience of folks drawing large numbers of triangles?

Something is not right here: glInterleavedArrays is shorthand for a bunch of calls to glVertexPointer, glNormalPointer etc. That is, glInterleavedArrays only specifies vertex arrays, but doesn’t actually draw anything… glDrawElements, on the other hand, does the drawing from the arrays specified.

Now here’s what I do: Use glInterleavedArrays because

  1. The JNI overhead is smaller - only one call to specify all my arrays (almost)
  2. There’s a (theoretical) speedup according to the red book when using interleaved instead of separate arrays.
  3. I change position, normal and color at the same time, so it won’t help me to have arrays stored separately.

To draw I use glMultiDrawElements, because it enables you to draw many separate lists of triangles in one call. (only available as standard GL from OpenGL 1.4 though)
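For illustration, a minimal sketch of the kind of interleaved buffer this implies - the layout (3 colour floats then 3 position floats, matching GL’s C3F_V3F format) and all names here are illustrative, not elias’s actual code:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

// Sketch: build one interleaved direct buffer in glInterleavedArrays'
// GL_C3F_V3F layout - 3 colour floats then 3 position floats per vertex.
public class InterleavedBufferSketch {
    static final int FLOATS_PER_VERTEX = 6; // 3 colour + 3 position

    static FloatBuffer build(float[][] colours, float[][] positions) {
        FloatBuffer buf = ByteBuffer
                .allocateDirect(positions.length * FLOATS_PER_VERTEX * 4)
                .order(ByteOrder.nativeOrder()) // GL wants native byte order
                .asFloatBuffer();
        for (int i = 0; i < positions.length; i++) {
            buf.put(colours[i]);   // C3F part
            buf.put(positions[i]); // V3F part
        }
        buf.flip();
        return buf;
    }
}
```

A single buffer like this is then handed over in one call - at the C level, `glInterleavedArrays(GL_C3F_V3F, 0, buffer)` - which is where the JNI saving comes from.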

  • elias

Oh man I was very tired, sorry it was late ;D

The question should have been about drawing using glDrawElements with GL.TRIANGLES or GL.TRIANGLE_STRIP. Given the processing that might be required to compute triangle strips, is the overhead of ‘essentially’ multiple calls to the driver from drawing individual triangles a lot worse than generating triangle strips and rendering those (some of the geometry is dynamic)?

Don’t know really - the overhead is offset a lot using glMultiDrawElements as I said. In our stripifier a threshold is defined so that every strip shorter than, say, 10 triangles is put in a separate list of single triangles.
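A minimal sketch of that thresholding step - the representation (strips as index arrays) and all names are assumptions, not the actual stripifier:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: strips below a triangle-count threshold are unwound back into a
// flat GL_TRIANGLES index list; longer strips stay strips.
public class StripThreshold {
    static final int MIN_TRIANGLES = 10; // the threshold from the post

    static List<int[]> filter(List<int[]> strips, List<Integer> singles) {
        List<int[]> kept = new ArrayList<>();
        for (int[] strip : strips) {
            int tris = strip.length - 2; // a strip of n indices = n-2 triangles
            if (tris >= MIN_TRIANGLES) {
                kept.add(strip);
            } else {
                for (int i = 0; i < tris; i++) {
                    // alternate winding so every unwound triangle faces the same way
                    if ((i & 1) == 0) {
                        singles.add(strip[i]); singles.add(strip[i + 1]); singles.add(strip[i + 2]);
                    } else {
                        singles.add(strip[i + 1]); singles.add(strip[i]); singles.add(strip[i + 2]);
                    }
                }
            }
        }
        return kept;
    }
}
```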

  • elias

There is no need to send triangle strips any more. Because of vertex caching there is no performance advantage to using strips, and a great deal of hassle in working them out instead of just chucking triangles at the card.

(And don’t use interleavedarrays any more, it’s kinda obsolete)

Cas :slight_smile:

No interleaved arrays ??? What would I use as opposed to that? You have piqued my interest 10 fold because that’s how I’ve always transmitted data to the card :smiley:

Simply specify the arrays separately, i.e.:

glVertexPointer(…);
glNormalPointer(…);

each having their own buffer (and address).

  • elias

It goes without saying that they have to overlap to look like interleaved arrays :slight_smile: Having said that I’m not sure it has any effect at all on AGP transfers.
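A sketch of what “overlapping” means in practice - one buffer, with each pointer call given the full stride and its own starting offset. The layout and names here are illustrative:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// One buffer, three overlapping "separate" arrays: each attribute gets the
// same stride but a different start offset - exactly what glInterleavedArrays
// sets up behind the scenes.
// Assumed layout: position (3 floats) | normal (3 floats) | colour (4 bytes).
public class OverlappingPointers {
    static final int POS_OFFSET    = 0;
    static final int NORMAL_OFFSET = 3 * 4;                  // 12: after position
    static final int COLOR_OFFSET  = NORMAL_OFFSET + 3 * 4;  // 24: after normal
    static final int STRIDE        = COLOR_OFFSET + 4;       // 28 bytes per vertex

    static ByteBuffer allocate(int vertexCount) {
        return ByteBuffer.allocateDirect(vertexCount * STRIDE)
                         .order(ByteOrder.nativeOrder());
    }
    // At the C level the calls would then be (not runnable here):
    //   glVertexPointer(3, GL_FLOAT, STRIDE, base + POS_OFFSET);
    //   glNormalPointer(GL_FLOAT, STRIDE, base + NORMAL_OFFSET);
    //   glColorPointer(4, GL_UNSIGNED_BYTE, STRIDE, base + COLOR_OFFSET);
}
```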

Cas :slight_smile:

Actually I left that approach for glInterleavedArrays. Why would I want to specify each type separately? A lot of extra work for no gain whereas I get flexible vertex format for free using glInterleavedArrays.

Because the more advanced rendering techniques require some pretty strange combinations of vertex data, and often a vertex doesn’t fit nicely into 32 bytes and so you have to go to 64 etc. IOW, glInterleavedArrays suits a small number of basic rendering operations for single-pass techniques, which are rapidly becoming completely obsolete.
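As a rough illustration of the stride padding Cas describes - rounding up to a 32-byte granularity is my assumption, though it fits both the 32→64 jump here and the 96-byte stride mentioned later in the thread:

```java
// Sketch: when a vertex no longer fits its stride, round the stride up to
// the next 32-byte boundary and waste the tail, rather than pack tightly.
public class VertexStride {
    static int paddedStride(int rawBytes) {
        return ((rawBytes + 31) / 32) * 32; // round up to a multiple of 32
    }
}
```

For example, position (12) + normal (12) + two texcoord sets (16) + colour (4) = 44 bytes of real data would get a 64-byte stride.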

Cas :slight_smile:

I haven’t run across these rendering techniques yet. At the moment I’m doing things with fragment shaders and vertex shaders and haven’t run across any need to use any specialized techniques that required me to do something funky to pack vertices. Again, my interest is greatly piqued ??? Where can I find more info about these techniques?

Well, look up “bump mapping” for one. The terrain demo used between 5-7 passes to render, and ISTR it had a 96-byte stride for its vertices, each pass requiring a different set of normals, texture coordinates, colours and secondary colours specified (but of course, with the same coordinates).

Even with shaders you’re still going to have to use multitexture coordinates, and those aren’t supported by interleaved arrays either.

I suspect once upon a time a software driver could have made some reasonable optimisations with interleaved arrays but now it’s more of a hindrance than a help.

Cas :slight_smile:

Hmph. Does anyone know any sites that explain which parts of OpenGL aren’t worth bothering with and which are A Good Thing? ???

Trial and error goes some way to making things fast, but you can never guarantee that you’re doing the best you can or that you haven’t just hit a fast rendering path through your particular accelerator. I recently discovered that setting the GL.TEXTURE_ENV_MODE to anything but GL.MODULATE on my S3 will cause the program to run like a slug in treacle. Why? No idea. Couldn’t find any info on the web about it either. I would have thought GL.DECAL would be the fastest, but no.

Everything used to be about triangles. Triangles triangles triangles. Now Quads are acceptable, even recommended? gl.color3fv isn’t worth it, stick to gl.color3f? A textured quad is better than a call to gl.bitmap? Lighting is slow? Next I’m expecting to hear that glPushMatrix() shouldn’t be used anymore or something… ::slight_smile:

Interesting. I don’t have 5-7 passes for my rendering - just 2, and each pass used the same set of normals, texture coordinates and colors. It may be that I have not run across a situation where using glInterleavedArrays was a disadvantage. In the Cg stuff I’ve done so far all of my in parameters were single sets of vertex or texture coords, and the register combiners were used to generate any additional coordinates from the maps provided.

There isn’t a lot of work in not using glInterleavedArrays - it just gets around the JNI overhead.

There’s nowt wrong with interleavedarrays, it’s just getting obsolete. Might not even be in OpenGL2.0, can’t remember if it is or not.

Charlie - don’t use quads, because the driver just turns them into triangles anyway. What’s more you can multipass triangles on top of triangles without troubles but try drawing quads and then triangles on top and you’ll get z-buffer problems. So better start off with triangles and then you’ve only got 1 kind of primitive to worry about. Use bytes for colours, not floats, to avoid a costly float->byte conversion on the client end. And avoid glPushMatrix() like the plague - use it only when necessary, to shove large pieces of geometry somewhere, like trees - not sprites - because it stalls the GPU pipelines and flushes the vertex caches. There you go.
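A sketch of the byte-colour conversion in Java terms - the helper is hypothetical; the point is to pay the float→byte cost once, at buffer-fill time, instead of per vertex in the driver:

```java
// Sketch: convert a 0.0..1.0 colour component to a byte once, up front.
// Java bytes are signed, but the bit pattern of (byte)(f * 255) is the
// unsigned value GL expects for GL_UNSIGNED_BYTE colours.
public class ByteColours {
    static byte toColourByte(float f) {
        if (f < 0f) f = 0f;             // clamp out-of-range input
        if (f > 1f) f = 1f;
        return (byte) (f * 255f + 0.5f); // round; 1.0f -> 0xFF (-1 as signed)
    }
}
```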

Cas :slight_smile:

[quote]Charlie - don’t use quads, because the driver just turns them into triangles anyway.
[/quote]
I tend to use triangle fans as a matter of course (less undefined operations if your vertices aren’t quite in-plane ;)). But didn’t you mention that quads fit nicely into current-generation vertex caches? Maybe I imagined it…

[quote]Use bytes for colours, not floats, to avoid a costly float->byte conversion on the client end.
[/quote]
Hrm, okay, that makes sense. I much prefer thinking in the range 0.0…1.0, but if I have to start thinking in 0…255 that’s okay!

[quote]And avoid glPushMatrix() like the plague - use it only when necessary, to shove large pieces of geometry somewhere, like trees - not sprites - because it stalls the GPU pipelines and flushes the vertex caches.
[/quote]
GHA! That was supposed to be a joke! >:( :o

Fortunately my current project-of-the-minute (it’s changed a lot recently ;D) is single screen and 2D so I should be able to avoid them pretty easily. Is manipulating the matrix stack really a no-no now? I’m amazed!

[quote]There you go.

Cas :slight_smile:
[/quote]
Much appreciated! I really wish this kind of knowledge was published somewhere. I know it depends very much on platform-dependent features, but there should be a Good Practice guide at least.

Cheers,
Charlie.

In the latest iteration of the specification it is there. It doesn’t really matter how the vertices are staged - to the vertex or fragment program, the vertices or texture coordinates appear as a steady stream of data, so whether or not you’ve interleaved your arrays or specified them separately they will just appear in the in parameter to your shader. No biggie. I now have two vertex buffer types, one that handles interleaved data that I use for everything and another one that has it all separated out for when I get to something that can’t handle it. Currently I haven’t had any issues with register combiners with interleaved arrays. My current bump mapping is now done in one pass using register combiners on GeForce4 caliber hardware.

Ah, you lucky people with big fast video cards :confused:

Cas :slight_smile:

Heh, if you will ‘code for hardware’ I’m sure I can arrange to get you a GeForce FX for a certain special piece of rendering code :wink:

[quote]Use bytes for colours, not floats, to avoid a costly float->byte conversion on the client end.
[/quote]
Sorry, just been wondering about this.

Doesn’t the OpenGL pipeline usually use floats for colour values already? Won’t this just cause the conversion to be done in the driver? Wouldn’t using floats from the start avoid this?

If you’re using bytes for colours, are you then only using values from 0 to 127 and passing them to OpenGL as (signed) GL.BYTEs, or doing something much more devious? :wink:

I’m still trying to get my head around all these signed/unsigned conversions… I accept now that it’s better they’re not in there, but it don’t half give me a headache… >:( After many years of programming complex software in Java, I’m finding I’m having to go back to the beginning and work out exactly how a cast works… ;D
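For what it’s worth, the signed/unsigned round trip being asked about looks like this in plain Java (illustrative only - you don’t restrict yourself to 0..127, you keep the full 0..255 range and let the bit pattern do the work):

```java
// Sketch: Java has no unsigned byte, but a cast keeps the bit pattern,
// so values 0..255 can be stored in a byte and recovered with '& 0xFF'.
public class UnsignedBytes {
    public static void main(String[] args) {
        int colour = 200;              // intended unsigned value
        byte stored = (byte) colour;   // bits 11001000 kept, reads as -56
        int recovered = stored & 0xFF; // mask re-interprets as unsigned
        System.out.println(stored + " " + recovered);
    }
}
```

GL declared with GL.UNSIGNED_BYTE never sees the “negative” value - it just reads the 8 bits.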