Buffer Objects with glDrawElements

anarchotron · September 25, 2005, 5:02pm

Howdy.

I have an engine that, up until today, has been using Vertex Arrays for its vertex data and glDrawElements() to pump my Index Buffers into GL for primitive rendering.

I updated my Vertex Arrays to use Vertex Buffer Objects (VBO) and got a noticeable speed improvement (~20%). Right now I just set everything to use GL_STATIC_DRAW_ARB and I don’t permit code to modify the Vertex Arrays after initialization.

I applied a similar technique to my Index Buffers. It renders properly, but at roughly half the speed it did before. I’m using glDrawElements rather than glDrawRangeElements because in every case:

I want to render using the entire index buffer
I am already managing the data format of the index based on the range of index values (ByteBuffer, ShortBuffer, IntBuffer, etc…) so I am already passing in the most efficient data format for my indices.
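The format selection described in the second point could look roughly like the sketch below. This is not the poster's actual code: the class name `IndexFormat` and the method `pack` are hypothetical, and the sketch assumes the indices start out as a plain `int[]`.

```java
import java.nio.Buffer;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.IntBuffer;
import java.nio.ShortBuffer;

// Hypothetical helper: packs indices into the narrowest buffer type
// that can represent the largest index value.
final class IndexFormat {
    static Buffer pack(int[] indices) {
        int max = 0;
        for (int i : indices) {
            if (i < 0) {
                throw new IllegalArgumentException("negative index: " + i);
            }
            max = Math.max(max, i);
        }

        if (max <= 0xFF) {
            // Fits in an unsigned byte.
            ByteBuffer bytes = ByteBuffer.allocateDirect(indices.length);
            for (int i : indices) {
                bytes.put((byte) i);
            }
            bytes.flip();
            return bytes;
        } else if (max <= 0xFFFF) {
            // Fits in an unsigned short.
            ShortBuffer shorts = ByteBuffer.allocateDirect(indices.length * 2)
                    .order(ByteOrder.nativeOrder())
                    .asShortBuffer();
            for (int i : indices) {
                shorts.put((short) i);
            }
            shorts.flip();
            return shorts;
        } else {
            IntBuffer ints = ByteBuffer.allocateDirect(indices.length * 4)
                    .order(ByteOrder.nativeOrder())
                    .asIntBuffer();
            ints.put(indices);
            ints.flip();
            return ints;
        }
    }
}
```

The narrowest type minimizes memory and transfer size, but it is not necessarily the format the hardware handles fastest.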

Incidentally I did switch my glDrawElements() calls over to the equivalent glDrawRangeElements() with start=0, end=range, count=count but I still get the same poor performance.

Has anyone had problems using Index Buffers with Buffer Objects? Please let me know. I’ll paste in a code snippet below.


    private boolean loaded = false;

Here the bufferId has already been generated. My ‘indexBuffer’ member is stored as a Buffer reference but was initialized with a ByteBuffer, ShortBuffer, or IntBuffer as appropriate when the Index Buffer itself was initialized.


    public void loadBuffer()
    {
        loaded = true;
        ARBBufferObject.glBindBufferARB(ARBVertexBufferObject.GL_ELEMENT_ARRAY_BUFFER_ARB, bufferId);

        switch (getType())
        {
            case Type.T_byte:
                ARBBufferObject.glBufferDataARB(ARBVertexBufferObject.GL_ELEMENT_ARRAY_BUFFER_ARB,
                        (ByteBuffer) indexBuffer, ARBBufferObject.GL_STATIC_DRAW_ARB);
                break;
            case Type.T_short:
                ARBBufferObject.glBufferDataARB(ARBVertexBufferObject.GL_ELEMENT_ARRAY_BUFFER_ARB,
                        (ShortBuffer) indexBuffer, ARBBufferObject.GL_STATIC_DRAW_ARB);
                break;
            case Type.T_int:
                ARBBufferObject.glBufferDataARB(ARBVertexBufferObject.GL_ELEMENT_ARRAY_BUFFER_ARB,
                        (IntBuffer) indexBuffer, ARBBufferObject.GL_STATIC_DRAW_ARB);
                break;
        }

        int err = GL11.glGetError();
        if (err != GL11.GL_NO_ERROR)
        {
            Warning.emit("IndexArray load buffer GL Error: " + GLU.gluErrorString(err));
        }

    }

And finally here’s the code path for rendering. The Vertex Buffers have already been set up; all we need to do here is blast the indices. If the ‘loaded’ member is false then we use the old code path, which performs at twice the speed of the new code path.

The ‘mode’ variable is of course the render mode, GL_TRIANGLES, GL_QUADS, etc…


    public void drawElements(int mode)
    {
        if (!loaded)
        {
            ARBBufferObject.glBindBufferARB(ARBVertexBufferObject.GL_ELEMENT_ARRAY_BUFFER_ARB, 0);

            switch (getType())
            {
                case Type.T_byte:
                    GL11.glDrawElements(mode, (ByteBuffer) indexBuffer);
                    break;
                case Type.T_short:
                    GL11.glDrawElements(mode, (ShortBuffer) indexBuffer);
                    break;
                case Type.T_int:
                    GL11.glDrawElements(mode, (IntBuffer) indexBuffer);
                    break;
            }
        }
        else
        {
            ARBBufferObject.glBindBufferARB(ARBVertexBufferObject.GL_ELEMENT_ARRAY_BUFFER_ARB, bufferId);

            switch (getType())
            {
                case Type.T_byte:
                    GL11.glDrawElements(mode, count, GL11.GL_UNSIGNED_BYTE, 0);
                    break;
                case Type.T_short:
                    GL11.glDrawElements(mode, count, GL11.GL_UNSIGNED_SHORT, 0);
                    break;
                case Type.T_int:
                    GL11.glDrawElements(mode, count, GL11.GL_UNSIGNED_INT, 0);
                    break;
            }

        }
    }
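As a side note on the code above, the per-draw switch could be avoided by resolving the GL index type once when the index buffer is created. The following is only a sketch under that assumption: `IndexType` and `glTypeOf` are hypothetical names, and the three constants are written out with their OpenGL header values so the example compiles without LWJGL.

```java
import java.nio.Buffer;
import java.nio.ByteBuffer;
import java.nio.IntBuffer;
import java.nio.ShortBuffer;

// Hypothetical helper: maps the NIO buffer type to the matching GL index type.
final class IndexType {
    // Values from the OpenGL headers, duplicated here to keep the sketch standalone.
    static final int GL_UNSIGNED_BYTE = 0x1401;
    static final int GL_UNSIGNED_SHORT = 0x1403;
    static final int GL_UNSIGNED_INT = 0x1405;

    static int glTypeOf(Buffer indices) {
        if (indices instanceof ByteBuffer) {
            return GL_UNSIGNED_BYTE;
        }
        if (indices instanceof ShortBuffer) {
            return GL_UNSIGNED_SHORT;
        }
        if (indices instanceof IntBuffer) {
            return GL_UNSIGNED_INT;
        }
        throw new IllegalArgumentException(
                "unsupported index buffer type: " + indices.getClass().getName());
    }
}
```

With the result cached in a field (say `glType`, also a hypothetical name), the buffer-object path would reduce to a single `GL11.glDrawElements(mode, count, glType, 0)` call. This is a tidiness change only and is not expected to affect the performance problem described.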

Thanks for any help you can provide. It is quite baffling that this change would actually decrease performance. Pushing the indices through JNI into OpenGL has always been the greatest bottleneck in my app, as indicated by JProbe. I had hoped that this Buffer Object usage would allow OpenGL to keep my indices on the hardware and improve performance.

Thanks!

Spasi · September 25, 2005, 5:48pm

I don’t have support for UNSIGNED_BYTE indices in Marathon, but I don’t remember why I made that decision. ;D

I’m guessing that UNSIGNED_BYTE is not an optimal format for vertex indices. GPUs should be optimized for UNSIGNED_SHORTs, so try replacing byte indices with short indices, even if the model has fewer than 256 vertices.
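Switching from byte to short indices might be done with a conversion along these lines. This is a minimal sketch, not code from Marathon or from the original poster; `IndexWidening` and `toShorts` are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.ShortBuffer;

// Hypothetical helper: widens unsigned byte indices to unsigned short indices.
final class IndexWidening {
    static ShortBuffer toShorts(ByteBuffer bytes) {
        // Work on a duplicate so the caller's position and limit are untouched.
        ByteBuffer src = bytes.duplicate();
        ShortBuffer out = ByteBuffer.allocateDirect(src.remaining() * 2)
                .order(ByteOrder.nativeOrder())
                .asShortBuffer();
        while (src.hasRemaining()) {
            // Java bytes are signed; mask so that 0xFF becomes 255 rather than -1.
            out.put((short) (src.get() & 0xFF));
        }
        out.flip();
        return out;
    }
}
```

The resulting buffer would then be uploaded and drawn with GL_UNSIGNED_SHORT in place of GL_UNSIGNED_BYTE.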

anarchotron · September 25, 2005, 7:49pm

Good call.

I tried that, and running with SHORTs instead of BYTEs gives the same performance as without buffer object indices. For grins I tried forcing everything to INTs and got the same result.

So at least now the performance is on par with what it was before. I’m a bit surprised that performance didn’t actually increase with index Buffer Objects. As I mentioned before, performance did increase noticeably when I converted my Vertex Arrays to VBOs.

Anyway, thanks for the help!

Spasi · September 25, 2005, 11:10pm

Cool. Actually, I’ve seen this before with bone indices. Lousy performance with bytes, but goes full speed with shorts (although only 20-30 bones are necessary).

Hmm, I’m seeing this in Marathon too, but I’m sure the data transfer is not the bottleneck. When indices are accessed, the whole rendering pipeline goes to work and you can’t be sure what you’re actually measuring. It’s really difficult for geometry transfer to become a bottleneck for modern GPUs, especially with VBO.