Howdy.
I have an engine that, up until today, has been using Vertex Arrays for various vertex data, and glDrawElements() to pump my Index Buffers into GL for primitive rendering.
I updated my Vertex Arrays to use Vertex Buffer Objects (VBO) and got a noticeable speed improvement (~20%). Right now I just set everything to use GL_STATIC_DRAW_ARB and I don’t permit code to modify the Vertex Arrays after initialization.
I applied a similar technique to my Index Buffers. With this change everything renders properly, but at roughly half the previous speed. I’m using glDrawElements rather than glDrawRangeElements because in every case:
- I want to render using the entire index buffer
- I am already managing the data format of the index based on the range of index values (ByteBuffer, ShortBuffer, IntBuffer, etc…) so I am already passing in the most efficient data format for my indices.
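To make the second point concrete, here is a minimal sketch of picking the narrowest index format from the largest index value. It uses only java.nio; the names `IndexFormat` and `pack` are made up for illustration and are not the engine's actual code. It assumes indices are non-negative and that the buffers should be direct and in native byte order, which is what LWJGL expects.

```java
import java.nio.Buffer;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.IntBuffer;
import java.nio.ShortBuffer;

// Hypothetical helper, for illustration only.
final class IndexFormat {
    /** Packs indices into the narrowest unsigned type that can hold the largest index. */
    static Buffer pack(int[] indices) {
        int max = 0;
        for (int i : indices) {
            if (i < 0) throw new IllegalArgumentException("negative index: " + i);
            max = Math.max(max, i);
        }
        // 1 byte up to 255, 2 bytes up to 65535, otherwise 4 bytes.
        int bytesPerIndex = max <= 0xFF ? 1 : (max <= 0xFFFF ? 2 : 4);
        ByteBuffer raw = ByteBuffer.allocateDirect(indices.length * bytesPerIndex)
                                   .order(ByteOrder.nativeOrder());
        if (bytesPerIndex == 1) {
            for (int i : indices) raw.put((byte) i);   // stored as unsigned byte
            raw.flip();
            return raw;
        } else if (bytesPerIndex == 2) {
            ShortBuffer shorts = raw.asShortBuffer();
            for (int i : indices) shorts.put((short) i); // stored as unsigned short
            shorts.flip();
            return shorts;
        } else {
            IntBuffer ints = raw.asIntBuffer();
            for (int i : indices) ints.put(i);
            ints.flip();
            return ints;
        }
    }
}
```

The returned buffer's runtime type (ByteBuffer, ShortBuffer, or IntBuffer) then corresponds to GL_UNSIGNED_BYTE, GL_UNSIGNED_SHORT, or GL_UNSIGNED_INT respectively.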
Incidentally I did switch my glDrawElements() calls over to the equivalent glDrawRangeElements() with start=0, end=range, count=count but I still get the same poor performance.
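For reference, the start and end arguments of glDrawRangeElements are the smallest and largest index values referenced by the call (inclusive), while count is the number of indices. A minimal sketch of computing that range on the Java side follows; `IndexRange` is a hypothetical name, and it assumes an IntBuffer whose values all fit in a signed int.

```java
import java.nio.IntBuffer;

// Hypothetical helper, for illustration only.
final class IndexRange {
    /**
     * Returns {min, max} of the index values between the buffer's position
     * and limit. Uses absolute gets, so the buffer's position is not moved.
     */
    static int[] of(IntBuffer indices) {
        if (!indices.hasRemaining()) {
            throw new IllegalArgumentException("empty index buffer");
        }
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (int i = indices.position(); i < indices.limit(); i++) {
            int v = indices.get(i);
            if (v < min) min = v;
            if (v > max) max = v;
        }
        return new int[] {min, max};
    }
}
```

For a static index buffer this only needs to run once at load time, with the result cached next to the count.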
Has anyone had problems using Index Buffers with Buffer Objects? Please let me know. I’ll paste in a code snippet below.
private boolean loaded = false;
Here the bufferId has already been generated. My ‘indexBuffer’ member is stored as a Buffer reference but was initialized with a ByteBuffer, ShortBuffer, or IntBuffer as appropriate when the Index Buffer itself was initialized.
public void loadBuffer()
{
    loaded = true;
    ARBBufferObject.glBindBufferARB(ARBVertexBufferObject.GL_ELEMENT_ARRAY_BUFFER_ARB, bufferId);
    switch (getType())
    {
        case Type.T_byte:
            ARBBufferObject.glBufferDataARB(ARBVertexBufferObject.GL_ELEMENT_ARRAY_BUFFER_ARB,
                (ByteBuffer) indexBuffer, ARBBufferObject.GL_STATIC_DRAW_ARB);
            break;
        case Type.T_short:
            ARBBufferObject.glBufferDataARB(ARBVertexBufferObject.GL_ELEMENT_ARRAY_BUFFER_ARB,
                (ShortBuffer) indexBuffer, ARBBufferObject.GL_STATIC_DRAW_ARB);
            break;
        case Type.T_int:
            ARBBufferObject.glBufferDataARB(ARBVertexBufferObject.GL_ELEMENT_ARRAY_BUFFER_ARB,
                (IntBuffer) indexBuffer, ARBBufferObject.GL_STATIC_DRAW_ARB);
            break;
    }
    int err = GL11.glGetError();
    if (err != GL11.GL_NO_ERROR)
    {
        Warning.emit("IndexArray load buffer GL Error: " + GLU.gluErrorString(err));
    }
}
And finally, here’s the code path for rendering. The Vertex Buffers have already been set up; all we need to do here is blast the indices. If the ‘loaded’ member is false then we use the old code path, which runs at twice the speed of the new code path.
The ‘mode’ variable is of course the render mode, GL_TRIANGLES, GL_QUADS, etc…
public void drawElements(int mode)
{
    if (!loaded)
    {
        ARBBufferObject.glBindBufferARB(ARBVertexBufferObject.GL_ELEMENT_ARRAY_BUFFER_ARB, 0);
        switch (getType())
        {
            case Type.T_byte:
                GL11.glDrawElements(mode, (ByteBuffer) indexBuffer);
                break;
            case Type.T_short:
                GL11.glDrawElements(mode, (ShortBuffer) indexBuffer);
                break;
            case Type.T_int:
                GL11.glDrawElements(mode, (IntBuffer) indexBuffer);
                break;
        }
    }
    else
    {
        ARBBufferObject.glBindBufferARB(ARBVertexBufferObject.GL_ELEMENT_ARRAY_BUFFER_ARB, bufferId);
        switch (getType())
        {
            case Type.T_byte:
                GL11.glDrawElements(mode, count, GL11.GL_UNSIGNED_BYTE, 0);
                break;
            case Type.T_short:
                GL11.glDrawElements(mode, count, GL11.GL_UNSIGNED_SHORT, 0);
                break;
            case Type.T_int:
                GL11.glDrawElements(mode, count, GL11.GL_UNSIGNED_INT, 0);
                break;
        }
    }
}
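For anyone wanting to reproduce the comparison between the two code paths, a rough timing harness could look like the sketch below. `FrameTimer` is a hypothetical name and this is not how the numbers above were measured. Because GL calls are queued by the driver, the task passed in would presumably need to end with a glFinish() for the wall-clock time to reflect the actual draw cost.

```java
// Hypothetical helper, for illustration only.
final class FrameTimer {
    /**
     * Runs the task the given number of times and returns the mean
     * wall-clock time per run in milliseconds.
     */
    static double averageMillis(Runnable task, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            // For GL work, the task should block until the GPU is done
            // (e.g. by calling glFinish()), otherwise only submission is timed.
            task.run();
        }
        long elapsedNanos = System.nanoTime() - start;
        return elapsedNanos / 1e6 / runs;
    }
}
```

Timing each path over a few hundred runs, with the ‘loaded’ flag forced on and then off, keeps everything else in the frame identical between the two measurements.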
Thanks for any help you can provide. It is quite baffling that this change would actually decrease performance. Pushing the indices through JNI into OpenGL has always been the greatest bottleneck in my app, as indicated by JProbe. I had hoped that this Buffer Object usage would allow OpenGL to keep my indices on the hardware and improve performance.
Thanks!