In my constant pursuit of better performance for my engine, I discovered something very surprising: vertex arrays aren’t actually any faster than immediate mode rendering.
For shits and giggles, I removed the call to the vertex arrays and called the immediate mode renderer method instead. My FPS actually increased by 1!
Here is the vertex array code:
// Vertices
gl.glEnableClientState(GL.GL_VERTEX_ARRAY);
gl.glVertexPointer(3, // 3 components per vertex (x,y,z)
        GL.GL_FLOAT, 0, vertex_buffer);
// Color 3 (OPTIMIZATION ASSUMPTION: Only 1 color per object)
if (color_buffer3 != null) {
    gl.glColor3f(color_buffer3.get(0), color_buffer3.get(1), color_buffer3.get(2));
}
// Color 4
if (color_buffer4 != null) {
    gl.glEnableClientState(GL.GL_COLOR_ARRAY);
    gl.glColorPointer(4, // 4 components per color (r,g,b,a)
            GL.GL_FLOAT, 0, color_buffer4);
}
// Texture coordinates 2D
if (coords_buffer1 != null) {
    gl.glEnableClientState(GL.GL_TEXTURE_COORD_ARRAY);
    gl.glTexCoordPointer(2, // 2 components per coord (s,t)
            GL.GL_FLOAT, 0, coords_buffer1);
}
// Normals
if (normal_buffer != null) {
    gl.glEnableClientState(GL.GL_NORMAL_ARRAY);
    gl.glNormalPointer(GL.GL_FLOAT, 0, normal_buffer);
}
// 3 floats per vertex, so limit()/3 is the vertex count
gl.glDrawArrays(GL.GL_TRIANGLES, 0, vertex_buffer.limit() / 3);
// Reset client state
if (normal_buffer != null)
    gl.glDisableClientState(GL.GL_NORMAL_ARRAY);
if (coords_buffer1 != null)
    gl.glDisableClientState(GL.GL_TEXTURE_COORD_ARRAY);
// GL_COLOR_ARRAY is only enabled on the 4-component path above
if (color_buffer4 != null)
    gl.glDisableClientState(GL.GL_COLOR_ARRAY);
gl.glDisableClientState(GL.GL_VERTEX_ARRAY);
This is the immediate mode renderer code:
// Render
gl.glBegin(GL.GL_TRIANGLES);
// engine only uses one color per geo object, so all verts should
// have the same color, UNLESS multi_color is explicitly set
if (color_buffer3 != null && !multi_color) {
    gl.glColor3f(color_buffer3.get(0), color_buffer3.get(1),
            color_buffer3.get(2));
}
if (color_buffer4 != null && !multi_color) {
    gl.glColor4f(color_buffer4.get(0), color_buffer4.get(1),
            color_buffer4.get(2), color_buffer4.get(3));
    colors4 += 4;
}
for (int i = 0; i < vertex_buffer.limit(); i += 3) {
    // per-vertex color, only when multi_color is set
    if (multi_color) {
        if (color_buffer3 != null) {
            gl.glColor3f(color_buffer3.get(i), color_buffer3.get(i + 1),
                    color_buffer3.get(i + 2));
        }
    }
    // check the same buffer that is read below
    if (normal_buffer != null)
        gl.glNormal3f(normal_buffer.get(i), normal_buffer.get(i + 1), normal_buffer.get(i + 2));
    if (app != null && app.getTexture() != null)
        if (coords_buffer1 != null) {
            // texture coords advance 2 floats per vertex, tracked separately from i
            gl.glTexCoord2f(coords_buffer1.get(texInd), coords_buffer1
                    .get(texInd + 1));
            texInd += 2;
        }
    gl.glVertex3f(vertex_buffer.get(i), vertex_buffer.get(i + 1), vertex_buffer.get(i + 2));
}
gl.glEnd();
Both paths render the same thing and read their data from the same buffers (the immediate mode path can actually handle per-vertex colors, but I don’t use them anyway), yet the immediate mode version runs 1 FPS faster!
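To put a 1 FPS difference in perspective, it helps to convert it to time per frame, since what 1 FPS is worth depends on the baseline. At 60 FPS it comes to roughly 0.27 ms per frame. A minimal sketch follows; the class name, method names and baseline frame rates are made up for illustration and are not measurements from the engine:

```java
// Hypothetical helper: converts frame rates to per-frame time.
class FrameTime {
    // Milliseconds spent on one frame at the given frame rate.
    static double frameMillis(double fps) {
        return 1000.0 / fps;
    }

    // Per-frame time saved when the frame rate goes from fpsBefore to fpsAfter.
    static double savedMillis(double fpsBefore, double fpsAfter) {
        return frameMillis(fpsBefore) - frameMillis(fpsAfter);
    }

    public static void main(String[] args) {
        // Assumed baselines for illustration only.
        double[] baselines = {30.0, 60.0, 200.0};
        for (double fps : baselines) {
            System.out.printf("%.0f -> %.0f FPS saves %.3f ms per frame%n",
                    fps, fps + 1, savedMillis(fps, fps + 1));
        }
    }
}
```

The higher the baseline frame rate, the less time a single extra frame per second represents, so timing the draw call itself over many frames is probably a more reliable comparison than watching the FPS counter.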
Is this an evil of NIO? Is it that, under the hood, copying the data from the NIO buffer to the card takes just as long as calling the glXXXXX methods would anyway?
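One thing worth ruling out on the NIO side is how the buffers are allocated, since the post doesn't show where vertex_buffer comes from. A direct buffer in native byte order is the kind the JVM can hand to native code without converting it first. Below is a minimal sketch using only java.nio; the class and method names are hypothetical and this is not the engine's actual allocation code:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

// Hypothetical helper: builds a direct, native-order FloatBuffer from a float array.
class DirectBufferCheck {
    static FloatBuffer newDirectFloatBuffer(float[] data) {
        FloatBuffer buf = ByteBuffer
                .allocateDirect(data.length * 4) // 4 bytes per float
                .order(ByteOrder.nativeOrder())  // match the platform's byte order
                .asFloatBuffer();
        buf.put(data);
        buf.flip(); // position back to 0, limit = number of floats written
        return buf;
    }

    public static void main(String[] args) {
        // One triangle: 3 vertices, 3 floats each.
        float[] triangle = {0f, 0f, 0f, 1f, 0f, 0f, 0f, 1f, 0f};
        FloatBuffer vertices = newDirectFloatBuffer(triangle);
        System.out.println("direct: " + vertices.isDirect());
        System.out.println("native order: " + (vertices.order() == ByteOrder.nativeOrder()));
        System.out.println("vertex count: " + vertices.limit() / 3);
    }
}
```

Checking isDirect() and order() on the existing buffers would show whether they already match this setup. If they do, the allocation itself is unlikely to be the cause of the difference.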