VBO help

Hi all.
I am developing a benchmark to find out whether display lists or VBOs are the better choice for some future private projects.

The benchmark is quite simple: a window in which a static model of about 4000 triangles is drawn multiple times.
First I stored my model data in a display list and called that list 1000 times per frame in my render method. I got about 37 fps.

Then I tried to draw the model with VBOs, but I am quite new to them, so I think I have got something wrong, because drawing 1000 models on screen gives me only 2 to 5 fps.

I followed some tutorials to implement VBOs and used the following code in my VBO model draw method:


gl.glEnableClientState(GL.GL_VERTEX_ARRAY);		// Enable Vertex Arrays
gl.glEnableClientState(GL.GL_TEXTURE_COORD_ARRAY);	// Enable Texture Coord Arrays
gl.glBindBufferARB(GL.GL_ARRAY_BUFFER_ARB, VBOTexCoords[0]);
            
gl.glTexCoordPointer(2, GL.GL_FLOAT, 0, 0);		// Set The TexCoord Pointer To The TexCoord Buffer
gl.glBindBufferARB(GL.GL_ARRAY_BUFFER_ARB, VBOVertices[0]);
            
gl.glVertexPointer(3, GL.GL_FLOAT, 0, 0);
gl.glEnable(GL.GL_TEXTURE_2D);
gl.glDrawArrays(GL.GL_TRIANGLE_STRIP, 0, getTrianglesNumber()*3);
gl.glDisableClientState(GL.GL_VERTEX_ARRAY);					// Disable Vertex Arrays
gl.glDisableClientState(GL.GL_TEXTURE_COORD_ARRAY);	

Did I do something wrong, or is it normal for performance to drop significantly with VBOs when there are many triangles?
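
For scale, the two frame rates can be turned into triangle throughput. This is only back-of-the-envelope arithmetic on the figures quoted above (about 4000 triangles per model, 1000 models per frame); the class and method names are made up for the illustration.

```java
// Hypothetical helper: converts the benchmark figures into triangles per second.
final class BenchmarkThroughput {
    /** Triangles submitted per second at a given frame rate. */
    static long trianglesPerSecond(int trianglesPerModel, int modelsPerFrame, int fps) {
        return (long) trianglesPerModel * modelsPerFrame * fps;
    }

    /** Average time spent on one frame, in milliseconds. */
    static double millisPerFrame(double fps) {
        return 1000.0 / fps;
    }

    public static void main(String[] args) {
        System.out.println("display list, 37 fps: " + trianglesPerSecond(4000, 1000, 37) + " tri/s");
        System.out.println("VBO, 2 fps: " + trianglesPerSecond(4000, 1000, 2) + " tri/s");
        System.out.println("VBO, 5 fps: " + trianglesPerSecond(4000, 1000, 5) + " tri/s");
    }
}
```

Roughly 4 million triangles are submitted per frame in both cases, so the gap is between about 148 million triangles per second and somewhere between 8 and 20 million.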

There are multiple calls to bind buffer that do the same thing.

I think it should look something like this. Sorry, my actual VBO code doesn’t use arrays, so I don’t know if it’s correct.


gl.glEnable(GL.GL_TEXTURE_2D);

gl.glEnableClientState(GL.GL_TEXTURE_COORD_ARRAY);	// Enable Texture Coord Arrays
gl.glBindBufferARB(GL.GL_ARRAY_BUFFER_ARB, VBOTexCoords[0]);
gl.glTexCoordPointer(2, GL.GL_FLOAT, 0, 0);		// Set The TexCoord Pointer To The TexCoord Buffer

gl.glEnableClientState(GL.GL_VERTEX_ARRAY);		// Enable Vertex Arrays
// Each gl*Pointer call reads from the buffer bound at that moment,
// so the vertex buffer still has to be bound before glVertexPointer.
gl.glBindBufferARB(GL.GL_ARRAY_BUFFER_ARB, VBOVertices[0]);
gl.glVertexPointer(3, GL.GL_FLOAT, 0, 0);

gl.glDrawArrays(GL.GL_TRIANGLE_STRIP, 0, getTrianglesNumber()*3);

gl.glDisableClientState(GL.GL_VERTEX_ARRAY);		// Disable Vertex Arrays
gl.glDisableClientState(GL.GL_TEXTURE_COORD_ARRAY);


Maybe change GL.GL_TRIANGLE_STRIP to GL.GL_TRIANGLES (unless you meant to use strips).
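
The reason the primitive mode matters: the last argument of glDrawArrays is a number of vertices, and a strip of N vertices produces N - 2 triangles, whereas a plain triangle list needs 3 vertices per triangle. A small sketch of that arithmetic (the class and method names are invented for the example, and it assumes the model data is laid out as independent triangles, which the thread does not confirm):

```java
// Hypothetical helper showing how many vertices each primitive mode consumes.
final class VertexCounts {
    /** GL_TRIANGLES: every triangle brings its own three vertices. */
    static int forTriangles(int triangleCount) {
        return triangleCount * 3;
    }

    /** GL_TRIANGLE_STRIP: three vertices for the first triangle, one more for each further one. */
    static int forTriangleStrip(int triangleCount) {
        return triangleCount == 0 ? 0 : triangleCount + 2;
    }

    /** Number of triangles a strip with the given vertex count produces. */
    static int trianglesInStrip(int vertexCount) {
        return Math.max(0, vertexCount - 2);
    }

    public static void main(String[] args) {
        int triangles = 4000;
        int count = forTriangles(triangles); // what getTrianglesNumber()*3 evaluates to
        System.out.println("vertices passed to glDrawArrays: " + count);
        System.out.println("triangles drawn if the mode is a strip: " + trianglesInStrip(count));
    }
}
```

So with getTrianglesNumber()*3 vertices and GL_TRIANGLE_STRIP, a 4000-triangle model is drawn as nearly three times as many triangles, most of them connecting vertices that were never meant to be joined.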

glDrawElements or glDrawRangeElements generally gives better performance than glDrawArrays in my experience. The largest benefit of VBOs, IMO, is that you only have to bind the data once at the beginning; then you can make 1000 glDrawX calls (assuming you want the same geometry rendered each time). This lets the graphics card cache a lot of results. Generally, I see very good performance (although I haven’t tried display lists).
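
To make the "bind once, draw many" idea concrete without needing a GL context, here is a sketch that only counts calls. MiniGl is a stand-in interface invented for this example, not the JOGL GL interface; its methods merely mirror the calls used in the code earlier in the thread. Whether the extra calls alone explain the frame rate gap is not something the thread establishes.

```java
import java.util.ArrayList;
import java.util.List;

final class BindOnceSketch {
    /** Stand-in for the handful of GL calls discussed here; NOT the real JOGL GL interface. */
    interface MiniGl {
        void enableVertexArray();
        void bindArrayBuffer(int bufferId);
        void vertexPointer(int componentsPerVertex);
        void drawArrays(int firstVertex, int vertexCount);
        void disableVertexArray();
    }

    /** Records the name of every call so both strategies can be compared without a GPU. */
    static final class RecordingGl implements MiniGl {
        final List<String> calls = new ArrayList<>();
        public void enableVertexArray() { calls.add("enable"); }
        public void bindArrayBuffer(int bufferId) { calls.add("bind"); }
        public void vertexPointer(int componentsPerVertex) { calls.add("pointer"); }
        public void drawArrays(int firstVertex, int vertexCount) { calls.add("draw"); }
        public void disableVertexArray() { calls.add("disable"); }
    }

    /** Array state is set up and torn down around every single model. */
    static void setupPerModel(MiniGl gl, int bufferId, int vertexCount, int models) {
        for (int i = 0; i < models; i++) {
            gl.enableVertexArray();
            gl.bindArrayBuffer(bufferId);
            gl.vertexPointer(3);
            gl.drawArrays(0, vertexCount);
            gl.disableVertexArray();
        }
    }

    /** Array state is set up once; only the draw call is repeated. */
    static void setupOnce(MiniGl gl, int bufferId, int vertexCount, int models) {
        gl.enableVertexArray();
        gl.bindArrayBuffer(bufferId);
        gl.vertexPointer(3);
        for (int i = 0; i < models; i++) {
            gl.drawArrays(0, vertexCount); // a per-model transform would go just before this
        }
        gl.disableVertexArray();
    }

    public static void main(String[] args) {
        RecordingGl perModel = new RecordingGl();
        setupPerModel(perModel, 1, 12000, 1000);
        RecordingGl once = new RecordingGl();
        setupOnce(once, 1, 12000, 1000);
        System.out.println("calls with per-model setup: " + perModel.calls.size());
        System.out.println("calls with one-time setup:  " + once.calls.size());
    }
}
```

In real JOGL code the same restructuring would mean moving the glEnableClientState, glBindBufferARB and gl*Pointer calls out of the per-model draw method and leaving only the transform and the draw call inside the loop.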

I’m not a big fan of VBOs. They are way too complicated to use, and all the calls needed reduce performance as well. If the geometry is static, display lists are easier to use and just as fast, or faster.

There are situations where VBOs can be faster, such as keeping all the static geometry in one big VBO and then streaming the indices of the visible triangles (the ones that survive culling).

I think this is the same benefit as with display lists: the data is stored on the graphics card and drawn by referring to it by ID.
What I want to find out is whether I can get better performance with VBOs than with display lists, but I think I got something wrong, because it’s quite strange to get 2-5 fps with VBOs and 37 fps with display lists.
Maybe I do not understand the right way to use data stored on the graphics card through a VBO. If I call only gl.glDrawArrays, I can’t see anything on screen. Every time I want to draw my model I have to call gl.glEnableClientState, which seems strange to me: why do I have to enable client state if I have already stored my data on the graphics card?
Can’t I simply access that data using the ID I pass to glBindBufferARB?

Some graphics cards treat all VBOs (static and dynamic) as vertex arrays, whereas they should treat only dynamic VBOs as vertex arrays and static VBOs as display lists. VBOs are only a way of rewriting the API and adding a few features. That would explain why you get only 5 fps.

When you bind a VBO and use it several times, you get the same performance as with a compiled vertex array.

goueseej: Do you have any evidence to back your claim that certain cards treat VBOs as VAs? I have not seen a single recently made card (from the ATI 9500 and up, NVIDIA 5900 and up) that does that…

Eh? Of course you’re going to use it several times… in fact, once per frame! And there are loads of frames… Regarding compiled vertex arrays, that extension has long been deprecated in favour of VBOs…

Regarding the VBO API: to me, it conforms to OpenGL “standards” better than display lists do, especially when you compare VBOs and PBOs to texture uploads: create the ID, upload, use the ID…

DP :)

I believe VBOs are much closer (if not identical) to the way OpenGL 3 will handle things, so it could make porting code easier.

Yes, I have evidence. I found a pointer on www.opengl.org about this, and I ran some tests on my own game, which uses huge VBOs (that is bad, I could avoid it…). When I use vertex arrays, I get exactly the same performance. This problem happens mainly on the oldest cards that support VBOs through the ARB extension, mine for example (ATI Radeon 9250 Pro). If I run the same test on more recent cards, I see a noticeable difference in performance: static VBOs are then 4 times faster than vertex arrays.

You should read this: http://oss.sgi.com/projects/ogl-sample/registry/ARB/vertex_buffer_object.txt
The aim of this extension is to revamp the whole API to make it easier to use. They mention display lists because VBOs use them too.

I do believe your tests are flawed. I had a 9200SE back in the day and VBOs made a considerable difference. Not in benchmarks, but in games… real-life situations. I’ve read that document many times and made several game engines, and in not one of them did VAs outstrip VBOs in in-game situations…

And yes, I am using the ARB extension in my engines, simply because it goes through the same code paths as core VBOs in OpenGL 1.5.

I’m sorry, that is just stupid. There are reasons for GL12.GL_MAX_ELEMENTS_VERTICES and GL12.GL_MAX_ELEMENTS_INDICES, and if you’re not obeying them, that’s your own bloody fault.

DP :)

I use huge VBOs, but I respect this limit (GL_MAX_ELEMENTS_VERTICES) when I draw them with glDrawArrays (note: I use JOGL, not LWJGL). I am not saying that this problem is reproducible on every graphics card using the ARB extension, but I reproduced it on many of them. Therefore, my tests are not flawed.

My problem is that I need to draw a variable number of instances of the same mesh, of about 2500 triangles, in my scene. In your opinion, is it better to use a display list or a VBO in that case?

I tried to use gl.glDrawElements instead of gl.glDrawArrays, but I got an exception that says:

Caused by: javax.media.opengl.GLException: element vertex_buffer_object must be enabled to call this method

It’s quite strange. Why do I get this exception with glDrawElements and not with glDrawArrays?

Display list if the mesh is static, otherwise VBO. Recompiling a display list is so slow it can cause stutter.

You can use static VBOs instead of display lists, it depends on your graphics card too.

How can I use them in static mode? I followed some tutorials online to learn how to use VBOs, and the resulting code is:

// Generate and bind the vertex buffer
gl.glGenBuffersARB(1, VBOVertices, 0);                          // Get a valid name
gl.glBindBufferARB(GL.GL_ARRAY_BUFFER_ARB, VBOVertices[0]);     // Bind the buffer

// Load the data: 3 vertices per triangle, 3 floats per vertex
gl.glBufferDataARB(GL.GL_ARRAY_BUFFER_ARB, trianglesNumber * 9 * BufferUtil.SIZEOF_FLOAT, vertices, GL.GL_STATIC_DRAW_ARB);

// Generate and bind the texture coordinate buffer
gl.glGenBuffersARB(1, VBOTexCoords, 0);                         // Get a valid name
gl.glBindBufferARB(GL.GL_ARRAY_BUFFER_ARB, VBOTexCoords[0]);    // Bind the buffer

// Load the data: 3 vertices per triangle, 2 floats per vertex
gl.glBufferDataARB(GL.GL_ARRAY_BUFFER_ARB, trianglesNumber * 6 * BufferUtil.SIZEOF_FLOAT, texCoords, GL.GL_STATIC_DRAW_ARB);

Then, to draw from the VBOs, I used the code I posted previously, but I think I did something wrong, because I get a significant drop in fps with VBOs compared to display lists, and if I call glDrawElements I get an exception. I am a bit lost.
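
The size arguments in the upload code above follow from the data layout: 3 vertices per triangle, with 3 floats per position and 2 floats per texture coordinate. A minimal sketch of that calculation, plus one way of wrapping a float array in a direct buffer positioned at its start, which is what the upload call will read from. The class and method names are invented for the example, and SIZEOF_FLOAT is hard-coded to 4 bytes rather than taken from BufferUtil so the snippet runs without JOGL.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

// Hypothetical helper for sizing and filling the buffers passed to glBufferDataARB.
final class VboSizes {
    static final int SIZEOF_FLOAT = 4; // bytes in a Java float

    /** Bytes needed for positions: 3 vertices per triangle, 3 floats each. */
    static int vertexBytes(int triangles) {
        return triangles * 3 * 3 * SIZEOF_FLOAT;
    }

    /** Bytes needed for texture coordinates: 3 vertices per triangle, 2 floats each. */
    static int texCoordBytes(int triangles) {
        return triangles * 3 * 2 * SIZEOF_FLOAT;
    }

    /** Copies the array into a direct, native-order buffer and rewinds it for reading. */
    static FloatBuffer toDirectBuffer(float[] data) {
        FloatBuffer buffer = ByteBuffer.allocateDirect(data.length * SIZEOF_FLOAT)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
        buffer.put(data);
        buffer.flip(); // position back to 0 so the whole content is visible to the reader
        return buffer;
    }

    public static void main(String[] args) {
        System.out.println("vertex bytes for 4000 triangles:   " + vertexBytes(4000));
        System.out.println("texcoord bytes for 4000 triangles: " + texCoordBytes(4000));
    }
}
```

If the byte size passed to the upload does not match what the buffer actually holds, or the buffer is not rewound, the symptoms can range from missing geometry to exceptions, so it is worth checking both before looking at performance.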

glDrawElements and glDrawArrays process the vertices/normals/etc. in different ways. glDrawArrays runs through the bound buffers in order, grouping each subsequent 3 or 4 elements into a triangle or quad (depending on the mode you pass). glDrawElements takes an IntBuffer or ShortBuffer that specifies the indices to use (so multiple faces can share the same vertex if they also have the same normal and tex coord). It runs through the indices in order, looking up the correct element for each index value.
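
As an illustration of what the index list buys you, here is a sketch that turns a glDrawArrays-style "triangle soup" (3 floats per vertex, 3 vertices per triangle) into unique vertices plus indices. It is a simplified example with invented names: it compares positions only, whereas a real mesh would also have to compare normals and tex coords before sharing a vertex, as noted above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical container for an indexed mesh built from un-indexed triangle data.
final class IndexedMesh {
    final float[] vertices; // unique x, y, z triples
    final int[] indices;    // one entry per original vertex, three per triangle

    IndexedMesh(float[] vertices, int[] indices) {
        this.vertices = vertices;
        this.indices = indices;
    }

    /** Collapses repeated positions in a triangle soup into shared, indexed vertices. */
    static IndexedMesh fromTriangleSoup(float[] soup) {
        Map<List<Float>, Integer> seen = new HashMap<>();
        List<Float> unique = new ArrayList<>();
        int[] indices = new int[soup.length / 3];
        for (int v = 0; v < indices.length; v++) {
            List<Float> key = Arrays.asList(soup[3 * v], soup[3 * v + 1], soup[3 * v + 2]);
            Integer index = seen.get(key);
            if (index == null) {
                index = unique.size() / 3; // next free slot in the unique vertex list
                seen.put(key, index);
                unique.addAll(key);
            }
            indices[v] = index;
        }
        float[] vertices = new float[unique.size()];
        for (int i = 0; i < vertices.length; i++) {
            vertices[i] = unique.get(i);
        }
        return new IndexedMesh(vertices, indices);
    }

    public static void main(String[] args) {
        // Two triangles forming a quad; they share the diagonal (0,0,0)-(1,1,0).
        float[] quad = {0, 0, 0,  1, 0, 0,  1, 1, 0,   0, 0, 0,  1, 1, 0,  0, 1, 0};
        IndexedMesh mesh = fromTriangleSoup(quad);
        System.out.println("unique vertices: " + mesh.vertices.length / 3);
        System.out.println("indices: " + Arrays.toString(mesh.indices));
    }
}
```

For the quad, six soup vertices collapse to four unique ones, and the index list is what glDrawElements would walk through.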

glDrawElements comes in two variants: one that takes a buffer argument and another that takes an integer. The integer variant requires a VBO to be bound to the ELEMENT_ARRAY_BUFFER target (otherwise it is used like a regular VBO) holding indices of an UNSIGNED_X integer type; the integer is then an offset into that buffer. If no buffer is bound, that call doesn’t make sense, which is what the exception above is complaining about.
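
On the UNSIGNED_X part: Java has no unsigned short, so indices above 32767 are stored as negative shorts whose bit pattern GL reads back as the intended unsigned value. A small sketch of packing indices that way; the class and method names are made up for the example, and nothing here touches GL itself.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.ShortBuffer;

// Hypothetical helper for preparing 16-bit index data.
final class IndexPacking {
    /** Indices 0..65535 fit in an unsigned 16-bit value. */
    static boolean fitsUnsignedShort(int vertexCount) {
        return vertexCount <= 65536;
    }

    /** Packs indices into a direct ShortBuffer whose bits read as unsigned shorts. */
    static ShortBuffer packUnsignedShorts(int[] indices) {
        ShortBuffer buffer = ByteBuffer.allocateDirect(indices.length * 2)
                .order(ByteOrder.nativeOrder())
                .asShortBuffer();
        for (int index : indices) {
            if (index < 0 || index > 0xFFFF) {
                throw new IllegalArgumentException("index out of range for unsigned short: " + index);
            }
            buffer.put((short) index); // values above 32767 wrap to negative shorts, the bits are unchanged
        }
        buffer.flip();
        return buffer;
    }

    public static void main(String[] args) {
        ShortBuffer packed = packUnsignedShorts(new int[] {0, 40000, 65535});
        System.out.println("stored as Java short: " + packed.get(1));
        System.out.println("read back as unsigned: " + (packed.get(1) & 0xFFFF));
    }
}
```

A model with 12000 vertices fits comfortably in 16-bit indices; larger meshes would need an IntBuffer and the unsigned int index type instead.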

I’m going back on my previous statement that glDrawElements() can give better performance. It depends on the geometry given and the graphics card drivers.

Umm… what? GL_MAX_ELEMENTS_VERTICES is not a restriction on glDrawArrays, it’s a restriction on the size of a VBO, which on NVIDIA is 4096. This has nothing to do with JOGL vs LWJGL… you’re confusing the issues…

AFAIK, glDrawArrays and glDrawElements give the same performance; it’s glDrawRangeElements that gives better performance. That’s what the specs say, anyway…

DP :)

You’re right, I have found this:

“Most OpenGL implementations can provide maximum performance only if
the number of vertices stored in a vertex array is below an implementation-
specific threshold. An application can store as many vertices in a vertex
array as necessary, but if the number exceeds this threshold, performance
could suffer.
Applications can query this implementation-specific threshold with the
following code:
GLint maxVerts;
glGetIntegerv( GL_MAX_ELEMENTS_VERTICES, &maxVerts );
When using glDrawRangeElements(), the minimum and maximum
index parameters, start and end, should specify a range smaller than
GL_MAX_ELEMENTS_VERTICES to obtain maximum performance.”

Thank you very much. It might explain why my game is so slow. When the space subdivision system is ready, I will then use smaller VBOs. I should sleep more and program less ;D
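
If one wanted to act on the quoted advice, splitting a large draw into batches below the threshold is mostly index arithmetic. The sketch below assumes a GL_TRIANGLES list and uses 4096 purely as an illustrative threshold (the real value has to be queried with GL_MAX_ELEMENTS_VERTICES); the class and method names are invented for the example.

```java
// Hypothetical helper that splits one big draw into {first, count} ranges.
final class DrawBatches {
    /** Returns {first, count} pairs covering vertexCount vertices of a GL_TRIANGLES list. */
    static int[][] split(int vertexCount, int maxVerticesPerBatch) {
        int batchSize = (maxVerticesPerBatch / 3) * 3; // keep whole triangles together
        if (batchSize <= 0) {
            throw new IllegalArgumentException("threshold must allow at least one triangle");
        }
        int batches = (vertexCount + batchSize - 1) / batchSize; // round up
        int[][] ranges = new int[batches][2];
        for (int b = 0; b < batches; b++) {
            int first = b * batchSize;
            ranges[b][0] = first;
            ranges[b][1] = Math.min(batchSize, vertexCount - first);
        }
        return ranges;
    }

    public static void main(String[] args) {
        for (int[] range : split(12000, 4096)) {
            System.out.println("draw first=" + range[0] + " count=" + range[1]);
        }
    }
}
```

Each pair would map onto the first and count arguments of a draw call. Since the value is described as a threshold for maximum performance rather than a hard limit, whether batching like this helps has to be measured on the card in question.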

Now I’ve tried smaller VBOs: I split the level into about 150 VBOs and it doesn’t change anything. I use an ATI Radeon 9250 Pro and I get almost exactly the same frame rate :(

Read this: http://osdir.com/ml/video.dri.user/2004-07/msg00072.html
“That GL_MAX_ELEMENTS_VERTICES (and GL_MAX_ELEMENTS_INDICES) is just a performance hint, and ATI’s OGL implementation seems to suggest that there is no performance hit when using insane amounts of vertex data per vertex array. By the way, these two constants are only relevant for glDrawRangeElements.”