VBOs Very Slow

I’m not happy because I’ve just spent a few days converting all my vertex array stuff to use VBOs because apparently they’re much faster. However, a quick test shows that on my Mobility Radeon X600, VBOs are in fact massively slower than vertex arrays, to the tune of something like 10x!! The same test on an nVidia GeForce 6800 GT shows approximately the same frame rate, possibly slightly faster when using vertex arrays.

This seems ridiculous to me. It’s faster to stream hundreds of megabytes of data per second from RAM over the PCI Express bus than it is to read it off the memory on the card!

It seems to me as though ATI and nVidia have implemented VBOs as an afterthought just to claim that they have OpenGL compatibility. It’s also very annoying because VBOs are clearly (logically speaking) the best way to implement vertex array-type structures.

Has anyone else found this to be a problem?

Nope, 99.99% chance it is a design mistake or a bug in your code. VBOs really do increase performance if you keep the data static in VRAM. If you update the data every frame (the worst-case scenario) you get the performance of normal vertex arrays.

You might want to post some source code (keep it small) that shows what you’re doing.

Only if you’re vertex-bound already though. If your bottleneck is somewhere else then VBOs aren’t going to change anything.

You did benchmark first, didn’t you?

Indeed, but it really should not decrease performance by a factor of 10. So in this case it’s not so much an issue of finding the bottleneck, but finding the bug.

Hi

I found the same problem: VBOs were much slower than plain old GL triangles. I’d be interested to see what the issue is and whether mine is the same.

Endolf

I hope it’s a design problem! Here’s a couple of code snippets…
I’m rendering some terrain. I set up the vertex/colour/normal arrays (which I can render without VBOs at quite high speed), then I bind these to VBOs. The render method renders portions of the arrays simply using glDrawArrays().
I’ve included some of the actual numbers. Essentially I have a terrain grid that’s 400x400 (stored in vertex arrays or VBOs) and I’m rendering a 200x200 section. Vertices are stored as ints, colours as ubytes and normals as floats.
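For reference, those byte sizes are consistent with the grid being stored as 400 strip rows of 802 vertices each (401 columns times 2 rows per triangle strip) — that layout is my assumption, not something stated above. A quick sanity check in plain Java:

```java
// Sanity check of the buffer sizes quoted above. Assumption (not stated in
// the post): the 400x400 quad grid is stored as 400 strip rows of
// 802 vertices each, i.e. (400 + 1) columns x 2 rows per triangle strip.
public class BufferSizes {
    static final int STRIP_ROWS = 400;
    static final int VERTS_PER_ROW = 802;                        // (400 + 1) * 2
    static final int VERTEX_COUNT = STRIP_ROWS * VERTS_PER_ROW;  // 320800

    static int vertexBytes() { return VERTEX_COUNT * 3 * 4; } // 3 ints per vertex
    static int colourBytes() { return VERTEX_COUNT * 3; }     // 3 ubytes per vertex
    static int normalBytes() { return VERTEX_COUNT * 3 * 4; } // 3 floats per vertex

    public static void main(String[] args) {
        System.out.println(vertexBytes()); // 3849600
        System.out.println(colourBytes()); // 962400
        System.out.println(normalBytes()); // 3849600
    }
}
```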

This is the setup (done once only).

int[] vboIds = new int[3];
gl.glGenBuffers(3, vboIds, 0);

_vertexBufferIndex = vboIds[0];
gl.glBindBuffer(GL.GL_ARRAY_BUFFER, _vertexBufferIndex);
gl.glBufferData(GL.GL_ARRAY_BUFFER,
                _vertexByteSize, // 3849600
                _vertexArray,
                GL.GL_STATIC_DRAW);
gl.glVertexPointer(3, GL.GL_INT, 0, 0);

_colourBufferIndex = vboIds[1];
gl.glBindBuffer(GL.GL_ARRAY_BUFFER, _colourBufferIndex);
gl.glBufferData(GL.GL_ARRAY_BUFFER,
                _colourByteSize, // 962400
                _colourArray,
                GL.GL_STATIC_DRAW);
gl.glColorPointer(3, GL.GL_UNSIGNED_BYTE, 0, 0);

_normalBufferIndex = vboIds[2];
gl.glBindBuffer(GL.GL_ARRAY_BUFFER, _normalBufferIndex);
gl.glBufferData(GL.GL_ARRAY_BUFFER,
                _normalByteSize, // 3849600
                _normalArray,
                GL.GL_STATIC_DRAW);
gl.glNormalPointer(GL.GL_FLOAT, 0, 0);

gl.glBindBuffer(GL.GL_ARRAY_BUFFER, 0);

If I just want to use vertex arrays, this is the setup I use instead.

gl.glVertexPointer(3, GL.GL_INT, 0, _vertexArray);
gl.glColorPointer(3, GL.GL_UNSIGNED_BYTE, 0, _colourArray);
gl.glNormalPointer(GL.GL_FLOAT, 0, _normalArray);

This is the rendering section (called once per frame).
The same code works with vertex arrays and VBOs.

for (int j = 0; j < h; j++) // h = 200
{
    gl.glDrawArrays(GL.GL_TRIANGLE_STRIP,
                    (802 * (y + j)) + (x * 2), // y = 0, x = 0
                    (w + 1) * 2);              // w = 200
}
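For what it’s worth, the start/count arithmetic in that loop can be checked on its own. The helpers below are hypothetical (not part of the real renderer), with 802 being the vertices per strip row:

```java
// Hypothetical helpers mirroring the glDrawArrays() arguments above.
public class StripOffsets {
    // Index of the first vertex of strip j, for a window whose corner
    // quad is at (x, y) in the grid; 802 vertices per strip row.
    static int first(int x, int y, int j) {
        return (802 * (y + j)) + (x * 2);
    }

    // Number of vertices drawn per strip for a window w quads wide.
    static int count(int w) {
        return (w + 1) * 2;
    }

    public static void main(String[] args) {
        // 200x200 window at the origin: strip starts are 802 vertices apart.
        System.out.println(first(0, 0, 0)); // 0
        System.out.println(first(0, 0, 1)); // 802
        System.out.println(count(200));     // 402
    }
}
```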

I can’t test on the RX600 right now, but I just tested on a GF6800 GT. With vertex arrays I get a frame rate of about 32 fps; with VBOs it is about 29 fps. The problem is much more pronounced on the RX600; I’ll post some figures for it later.

Any input greatly appreciated!

I just tested with the Mobility Radeon X600:
Without VBO: 65 fps
With VBO: 1 fps

Something is obviously going very wrong! But what?

(Incidentally, I have the latest drivers, but this is a laptop and so the drivers are provided by a 3rd party, ASUS in this case)

The correct way to bind the VBOs before rendering is:

(copy and paste from my own code - replace whatever should be replaced)


      glBindBufferARB(GL_ARRAY_BUFFER_ARB, vHandle.pointer);
      glVertexPointer(vDim, GL_FLOAT, 0, 0);

      glBindBufferARB(GL_ARRAY_BUFFER_ARB, cHandle.pointer);
      glColorPointer(cDim, GL_FLOAT, 0, 0);

      glBindBufferARB(GL_ARRAY_BUFFER_ARB, nHandle.pointer);
      glNormalPointer(GL_FLOAT, 0, 0);

Ok, I changed my code to this (below), i.e. essentially just adding the “ARB” suffix. However, it appears to make no difference. Could you explain the significance of ARB?

gl.glBindBufferARB(GL.GL_ARRAY_BUFFER_ARB, _vertexBufferIndex);
gl.glBufferDataARB(GL.GL_ARRAY_BUFFER_ARB,
                   _vertexByteSize, // 3849600
                   _vertexArray,
                   GL.GL_STATIC_DRAW);
gl.glVertexPointer(3, GL.GL_INT, 0, 0);

It is NOT just adding the ARB suffix!

Look at the method parameters.

True, so are you saying I should use:

gl.glBindBufferARB(GL.GL_ARRAY_BUFFER_ARB,_vertexArray);

Which would compare to your:

glBindBufferARB(GL_ARRAY_BUFFER_ARB, vHandle.pointer);

?

Because this method does not exist for these parameters (in my version of jogl, the latest beta release).
The only method is: glBindBufferARB(int target, int id)

It was about:

gl.glVertexPointer(3,GL.GL_INT,0, vbo_handle);

that needed to be changed to

gl.glVertexPointer(3,GL.GL_INT,0,0);

If that doesn’t work, it turns out to be a non-trivial bug.

I think that’s what I had in the first place:

gl.glVertexPointer(3,GL.GL_INT,0,0);

So I guess it’s not trivial!

Check back on your ‘blue post’:


gl.glNormalPointer(GL.GL_FLOAT,0,_normalArray);

You mean this bit?:


If I just want to use vertex arrays, this is the setup I use instead.

gl.glVertexPointer(3,GL.GL_INT,0,_vertexArray);
gl.glColorPointer(3,GL.GL_UNSIGNED_BYTE,0,_colourArray);
gl.glNormalPointer(GL.GL_FLOAT,0,_normalArray);

As it says above, that’s the setup I’m using when I’m not using VBOs, but using vertex arrays instead (I can easily switch between either method).

Hi chris0,

Just change GL_INT to GL_FLOAT for your vertex positions (you may need to convert the actual geometry data too). GL_INT is not a hardware-accelerated format; that’s why you see a slowdown. The most probable cause is that the data is being read back to memory, converted to floats and then sent back to the GPU.
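A minimal sketch of that conversion, assuming the positions currently live in a plain int[] (the array names are placeholders); the converted array would then go to glBufferData, with the pointer set via glVertexPointer(3, GL.GL_FLOAT, 0, 0):

```java
// Convert GL_INT vertex data to GL_FLOAT so the driver can use it directly
// instead of converting it on every frame (placeholder names).
public class VertexConvert {
    static float[] toFloats(int[] intVertices) {
        float[] floatVertices = new float[intVertices.length];
        for (int i = 0; i < intVertices.length; i++) {
            floatVertices[i] = (float) intVertices[i];
        }
        return floatVertices;
    }

    public static void main(String[] args) {
        float[] v = toFloats(new int[]{10, 20, 30});
        System.out.println(v[0] + " " + v[2]); // 10.0 30.0
    }
}
```

Note the buffer byte size stays the same, since both GL_INT and GL_FLOAT are 4 bytes per component.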

Could you try binding the VBOs in the rendering loop?

Or could you provide the source code of a (tiny) working example, so that I can run it on my own machine? Remote debugging == slow

@Spasi
Thought about it too, but then it’s hard to explain the difference between VAs and VBOs

@Riven
No readback is necessary with plain vertex arrays.

That seems like a one-time effort, when uploading the data to the VBO. Not something you should notice in the main loop.

Ok, I tried converting to floats…

The GF6800 VBO frame rate has gone through the roof! It has now increased well beyond the monitor refresh rate; I had to up the LOD to get a frame rate below 60. I estimate about 240 fps with VBOs compared with about 30 fps without. Excellent!

On the RX600 it has improved things, but only very slightly. The VBO implementation now runs at 3 fps (originally 1 fps), and the non-VBO implementation has remained about the same at around 65 fps.