VBO's - teach me please.

Hi.

So, I’ve started writing a game on the Android platform. Everything is working out quite well so far, except one [major] thing: Performance.
I’ve always been slow on catching up on the OpenGL stuff since I’ve always worked under the philosophy “Optimize later” (and never really had to). But on Android, “later” is “earlier” it seems.

I’m working on a standard 800*480 display, and I am filling it with a varying number of 16*16 textured quads (rendered as GL_TRIANGLES), ranging from 0 quads to filling the whole screen. I’ve been using Vertex Arrays with index pointers and glDrawElements, and I’ve tried several approaches that work from very poor to pretty poor.

Approach 1: Vertex array for one quad, located at 0,0, - render each of the textured quads by translating to their location.
Approach 2: Large vertex array to hold all quads with absolute location, render it all in one go.
Approach 3: Bundling the 16*16 quads where possible (forming larger quads, and using the texture’s s&t repeats to fill them).
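
For reference, Approach 2 (one large array with absolute positions) can be sketched roughly as below. This is only an illustration: the class name `TileBatch`, its fields and the counter-clockwise vertex order are assumptions, not code from the poster's game. It fills direct, native-ordered NIO buffers with four vertices and six indices per quad, so the whole grid can go out in a single glDrawElements call.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;
import java.nio.ShortBuffer;

/** Hypothetical helper: packs a grid of tile quads into one vertex/index set. */
final class TileBatch {
    static final int FLOATS_PER_VERTEX = 3; // x, y, z
    static final int VERTS_PER_QUAD = 4;
    static final int INDICES_PER_QUAD = 6;  // two triangles per quad

    final int quadCount;
    final FloatBuffer vertices;
    final FloatBuffer texCoords;
    final ShortBuffer indices;

    TileBatch(int cols, int rows, float tileSize) {
        quadCount = cols * rows;
        // GL_UNSIGNED_SHORT indices can address at most 65536 vertices.
        if (quadCount * VERTS_PER_QUAD > 65536) {
            throw new IllegalArgumentException("too many quads for 16-bit indices");
        }
        vertices = directFloats(quadCount * VERTS_PER_QUAD * FLOATS_PER_VERTEX);
        texCoords = directFloats(quadCount * VERTS_PER_QUAD * 2);
        indices = ByteBuffer.allocateDirect(quadCount * INDICES_PER_QUAD * 2)
                .order(ByteOrder.nativeOrder())
                .asShortBuffer();

        int base = 0; // index of the first vertex of the current quad
        for (int row = 0; row < rows; row++) {
            for (int col = 0; col < cols; col++) {
                float x0 = col * tileSize, y0 = row * tileSize;
                float x1 = x0 + tileSize, y1 = y0 + tileSize;
                // Four corners with absolute positions (no per-quad translate).
                vertices.put(new float[] {x0, y0, 0, x1, y0, 0, x1, y1, 0, x0, y1, 0});
                texCoords.put(new float[] {0, 0, 1, 0, 1, 1, 0, 1});
                // Two triangles sharing the diagonal base -> base + 2.
                indices.put(new short[] {
                    (short) base, (short) (base + 1), (short) (base + 2),
                    (short) base, (short) (base + 2), (short) (base + 3)});
                base += VERTS_PER_QUAD;
            }
        }
        // Rewind so the buffers are read from the start when handed to GL.
        vertices.flip();
        texCoords.flip();
        indices.flip();
    }

    private static FloatBuffer directFloats(int count) {
        return ByteBuffer.allocateDirect(count * 4)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
    }
}
```

A full 800*480 screen of 16*16 tiles would be `new TileBatch(50, 30, 16f)`: 1500 quads, 6000 vertices and 9000 indices.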

Approach 1 is horrible. I wasn’t too surprised about that, but I figured I’d give it a go. Performance on approaches 2 and 3 varies a bit, depending on the composition of the quads and their quantity. But none of the approaches performs sufficiently.

I manage about 15 FPS on my HTC Hero when the screen is full.

So, I figured I would give VBO’s a shot. I know they aren’t too far from Vertex Arrays in regards to implementation, but I’m not really sure how they differ. (Vertex arrays have always been more than good enough for me on the PC platform, so I was always too lazy to learn VBO’s).

I’m looking for some help on how to implement them properly, and how they’re incorporated and used in the following points.

-> Surface creation
-> Surface resizing
-> Setting up rendering for an entity
-> Rendering the entity
-> Cleaning up after the rendering

I should note that in my specific problem, my grid of textured quads is dynamic, in that it changes based on user action. It does not however, change on every cycle.

Thanks for any help offered :slight_smile:

Hi!

Your first approach is not efficient. Do not use a VBO to hold a single quad; it is often even worse than immediate-mode rendering. Use static VBOs for things that rarely change and dynamic VBOs for things that change frequently. Avoid creating VBOs that are too small. If you don’t need to modify a VBO very often, don’t put the data in a new NIO buffer; reuse the GPU’s data store instead (use glMapBuffer if possible). Avoid mixing VBOs with vertex arrays, display lists and immediate mode.

Hi Goussej.

Thanks for answering. To clarify, I am NOT using VBO’s in any form right now, just basic Vertex Arrays. The reason is simple: I’m not sure how to implement VBO’s. I’ve of course found a plethora of examples on the web, but few of them seem to be implemented identically, so I’m a little unsure how to approach them. Not to mention that I’d like to understand how they actually work in theory as well.

That’s why I’m asking the experts here :wink:

If you could provide an example of VBO implementation (the more comments, the better!) I’ll be grateful.

VBOs work just like vertex arrays except that the data is on the graphics card. The only extra steps are creating them (using commands like glGenBuffers and glBufferData) and binding them before calling glVertexPointer or glTexCoordPointer. Binding is done with glBindBuffer(target, id), much like how you bind a texture (the target says whether the VBO holds vertex data or indices).

When you actually use glVertexPointer and that family of functions, instead of passing in a Buffer, you will pass in a byte offset into the currently bound VBO (which you usually will want to be 0). A VBO stays bound until you bind something else or unbind it (using VBO id = 0) and you cannot call glVertexPointer (etc.) with a vertex array when a VBO is bound, so if you’re mixing you’ll want to make sure to unbind the VBOs when you’re done rendering with them.
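
To illustrate when that byte offset is not 0, here is a small sketch that assumes an interleaved layout (x, y, z, s, t per vertex) in a single VBO. Interleaving is not something discussed in this thread; it is only used here as an example, and the class and constant names are made up. The GL calls in the comments use the offset-taking overloads from Android's GL11 interface.

```java
/** Hypothetical layout constants for one VBO holding x, y, z, s, t per vertex. */
final class InterleavedLayout {
    static final int BYTES_PER_FLOAT = 4;
    static final int POSITION_FLOATS = 3; // x, y, z
    static final int TEXCOORD_FLOATS = 2; // s, t

    // Distance in bytes from the start of one vertex to the start of the next.
    static final int STRIDE_BYTES =
            (POSITION_FLOATS + TEXCOORD_FLOATS) * BYTES_PER_FLOAT;

    // Positions start at the beginning of each vertex record.
    static final int POSITION_OFFSET_BYTES = 0;

    // Texture coordinates follow the three position floats.
    static final int TEXCOORD_OFFSET_BYTES = POSITION_FLOATS * BYTES_PER_FLOAT;

    /** Byte offset of a whole vertex record inside the VBO. */
    static int byteOffsetOfVertex(int vertexIndex) {
        return vertexIndex * STRIDE_BYTES;
    }

    // With the VBO bound to GL_ARRAY_BUFFER, the pointers would be set like:
    //   gl11.glVertexPointer(3, GL11.GL_FLOAT, STRIDE_BYTES, POSITION_OFFSET_BYTES);
    //   gl11.glTexCoordPointer(2, GL11.GL_FLOAT, STRIDE_BYTES, TEXCOORD_OFFSET_BYTES);
}
```

With separate VBOs for positions and texture coordinates, both the stride and the offset can simply stay 0.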

There aren’t many differences between VBOs and vertex arrays. Also, if the examples online differ I would just pick one to start with since they are likely all almost the same. If you have problems with one example, you can post that with more specific problems.

Great explanation, thanks! I’ll give my code a spin later tonight, and post a specific example if I run into problems. Again, thanks.

Look at the source code of TUER, I had written some wrapper classes for these things:
http://tuer.svn.sourceforge.net/viewvc/tuer/alpha/drawer/StaticVertexBufferObject.java

Thanks for the help so far :slight_smile:

I currently have my VBO’s drawing their stuff now, but sadly I’m still stuck at a sucky FPS. Perhaps the phones just aren’t up to it…

Another question: I noticed glBufferSubData. Are there any known performance issues with using glBufferSubData versus glBufferData? It seems [at least in my case] that it would be cleaner and easier to create buffers early, allocate them with a given size and a null buffer, and later push data to them using glBufferSubData. Thoughts on this?

Thank you for sharing this piece of code, Julian. I reached a point yesterday where I needed to use VBO’s but was far too lazy to look for some code, glad it appeared magically here.

OK, now one question. If I’m using lots of glRotate and glTranslate before rendering stuff, can I use VBO’s anyway? Will the rotations and translations apply to the vertex data I sent to the buffer in the loading phase?

@Addictman: As long as you use glBufferData to allocate space on the graphics card, you can pass in a null buffer and fill it in with glBufferSubData. What I like to do is use glBufferData the first time and pass in the data directly, and then any edits (even if it’s the full buffer) use glBufferSubData. I don’t know if there is a performance hit or not, but it could be cleaner to do it your way.
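
As a rough sketch of the bookkeeping involved in a partial update: if a texture-coordinate VBO stores 4 vertices of 2 floats per tile, in row-major tile order, the byte offset and size to pass to glBufferSubData for one tile work out as below. The layout, class name and method names are assumptions for illustration only.

```java
/** Hypothetical helper: locates one tile's texture coordinates inside a VBO. */
final class TexCoordRange {
    static final int BYTES_PER_FLOAT = 4;
    static final int FLOATS_PER_QUAD = 4 * 2; // 4 vertices, (s, t) each
    static final int BYTES_PER_QUAD = FLOATS_PER_QUAD * BYTES_PER_FLOAT;

    /** Row-major index of the tile at (col, row) in a grid that is cols wide. */
    static int tileIndex(int col, int row, int cols) {
        return row * cols + col;
    }

    /** Byte offset of that tile's texture coordinates within the VBO. */
    static int byteOffset(int tileIndex) {
        return tileIndex * BYTES_PER_QUAD;
    }

    // Allocation once, with no initial data:
    //   gl11.glBufferData(GL11.GL_ARRAY_BUFFER, totalBytes, null, GL11.GL_DYNAMIC_DRAW);
    // Later, to replace a single tile (data holds FLOATS_PER_QUAD floats):
    //   gl11.glBufferSubData(GL11.GL_ARRAY_BUFFER, byteOffset(index), BYTES_PER_QUAD, data);
}
```

Both the offset and the size are in bytes, not in floats or vertices.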

@teletubo: You can use glRotate and glTranslate with VBOs. Those are matrix functions that change the current transform that is applied to the vertices and normals. This is independent of how you specify the vertices (with vertex arrays, VBOs, glVertex, etc.).

Excellent, thanks!

So I guess I’ve screwed up somewhere in my code, because after I changed to VBO things just disappeared (though I get no errors).

Do not use glBufferSubData on Android. The classes in our libgdx repository should give you some hint how to work with VBOs in an OpenGL ES context (no glMapBuffer etc…).

Your performance is unlikely to be related to whether you use VAs or VBOs. The Hero is heavily fill-rate limited (like all other current Android devices). You’ll have to tweak your texture filters/sizes. A tile map of 16x16 tiles is possible at 40fps on a Hero. Also, screen-aligned quads (composed of two triangles) will trigger a fast path on G1-level hardware (Hero, Dream etc.). Don’t rotate your stuff, so that the fast path gets triggered.
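
Some back-of-the-envelope counting makes the fill-rate point more concrete. The sketch below only counts work per frame for the numbers mentioned in this thread (800*480 screen, 16*16 tiles, 40 fps); it says nothing about what the Hero can actually sustain. The class and method names are made up.

```java
/** Hypothetical helper: counts per-frame and per-second work for a tile grid. */
final class FrameBudget {
    /** Number of whole tiles needed to cover the screen. */
    static int tileCount(int screenW, int screenH, int tile) {
        return (screenW / tile) * (screenH / tile);
    }

    /** Textured pixels written per second when every pixel is drawn once per frame. */
    static long pixelsPerSecond(int screenW, int screenH, int fps) {
        return (long) screenW * screenH * fps;
    }

    /** Vertices transformed per second with 4 vertices per tile. */
    static long verticesPerSecond(int tiles, int fps) {
        return (long) tiles * 4 * fps;
    }

    public static void main(String[] args) {
        int tiles = tileCount(800, 480, 16);
        System.out.println("tiles per frame:     " + tiles);
        System.out.println("vertices per second: " + verticesPerSecond(tiles, 40));
        System.out.println("pixels per second:   " + pixelsPerSecond(800, 480, 40));
    }
}
```

The vertex count is in the hundreds of thousands per second while the pixel count is in the tens of millions, which is consistent with looking at textures and filtering before vertex submission.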

glBufferSubData is good for updating a relatively small region (<32KB, implementation-dependent) of a VBO, though it is poorly implemented on some mobile phones. You can of course use it to update the whole data store of a VBO too. The drawback is that you have to keep an NIO buffer in memory permanently to perform the updates (explicitly destroying a direct buffer is not something to do very frequently; look at sun.misc.Cleaner for that), but it can be faster than mapping a whole VBO into virtual memory (see glMapBuffer and glUnmapBuffer) to perform only a few updates. glMapBuffer and glUnmapBuffer map a whole VBO into memory. This is less flexible than glBufferSubData, since you cannot map or unmap just a region with this pair of calls, but it lets you perform updates without storing a permanent copy of the VBO.

Some people claim that calling glBufferData twice (the first time with null) to update a whole data store is faster than calling glBufferSubData, but this trick may force the graphics card to destroy the data store and create a new one, so I’m not sure it is a good idea.

In conclusion, if you don’t need to update your data very often, call glBufferData with null first, use glMapBuffer to modify the data store of the VBO, and call glUnmapBuffer when your update is finished. If you use glBufferSubData to push your data only once, you have to create an NIO buffer (instead of using the direct byte buffer returned by glMapBuffer) that is used only once, which is a bit inefficient even though it works.

Edit.: I have not yet used glMapBuffer on my G1…

Read this guy’s blog.

Cas :slight_smile:

Thanks for continued tips everyone! :slight_smile:

I just came home from Barcelona, where the best friends anyone can have kidnapped me for my Bachelor’s party/weekend. My brain is still mush, but I’ll digest tidbits on everyone’s advice and report back (probably with more questions), once I’ve done some further tests.

On a quick note, how you get 40 FPS on a Hero phone, with your screen filled with 16x16 tiles is beyond me. I get about 35 FPS without really doing anything but clear the buffer on each cycle :wink: (which, I know is pretty curious once I found out).

Generating and binding buffers …


        GL11 gl11 = (GL11) GLSettings.GL;

        int[] tempBuffer = new int[1];

        // Generate and bind the vertex buffer
        gl11.glGenBuffers(1, tempBuffer, 0);
        vertexBufferID = tempBuffer[0];
        gl11.glBindBuffer(GL11.GL_ARRAY_BUFFER, vertexBufferID);
        // Load the data
        gl11.glBufferData(GL11.GL_ARRAY_BUFFER, getVertexDataSize(), vertexBuffer, GL11.GL_STATIC_DRAW);

        // Generate and bind the texture coordinate buffer
        gl11.glGenBuffers(1, tempBuffer, 0);
        textureBufferID = tempBuffer[0];
        gl11.glBindBuffer(GL11.GL_ARRAY_BUFFER, textureBufferID);
        // Load the data
        gl11.glBufferData(GL11.GL_ARRAY_BUFFER, getTexCoordsDataSize(), textureBuffer, GL11.GL_STATIC_DRAW);

        // Generate and bind the index buffer
        gl11.glGenBuffers(1, tempBuffer, 0);
        indexBufferID = tempBuffer[0];
        gl11.glBindBuffer(GL11.GL_ELEMENT_ARRAY_BUFFER, indexBufferID);
        // Load the data
        gl11.glBufferData(GL11.GL_ELEMENT_ARRAY_BUFFER, getIndexDataSize(), indexBuffer, GL11.GL_STATIC_DRAW);


Rendering …


        GL11 gl11 = (GL11) GLSettings.GL;
        gl11.glPushMatrix();

        gl11.glEnable(GL11.GL_TEXTURE_2D);
        gl11.glBindTexture(GL11.GL_TEXTURE_2D, textureID);

        // Enable vertex arrays
        gl11.glEnableClientState(GL11.GL_VERTEX_ARRAY);
        // Enable texture coord arrays
        gl11.glEnableClientState(GL11.GL_TEXTURE_COORD_ARRAY);

        gl11.glBindBuffer(GL11.GL_ARRAY_BUFFER, vertexBufferID);
        gl11.glVertexPointer(3, GL11.GL_FLOAT, 0, 0);

        gl11.glBindBuffer(GL11.GL_ARRAY_BUFFER, textureBufferID);
        gl11.glTexCoordPointer(2, GL11.GL_FLOAT, 0, 0);

        gl11.glBindBuffer(GL11.GL_ELEMENT_ARRAY_BUFFER, indexBufferID);

        gl11.glDrawElements(GL11.GL_TRIANGLES, numEntities * vertexConfig.numIndexesPrEntity, GL11.GL_UNSIGNED_SHORT, 0);

        // Unbind buffers
        gl11.glBindBuffer(GL11.GL_ELEMENT_ARRAY_BUFFER, 0);
        gl11.glBindBuffer(GL11.GL_ARRAY_BUFFER, 0);
        // Disable states
        gl11.glDisableClientState(GL11.GL_VERTEX_ARRAY);
        gl11.glDisableClientState(GL11.GL_TEXTURE_COORD_ARRAY);
        gl11.glDisable(GL11.GL_TEXTURE_2D);

        gl11.glPopMatrix();

When these buffers are filled with vertices to fill my screen with 16*16 textures, I get a whopping 10 FPS (all other updates, and all other rendering commented out). Something must surely be wrong. Can anyone see any glaring errors here?

As I said above, your problem is unlikely to be the transforms, and hence the amount of vertex data, but your textures. What size are they?

Yes, because that is not supported in OpenGL ES 1.x nor 2.0. Also, your advice of using glBufferSubData will be shit on ALL current Android phones.

glMapBufferOES is not supported by Android GL but it is in OpenGL ES 1.x and 2.0:
http://www.khronos.org/registry/gles/extensions/OES/OES_mapbuffer.txt
It does not use exactly the same extension.

Ok, it’s only fine on computers, thanks for the tip.

16*16