Ok, I’ll just throw a few things your way, and you can do with them what you will. (Maybe others will jump in as well.)
First, I’d recommend just getting things working, irrespective of performance concerns and so on. In other words, fix the crash (if you haven’t already) and get things rendering (one way or another) using VBOs.
Also, this is tangential (sort of), but I’d recommend anchoring your quads at the center rather than a corner. That is, instead of this:
vertexData.put(new float[]{x, y, x + width, y, x + width, y + height, x, y + height});
You’d do something like this (untested):
final float extentX = width * 0.5f;
final float extentY = height * 0.5f;
vertexData.put(new float[]{
x - extentX, y - extentY,
x + extentX, y - extentY,
x + extentX, y + extentY,
x - extentX, y + extentY
});
This will be more natural for most things, including rendering, since rotation and scaling then happen about the quad’s center rather than a corner.
As for how to go about rendering, there are a lot of different ways to do it, more than could be reasonably summarized here, so I’ll just touch on a couple things.
The simplest approach would be to create a separate VBO for each quad size you need, then for each entity, set the transform and other render state, bind the appropriate VBO(s), and render (this would apply using the programmable pipeline as well). This method has a couple of drawbacks: a) you’re only storing a little geometry in each VBO, which makes for a lot of per-buffer overhead and isn’t really how VBOs are best used, and b) you have to make a lot of draw calls, which can be costly on some platforms. It won’t necessarily perform better than immediate mode, but it at least has the advantage of using more modern techniques.
The more usual method addresses both of the issues mentioned above, but naturally is more complex to implement. The general idea is to batch your geometry so as to make a minimum number of draw calls. This basically necessitates transforming the geometry yourself. This is pretty simple if all you’re doing is translating and maybe scaling, but for more complex transforms you’ll probably want to have a good math library available.
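To make 'transforming the geometry yourself' a bit more concrete, here’s a rough sketch of pre-transforming one center-anchored quad on the CPU (translation plus rotation about the center). The class and method names (`QuadTransform`, `writeQuad`) are made up for illustration, and it writes into a plain `float[]` rather than your actual vertex buffer, so treat it as a starting point rather than the way to do it:

```java
/** Hypothetical helper (not part of any library): pre-transforms quads on the CPU. */
final class QuadTransform {
    private QuadTransform() {}

    /**
     * Writes 4 vertices (8 floats, same corner order as the center-anchored
     * quad earlier in this post) into dst starting at offset.
     * The quad is rotated about its center by angleRadians, then placed at (x, y).
     * Returns the offset just past the written data.
     */
    static int writeQuad(float[] dst, int offset,
                         float x, float y,
                         float width, float height,
                         float angleRadians) {
        final float extentX = width * 0.5f;
        final float extentY = height * 0.5f;
        final float cos = (float) Math.cos(angleRadians);
        final float sin = (float) Math.sin(angleRadians);

        // Rotated half-width axis (a) and half-height axis (b).
        final float ax = extentX * cos;
        final float ay = extentX * sin;
        final float bx = -extentY * sin;
        final float by = extentY * cos;

        // Each corner is center +/- a +/- b.
        dst[offset++] = x - ax - bx; dst[offset++] = y - ay - by;
        dst[offset++] = x + ax - bx; dst[offset++] = y + ay - by;
        dst[offset++] = x + ax + bx; dst[offset++] = y + ay + by;
        dst[offset++] = x - ax + bx; dst[offset++] = y - ay + by;
        return offset;
    }
}
```

You’d call this once per entity, passing the returned offset back in as the next starting offset, so a whole batch ends up packed back to back in one array. If you need scaling, you can just scale width and height before the call.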
The procedure goes something like this (off the top of my head, so I may not hit everything):
- On startup, create a set (could be just one, depending) of VBOs in ‘stream’ mode, of sufficient size to hold the largest batch you ever expect to render in one go. If you’re not sure how large the largest batch will be, you can grow the buffers as needed, or limit batch sizes.
- We’ll set aside texture atlases for now - they can make for better batching, but you can do plenty of optimization without them.
- For rendering, you want to sort everything you’re going to render by state so as to minimize how often you have to break batches. A batch has to be broken whenever something changes that can’t change in the middle of a draw call (more or less), like blend mode, textures, or programs. So for example, let’s say you have a bunch of entities that use texture A, and a bunch of entities that use texture B. Assuming there’s nothing else that would require you to split them up, you’d want to render all the entities with texture A, then all the entities with texture B. Note that even without batching this can be a win if your rendering system skips redundant state changes, because you’re not changing texture state as often.
- For a 2D game, you may also have to factor layering (i.e. painter’s algorithm) into your render order. Or, you could use the z-buffer and draw things at different z depths.
- For each batch (same blend mode, texture, etc.), you pre-transform all the geometry and put it in your VBOs (e.g. using glBufferSubData()). Then you make your draw call.
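Here’s a rough sketch of the sorting and batch-splitting part of the steps above, with no OpenGL calls in it. Everything in it (`BatchSketch`, `Sprite`, `Batch`, and using integer ids for layer, texture and blend mode) is an assumption for illustration, not something from your code or from LWJGL. It sorts by layer first (painter’s algorithm), then by texture and blend mode, and starts a new batch whenever the texture or blend mode changes:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch: sort render items by state, then split them into batches. */
final class BatchSketch {
    private BatchSketch() {}

    /** Hypothetical render item carrying only the state that affects ordering and batching. */
    static final class Sprite {
        final int layer;      // lower layers are drawn first
        final int textureId;  // stand-in for a texture handle
        final int blendMode;  // stand-in for a blend mode setting

        Sprite(int layer, int textureId, int blendMode) {
            this.layer = layer;
            this.textureId = textureId;
            this.blendMode = blendMode;
        }
    }

    /** A run of consecutive sprites that share texture and blend mode. */
    static final class Batch {
        final int textureId;
        final int blendMode;
        final List<Sprite> sprites = new ArrayList<>();

        Batch(int textureId, int blendMode) {
            this.textureId = textureId;
            this.blendMode = blendMode;
        }
    }

    static List<Batch> buildBatches(List<Sprite> input) {
        // Copy so the caller's list is left in its original order.
        List<Sprite> sorted = new ArrayList<>(input);
        sorted.sort(Comparator.<Sprite>comparingInt(s -> s.layer)
                .thenComparingInt(s -> s.textureId)
                .thenComparingInt(s -> s.blendMode));

        List<Batch> batches = new ArrayList<>();
        Batch current = null;
        for (Sprite s : sorted) {
            // Break the batch whenever state that can't change mid-draw-call changes.
            if (current == null
                    || current.textureId != s.textureId
                    || current.blendMode != s.blendMode) {
                current = new Batch(s.textureId, s.blendMode);
                batches.add(current);
            }
            current.sprites.add(s);
        }
        return batches;
    }
}
```

For each resulting batch you’d then set the texture and blend mode once, write the pre-transformed vertices for its sprites into your stream VBO (e.g. with glBufferSubData()), and make a single draw call. Note that sorting by layer first means the same texture can show up in more than one batch; that’s the price of painter’s-algorithm ordering, and it’s one reason the z-buffer route can batch better.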
Again though, I’d start with just getting what you have now working, and then make incremental improvements.