Pooling VertexArrayObjects

Hello,

I’m tinkering with modern OpenGL (using vaos with indexed vbos) and with efficient usage of vbos. My (hopefully reasonable) goal is to minimise the number of vaos and, as a consequence, the switching between vaos (which, as I understand it, is really expensive performance-wise).

So the main difference between vaos is the data storage, namely the primitive type (GL_POINTS, GL_LINES, GL_TRIANGLES) and the attributes (datatype, name, location).

I observed that (at least) in my applications certain types of vaos are used very frequently:

  • [GL_TRIANGLES, (3x4 float Attrib)]
  • [GL_TRIANGLES, (4x4 float Attrib)]
  • […]

Currently I have a fair number of different vaos even though many of them share these characteristics. So I thought about creating exactly one vao for each of these commonly used specifications and applying different shader programs with different uniforms as I need them. In other words: pooling commonly used vao specifications.
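A minimal sketch of what such a pool could look like on the CPU side, assuming a specification is just the primitive mode plus the component count of each float attribute. The names `VaoSpec` and `VaoPool` are hypothetical, and the `IntSupplier` only stands in for the real VAO creation (for example glGenVertexArrays() followed by the attribute setup), so nothing here touches OpenGL.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.function.IntSupplier;

// Hypothetical sketch: none of these class names come from OpenGL or LWJGL.
final class VaoSpec {
    final int mode;              // primitive mode, e.g. the value of GL_TRIANGLES
    final List<Integer> attribs; // component count per float attribute, e.g. [4, 4, 4]

    VaoSpec(int mode, List<Integer> attribs) {
        this.mode = mode;
        this.attribs = List.copyOf(attribs);
    }

    @Override public boolean equals(Object o) {
        if (!(o instanceof VaoSpec)) return false;
        VaoSpec other = (VaoSpec) o;
        return mode == other.mode && attribs.equals(other.attribs);
    }

    @Override public int hashCode() { return Objects.hash(mode, attribs); }
}

final class VaoPool {
    private final Map<VaoSpec, Integer> vaoBySpec = new HashMap<>();
    private final IntSupplier vaoFactory; // stands in for the real VAO creation code

    VaoPool(IntSupplier vaoFactory) { this.vaoFactory = vaoFactory; }

    /** Returns the pooled VAO handle for this spec, creating it on first use. */
    int acquire(VaoSpec spec) {
        return vaoBySpec.computeIfAbsent(spec, s -> vaoFactory.getAsInt());
    }

    int size() { return vaoBySpec.size(); }
}
```

Two requests with an equal specification return the same handle, so the pool only grows when a genuinely new mode/attribute combination shows up.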

Possible downsides are:

  • increased shader program switching
  • forcing a standardised shader attrib naming for the pooled vaos and all shader programs I want to use with them
  • significantly more effort required to write the code managing the pool
  • can you point out some more?

Do you think that might work at all? In my applications that would reduce the number of vaos from ~80 to ~30, which is quite a big reduction. Do you think the performance increase is worth the hassle?

Thanks in advance!

VAOs are tied to the buffer you used, so you need one VAO per set of buffers. When you bind a VAO, OpenGL pretty much binds the VBO(s) and assigns the same vertex attribute pointers you bound when creating it.

Let’s say you create a VAO and tell OpenGL that some vertex attributes will be coming from two VBOs.


vao = glGenVertexArrays();
glBindVertexArray(vao);

//{
glBindBuffer(GL_ARRAY_BUFFER, vbo0);
glVertexAttribPointer(attrib0, ...);
glEnableVertexAttribArray(attrib0);

glBindBuffer(GL_ARRAY_BUFFER, vbo1);
glVertexAttribPointer(attrib1, ...);
glEnableVertexAttribArray(attrib1);
//}

glBindVertexArray(0);

When you later bind the VAO to render something, it’s the same as calling the vertex attribute setup code (between the ‘{’ and ‘}’). Which buffer to read from and which shader attribute location to send the data to is stored in the VAO. That means that:

  1. You can’t reuse the same VAO for multiple (sets of) VBOs.
  2. You shouldn’t reuse VAOs between shaders unless you manually assign attribute locations so you can guarantee that the shaders have the same vertex attribute locations.
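One way to guarantee matching locations (point 2) is to keep the convention in a single place and generate the GLSL input declarations from it. The sketch below assumes GLSL 3.30 or later, where `layout(location = N)` is allowed on vertex shader inputs. The enum, the attribute names and the location numbers are made up for illustration. Calling glBindAttribLocation before linking each program is an alternative that achieves the same thing.

```java
import java.util.StringJoiner;

// Hypothetical convention: names and location numbers are illustrative only.
enum VertexAttrib {
    POSITION(0, "vec4", "a_position"),
    COLOR(1, "vec4", "a_color"),
    EXTRA(2, "vec4", "a_extra");

    final int location;
    final String glslType;
    final String glslName;

    VertexAttrib(int location, String glslType, String glslName) {
        this.location = location;
        this.glslType = glslType;
        this.glslName = glslName;
    }

    /** GLSL input declarations shared by every vertex shader that uses the pooled VAO. */
    static String glslDeclarations() {
        StringJoiner sj = new StringJoiner("\n", "", "\n");
        for (VertexAttrib a : values()) {
            sj.add("layout(location = " + a.location + ") in "
                    + a.glslType + " " + a.glslName + ";");
        }
        return sj.toString();
    }
}
```

The generated text would go right after the `#version` line of each vertex shader, and the same `location` values would be passed to glVertexAttribPointer when the VAO is set up.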

Performance increase? None, unless you’re switching VAOs a very high number of times per frame. The performance increase (if any) would also be entirely on the CPU side. All in all, I believe it’s better to focus on minimizing the number of VAO binds per frame, not the number of stored VAOs. Optimally each VAO should only be bound at most once per frame. Memory-wise, the data stored in a VAO is incredibly compact and shouldn’t use much more than one or two hundred bytes or so.

I think you misunderstood me, which may be due to my bad English (or just me explaining poorly).
I’m perfectly aware that one vao has exactly one vbo and how the binding works. Let me try to explain it in a bit more detail:

Let’s assume we have 1 vao with 1 vbo and 1 ibo attached to it. The mode of the vao is GL_TRIANGLES and the attributes are 3 vec4 (float).
Each frame we bind the vao once. We bind our shader program (and upload uniforms) and use glDrawElements to draw the vbo to the screen. Then we unbind the program and the vao and do other stuff waiting for the next frame. Until now everything should be pretty normal indexed vao rendering.

In code:


public void render() {
    vao.glBind();
    shaderProgram.glBind();
    shaderProgram.uploadUniforms();
    glDrawElements(GL_TRIANGLES, vao.size(), GL_UNSIGNED_INT, 0);
    shaderProgram.glUnbind();
    vao.glUnbind();
}

So here comes my “idea”:
The only things set in stone here are the mode (GL_TRIANGLES) and the vertex attributes. Let’s introduce “vao entries”, each represented by an object which specifies a data section in the vbo and ibo (upper and lower bound). We render these entries using the offset in glDrawElements. It looks something like:


public void render() {
    vao.glBind();
    shaderProgram.glBind();
    shaderProgram.uploadUniforms();
    for(VAOEntry e : allVAOEntries) {
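        // The last argument is a byte offset into the ibo; this assumes the bounds count indices (4 bytes each for GL_UNSIGNED_INT).
        glDrawElements(GL_TRIANGLES, e.upperBound - e.lowerBound, GL_UNSIGNED_INT, e.lowerBound * 4L);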
    }
    shaderProgram.glUnbind();
    vao.glUnbind();
}
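To make the “vao entries” idea concrete, here is a rough CPU-side sketch of how the entries might be produced while several meshes are appended to one shared vbo/ibo. It assumes the bounds are measured in indices and that indices are GL_UNSIGNED_INT. `SharedIndexBuffer` and its methods are hypothetical names. Note that each mesh’s indices have to be shifted by the number of vertices already stored, otherwise they would point at the first mesh’s vertices.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a CPU-side allocator for the shared index buffer.
final class SharedIndexBuffer {
    /** A section of the shared ibo, measured in indices (not bytes). */
    static final class Entry {
        final int lowerBound; // first index of the section
        final int upperBound; // one past the last index of the section

        Entry(int lowerBound, int upperBound) {
            this.lowerBound = lowerBound;
            this.upperBound = upperBound;
        }

        int count() { return upperBound - lowerBound; }

        /** Byte offset for glDrawElements, assuming GL_UNSIGNED_INT indices. */
        long byteOffset() { return (long) lowerBound * Integer.BYTES; }
    }

    private final List<Integer> indices = new ArrayList<>();
    private int vertexCount = 0;

    /** Appends one mesh; its indices are rebased onto the shared vertex buffer. */
    Entry append(int meshVertexCount, int[] meshIndices) {
        int lower = indices.size();
        for (int i : meshIndices) {
            indices.add(i + vertexCount); // shift past vertices already in the vbo
        }
        vertexCount += meshVertexCount;
        return new Entry(lower, indices.size());
    }

    /** The combined index data, ready to be uploaded to the ibo. */
    int[] toArray() {
        return indices.stream().mapToInt(Integer::intValue).toArray();
    }
}
```

An entry would then be drawn with something like glDrawElements(GL_TRIANGLES, e.count(), GL_UNSIGNED_INT, e.byteOffset()).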

Now we render sections of the vao content separately using the same shader program. The last adjustment I make is giving each entry a separate shader program, on the condition that all shader programs use the same vertex attributes.


public void render() {
    vao.glBind();
    for(VAOEntry e : allVAOEntries) {
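        e.glBindShaderProgram();
        e.uploadUniforms();
        // The last argument is a byte offset into the ibo; this assumes the bounds count indices (4 bytes each for GL_UNSIGNED_INT).
        glDrawElements(GL_TRIANGLES, e.upperBound - e.lowerBound, GL_UNSIGNED_INT, e.lowerBound * 4L);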
        e.glUnbindShaderProgram();
    }
    vao.glUnbind();
}
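With one program per entry, the loop above binds and unbinds a program for every draw, even when neighbouring entries share the same one. A small sketch of how redundant binds could be skipped is below. `ProgramBinder` is a hypothetical helper, and the `IntConsumer` stands in for the real bind call (such as glUseProgram), so the logic can be shown without a GL context.

```java
import java.util.function.IntConsumer;

// Hypothetical sketch; `useProgram` stands in for a call such as glUseProgram.
final class ProgramBinder {
    private final IntConsumer useProgram;
    private int current = 0;   // 0 means "no program bound yet"
    private int bindCount = 0; // number of real bind calls issued

    ProgramBinder(IntConsumer useProgram) {
        this.useProgram = useProgram;
    }

    /** Binds the program only if it differs from the one already bound. */
    void bind(int program) {
        if (program != current) {
            useProgram.accept(program);
            current = program;
            bindCount++;
        }
    }

    int bindCount() { return bindCount; }
}
```

This only pays off if the entries are ordered so that equal programs sit next to each other, and it requires dropping the per-entry unbind.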

Now I want to create only one VAO for the most common mode / vertex attrib combination and use entries and different shader programs to store completely different graphic objects in this vao (but all sharing the same mode and vertex attribs).

That reduces the number of vaos drastically, as I’m combining “to-draw objects” in one vao through the “interface” of a common mode and vertex attribs.

Hopefully it’s more comprehensible now.

Okay, well, I did slightly misunderstand you then, but I still answered your question:

[quote]2. You shouldn’t reuse VAOs between shaders unless you manually assign attribute locations so you can guarantee that the shaders have the same vertex attribute locations.
[/quote]
As long as you do that it shouldn’t be hard to handle at all, but…

[quote]Performance increase? None, unless you’re switching VAOs a very high number of times per frame. The performance increase (if any) would also be entirely on the CPU side. All in all, I believe it’s better to focus on minimizing the number of VAO binds per frame, not the number of stored VAOs. Optimally each VAO should only be bound at most once per frame. Memory-wise, the data stored in a VAO is incredibly compact and shouldn’t use much more than one or two hundred bytes or so.
[/quote]
Another question is what other batching opportunities are lost by batching VAOs. By batching according to VAOs, you might have to bind shaders and textures a lot more, or submit more uniforms. I think you’re overthinking this a bit. Just go with the simplest approach and optimize it if it’s a bottleneck.
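The trade-off can be illustrated by counting state changes for the same set of draws under two different sort orders. This is only a toy model with hypothetical names (`Draw`, `StateChangeCounter`); it counts binds and says nothing about what each bind actually costs on a given driver.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical toy model: a draw is identified only by its VAO and program.
final class Draw {
    final int vao;
    final int program;

    Draw(int vao, int program) {
        this.vao = vao;
        this.program = program;
    }
}

final class StateChangeCounter {
    static final Comparator<Draw> BY_VAO_THEN_PROGRAM =
            Comparator.comparingInt((Draw d) -> d.vao).thenComparingInt(d -> d.program);
    static final Comparator<Draw> BY_PROGRAM_THEN_VAO =
            Comparator.comparingInt((Draw d) -> d.program).thenComparingInt(d -> d.vao);

    /** Returns {vaoBinds, programBinds} when drawing in the given order. */
    static int[] count(List<Draw> order) {
        int vaoBinds = 0, programBinds = 0;
        int vao = -1, program = -1; // -1 means "nothing bound yet"
        for (Draw d : order) {
            if (d.vao != vao) { vao = d.vao; vaoBinds++; }
            if (d.program != program) { program = d.program; programBinds++; }
        }
        return new int[] { vaoBinds, programBinds };
    }

    /** Returns a sorted copy, leaving the input list untouched. */
    static List<Draw> sorted(List<Draw> draws, Comparator<Draw> cmp) {
        List<Draw> copy = new ArrayList<>(draws);
        copy.sort(cmp);
        return copy;
    }
}
```

For two VAOs that are each drawn with the same two programs, sorting by VAO first gives 2 VAO binds and 4 program binds, while sorting by program first gives 4 VAO binds and 2 program binds. Which order is cheaper depends on the relative cost of the two kinds of bind.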

Okay thank you for your answer!

Just to plan for the “worst case”:
Let’s say I’ve got ~200 vaos with ~100 vertices each. Do you think that amount of data and switching would be too much for the average system?
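For the “amount of data” part, a quick back-of-the-envelope estimate, assuming the 3 vec4 float attributes per vertex from the earlier example (12 floats, 48 bytes) and ignoring index data. The helper name is hypothetical.

```java
// Hypothetical helper for a rough estimate of raw vertex data size.
final class VertexDataEstimate {
    /** Total bytes of float vertex data across all vaos (index data not included). */
    static long bytes(int vaoCount, int verticesPerVao, int floatsPerVertex) {
        return (long) vaoCount * verticesPerVao * floatsPerVertex * Float.BYTES;
    }
}
```

Under those assumptions, VertexDataEstimate.bytes(200, 100, 12) comes to 960,000 bytes, so a little under 1 MB of vertex data. This says nothing about the cost of the switching itself.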

If you ever need that much then it means you’re probably doing something wrong.