I’m trying to understand OpenGL a little better and have come across two short questions. I’m using LWJGL, if it’s relevant.
- When writing GLSL shaders, should I add parentheses to control the order of multiplication? I’ve read that matrix-matrix multiplication is slower than matrix-vector multiplication, e.g.
(matrix1 * matrix2) * vector1
is slower than
matrix1 * (matrix2 * vector1)
Specifically, my vertex shader has this line, where projection, view, and model are mat4s and position is a vec3:
gl_Position = projection * view * model * vec4(position, 1.0);
and I’m not sure whether I should let the GLSL compiler decide how to evaluate this, or add parentheses to force the cheaper multiplication order.
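To make the cost difference concrete, here is a minimal Java sketch that counts scalar multiplications for both groupings. The class `MulOrder` and its methods are hypothetical helpers written for this post, not part of LWJGL or GLSL. It assumes column-major 4x4 matrices and a naive multiply, and it relies on `*` being left-associative in GLSL, so the unparenthesized line groups as ((projection * view) * model) * vec4. Whether a real driver's compiler reorders or hoists any of this is not something the sketch can show.

```java
// Hypothetical helper (not an LWJGL class): counts scalar multiplies for
// the two ways of grouping projection * view * model * position.
final class MulOrder {
    static int multiplies = 0; // running count of scalar multiplications

    // 4x4 identity, column-major: element (row r, column c) lives at [c * 4 + r].
    static float[] identity() {
        float[] m = new float[16];
        m[0] = m[5] = m[10] = m[15] = 1f;
        return m;
    }

    // Naive mat4 * mat4: 64 scalar multiplies.
    static float[] mulMat(float[] a, float[] b) {
        float[] out = new float[16];
        for (int c = 0; c < 4; c++) {
            for (int r = 0; r < 4; r++) {
                float sum = 0f;
                for (int k = 0; k < 4; k++) {
                    sum += a[k * 4 + r] * b[c * 4 + k];
                    multiplies++;
                }
                out[c * 4 + r] = sum;
            }
        }
        return out;
    }

    // Naive mat4 * vec4: 16 scalar multiplies.
    static float[] mulVec(float[] m, float[] v) {
        float[] out = new float[4];
        for (int r = 0; r < 4; r++) {
            float sum = 0f;
            for (int k = 0; k < 4; k++) {
                sum += m[k * 4 + r] * v[k];
                multiplies++;
            }
            out[r] = sum;
        }
        return out;
    }

    // ((projection * view) * model) * position: two mat-mat, one mat-vec.
    static int countLeftToRight(float[] p, float[] v, float[] m, float[] pos) {
        multiplies = 0;
        mulVec(mulMat(mulMat(p, v), m), pos);
        return multiplies;
    }

    // projection * (view * (model * position)): three mat-vec.
    static int countRightToLeft(float[] p, float[] v, float[] m, float[] pos) {
        multiplies = 0;
        mulVec(p, mulVec(v, mulVec(m, pos)));
        return multiplies;
    }

    public static void main(String[] args) {
        float[] id = identity();
        float[] pos = {1f, 2f, 3f, 1f};
        System.out.println("left-to-right: " + countLeftToRight(id, id, id, pos));
        System.out.println("right-to-left: " + countRightToLeft(id, id, id, pos));
    }
}
```

With this naive multiply, the left-to-right grouping costs 144 scalar multiplies per vertex and the right-to-left grouping costs 48.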
- I’m using glDrawElementsInstanced to draw a bunch of cubes (~3,000,000). I’ve divided the world into chunks, with the cubes evenly distributed between the chunks. Each chunk has its own VAO and issues its own draw call. As an optimization, I thought that aggregating the cubes from all chunks into 1 VAO and reducing to 1 draw call might give a performance increase.
I set up an example with 4 chunks and cubes evenly distributed between the chunks (so ~750,000 cubes per chunk). But performance either decreased by ~1 fps or didn’t change when I used an aggregated VAO and 1 draw call as opposed to 4. I’m assuming 4 draw calls and VAOs simply don’t have enough overhead to be measurable, and I’ll try with more.
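For reference, the aggregation step I have in mind looks roughly like the sketch below. It assumes each instance is described by 3 floats (an x, y, z offset), which is an assumption for illustration and not necessarily the real layout. `ChunkAggregator` is a hypothetical name. It uses a heap `FloatBuffer` to stay self-contained; an actual upload through LWJGL would need a direct buffer.

```java
import java.nio.FloatBuffer;
import java.util.List;

// Hypothetical helper: concatenates per-chunk instance data into one buffer
// that could back a single instanced draw call.
final class ChunkAggregator {
    // Assumed layout: one (x, y, z) offset per cube instance.
    static final int FLOATS_PER_INSTANCE = 3;

    // Copies every chunk's instance offsets, in order, into one buffer.
    static FloatBuffer aggregate(List<float[]> chunkOffsets) {
        int total = 0;
        for (float[] chunk : chunkOffsets) {
            total += chunk.length;
        }
        // Heap buffer for the sketch only; OpenGL uploads need a direct buffer.
        FloatBuffer all = FloatBuffer.allocate(total);
        for (float[] chunk : chunkOffsets) {
            all.put(chunk);
        }
        all.flip(); // switch from writing to reading
        return all;
    }

    // Number of instances the aggregated buffer describes.
    static int instanceCount(FloatBuffer all) {
        return all.remaining() / FLOATS_PER_INSTANCE;
    }
}
```

The instance count returned here is the value that would be passed as the last argument of the single glDrawElementsInstanced call.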
- Update: I’ve tried with more chunks. For this setup, I had ~2,100,000 cubes evenly distributed over 1024 chunks. Having 1024 VAOs and draw calls gave ~30 fps. Having 1 aggregate VAO and 1 draw call gave ~29 fps.
Similarly, ~4,100,000 cubes distributed over 2000 chunks gave ~15 fps both with and without the aggregated VAO.
Are there downsides to draw calls with too many instances?
Are there any downsides to a couple thousand draw calls per frame?
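Since fps is not a linear scale, it may help to convert these numbers to frame times and throughput. The sketch below does that arithmetic; `FrameStats` is a hypothetical helper, and the inputs are the approximate figures quoted above, so the results are only as precise as those measurements.

```java
// Hypothetical helper: converts the measured fps figures into frame time
// and cubes drawn per second.
final class FrameStats {
    // Milliseconds spent on one frame at the given frame rate.
    static double frameTimeMs(double fps) {
        return 1000.0 / fps;
    }

    // Cubes drawn per second at the given cube count and frame rate.
    static double cubesPerSecond(long cubes, double fps) {
        return cubes * fps;
    }

    public static void main(String[] args) {
        System.out.printf("1024 draw calls: %.2f ms/frame%n", frameTimeMs(30));
        System.out.printf("1 draw call:     %.2f ms/frame%n", frameTimeMs(29));
        System.out.printf("2.1M cubes at 30 fps: %.1f M cubes/s%n",
                cubesPerSecond(2_100_000L, 30) / 1e6);
        System.out.printf("4.1M cubes at 15 fps: %.1f M cubes/s%n",
                cubesPerSecond(4_100_000L, 15) / 1e6);
    }
}
```

That works out to roughly 33.3 ms versus 34.5 ms per frame for 1024 draw calls versus 1, and about 63 million versus 61.5 million cubes per second for the two scene sizes. The near-constant throughput suggests the frame time is tracking the number of cubes rather than the number of draw calls, though that is only an inference from these few data points.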
All these tests were done with static cubes. Are there any downsides that would become more apparent with dynamic cubes?