Renderer Optimization


Are there any tips on how to optimize a renderer?

For example:

  • Don't create new objects in the render loop (because of GC)
  • Don't use too many shaders

and so on.

Your question is not precise enough. A renderer is nothing more than a bunch of lines of code. So in terms of “optimization” you have to ask the same question as for any other module: What do you want to optimize?

If you’re talking about performance optimization, there are also some preconditions that determine how to optimize your renderer: Which platforms are you targeting? Which kind of game or application do you need to render? Which graphics APIs are you targeting, and how many?

If you want to decrease the duration of your “draw” call, I recommend measuring CPU and GPU timings and reducing them step by step. If you’re CPU bound, you most likely have to move resource initialization into some kind of init phase. If you’re GPU bound, you have to take a closer look at the graphics API you’re using and how you can speed up your calls.
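To make the "measure first" advice concrete, here is a minimal sketch of a per-section CPU timer. The class name `FrameProfiler` and the section names are made up for illustration, and it only covers the CPU side; GPU timings have to come from the graphics API itself.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class FrameProfiler:
    """Accumulates wall-clock CPU time per named section of a frame.

    Hypothetical helper for illustration; not part of any library.
    """

    def __init__(self, clock=time.perf_counter):
        self.clock = clock              # injectable so it can be faked
        self.totals = defaultdict(float)
        self.frames = 0

    @contextmanager
    def section(self, name):
        start = self.clock()
        try:
            yield
        finally:
            # Add the elapsed seconds to this section's running total.
            self.totals[name] += self.clock() - start

    def end_frame(self):
        self.frames += 1

    def averages_ms(self):
        """Average milliseconds per frame for each section, slowest first."""
        if self.frames == 0:
            return []
        avg = {n: t / self.frames * 1000.0 for n, t in self.totals.items()}
        return sorted(avg.items(), key=lambda kv: kv[1], reverse=True)
```

In a game loop you would wrap the update and draw code in `with profiler.section("update"):` and `with profiler.section("draw"):`, call `end_frame()` once per frame, and look at the top entry of `averages_ms()` before changing anything.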

Rendering can be bottlenecked by different things. The key is to optimize the bottleneck or you won’t actually see a performance increase.

CPU optimizations:

Avoid garbage generation in your entire game. Routine operations should not generate garbage or you will get regular stuttering.

If your bottleneck is your game logic, then speeding up the renderer won’t do much for your frame rate, so focus on optimizing the game logic in that case.

The OpenGL driver has quite a bit of CPU overhead. The cost of a draw call (glDrawArrays(), glDrawElements() and their variations) is roughly proportional to the amount of OpenGL state you changed since the previous draw call. An FBO bind is very expensive, shader binds are pretty expensive, while texture, uniform and VAO binds are cheap. The frequency of said binds should be inversely proportional to their cost: You usually just bind a handful of FBOs each frame, you may have a few shaders used for each FBO, and you could be using a crapload of textures, uniforms and VAO binds for each shader.
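One common way to get that bind frequency is to sort the frame's draw commands by state, most expensive first. The sketch below does not touch OpenGL; `DrawCommand` and its integer handles are hypothetical stand-ins, and `count_binds` just counts how often each kind of state would have to change.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrawCommand:
    # Hypothetical integer handles standing in for real GL object names.
    fbo: int
    shader: int
    texture: int
    mesh: str


def sort_by_state(commands):
    """Order commands so the most expensive state changes least often."""
    return sorted(commands, key=lambda c: (c.fbo, c.shader, c.texture))


def count_binds(commands):
    """Count how many FBO, shader and texture binds a command list needs."""
    binds = {"fbo": 0, "shader": 0, "texture": 0}
    current = {"fbo": None, "shader": None, "texture": None}
    for cmd in commands:
        for kind in binds:
            value = getattr(cmd, kind)
            if current[kind] != value:   # state differs: a bind is needed
                current[kind] = value
                binds[kind] += 1
    return binds
```

Comparing `count_binds(commands)` with `count_binds(sort_by_state(commands))` shows how many binds the sort saves for a given frame. A real renderer also has to respect ordering constraints such as transparency, which this sketch ignores.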

Batch draw calls together. Each draw call has a cost, so reducing the number of draw calls (especially reducing them to a constant number regardless of the number of objects) is an extremely powerful optimization.
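As a rough sketch of batching, the class below collects the vertices of many sprites into one buffer and hands them over in a single call. `SpriteBatch` and the `submit` callback are assumptions for illustration; in a real renderer `submit` would upload the buffer and issue one draw call.

```python
from array import array


def quad_vertices(x, y, w, h):
    """Two triangles (6 vertices, x/y only) covering an axis-aligned rect."""
    return [x, y,  x + w, y,  x + w, y + h,
            x, y,  x + w, y + h,  x, y + h]


class SpriteBatch:
    """Hypothetical batcher: many sprites in, one submission out."""

    def __init__(self):
        self.vertices = array("f")   # reused between frames, never reallocated here
        self.draw_calls = 0

    def add(self, x, y, w, h):
        self.vertices.extend(quad_vertices(x, y, w, h))

    def flush(self, submit):
        """Send everything collected so far in one call, then reset."""
        if len(self.vertices) == 0:
            return
        submit(self.vertices)        # stand-in for buffer upload + one draw call
        self.draw_calls += 1
        del self.vertices[:]         # empty the buffer but keep the object
```

With this layout, 1000 sprites cost one submission instead of 1000, as long as they share the same shader and texture.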

If your CPU performance is the bottleneck, consider offloading work to the GPU. Better balance = better overall performance. This is especially useful for 2D games as they generally have very low GPU load and very high CPU load. For example, instead of drawing a tilemap as quads (2 triangles each), you can draw a huge quad with a shader that chooses a texture for each tile. This requires less work for the CPU by having the GPU do slightly more work.
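To show what such a tilemap shader would compute per fragment, here is the lookup written as plain Python. This is a CPU-side emulation under assumed conventions (a flat `tile_ids` list in row-major order and a regular texture atlas); a real version would be a fragment shader reading the tile IDs from a texture or buffer.

```python
def tile_atlas_uv(u, v, map_w, map_h, tile_ids, atlas_cols, atlas_rows):
    """Map a position on the big quad (u, v in [0, 1]) to atlas coordinates.

    tile_ids is a row-major list of map_w * map_h tile indices; the atlas is
    assumed to be a regular grid of atlas_cols x atlas_rows tiles.
    """
    # Position measured in tiles.
    tx = u * map_w
    ty = v * map_h
    # Which tile we are in (clamped so u == 1.0 stays inside the map).
    col = min(int(tx), map_w - 1)
    row = min(int(ty), map_h - 1)
    # Position inside that tile, in [0, 1].
    fx = tx - col
    fy = ty - row
    # Look up which atlas tile to show here.
    tile = tile_ids[row * map_w + col]
    atlas_col = tile % atlas_cols
    atlas_row = tile // atlas_cols
    return ((atlas_col + fx) / atlas_cols, (atlas_row + fy) / atlas_rows)
```

The CPU then only has to draw one quad and update the tile ID data when the map changes, while the per-pixel lookup happens on the GPU.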

Precompute things that you can precompute, for example the vertex data of tile maps and voxel worlds. If you need to dynamically update stuff, chunk together data so you only need to regenerate the affected chunks when a change happens (instead of the entire world).
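The chunking idea can be sketched like this. `ChunkedTileMap`, the chunk size of 16 and the "mesh" format (a list of tuples) are all assumptions for illustration; the point is only that a tile change marks one chunk dirty and only dirty chunks get rebuilt.

```python
CHUNK_SIZE = 16  # tiles per chunk side; an arbitrary choice for this sketch


class ChunkedTileMap:
    """Hypothetical tile map that rebuilds geometry per chunk, not per world."""

    def __init__(self, width, height):
        self.width, self.height = width, height
        self.tiles = [[0] * width for _ in range(height)]
        self.meshes = {}
        chunks_x = (width + CHUNK_SIZE - 1) // CHUNK_SIZE
        chunks_y = (height + CHUNK_SIZE - 1) // CHUNK_SIZE
        # Everything needs building once at startup.
        self.dirty = {(cx, cy) for cx in range(chunks_x) for cy in range(chunks_y)}

    def set_tile(self, x, y, tile):
        if self.tiles[y][x] != tile:
            self.tiles[y][x] = tile
            self.dirty.add((x // CHUNK_SIZE, y // CHUNK_SIZE))

    def _build_chunk(self, cx, cy):
        # Stand-in for real vertex generation: one entry per non-empty tile.
        mesh = []
        for y in range(cy * CHUNK_SIZE, min((cy + 1) * CHUNK_SIZE, self.height)):
            for x in range(cx * CHUNK_SIZE, min((cx + 1) * CHUNK_SIZE, self.width)):
                if self.tiles[y][x] != 0:
                    mesh.append((x, y, self.tiles[y][x]))
        return mesh

    def update_meshes(self):
        """Rebuild only dirty chunks; returns how many were rebuilt."""
        rebuilt = len(self.dirty)
        for cx, cy in self.dirty:
            self.meshes[(cx, cy)] = self._build_chunk(cx, cy)
        self.dirty.clear()
        return rebuilt
```

Call `update_meshes()` once per frame before drawing. On a 64x64 map, changing a couple of neighbouring tiles rebuilds one chunk out of sixteen.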

GPU optimizations:

The GPU has lots of dedicated hardware that can bottleneck your game. Figuring out where the bottleneck lies is key to improving GPU performance. Use OpenGL timer queries to figure out how much time each operation takes instead of playing guessing games.

Using shaders is not slow. GPUs nowadays emulate all fixed functionality features with shaders anyway, so it’s not like they’re “not used”.

If applicable, make sure you use indexed rendering. Indexed rendering gives the GPU a chance to reuse vertices instead of running the vertex shader for the same vertex twice, which can save you a lot of performance. For example, in a simple grid each vertex is reused an average of 5 times.
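The grid figure can be checked with a small sketch. The function below (a hypothetical helper, with positions only) builds an indexed grid of quads split into two triangles each; an interior vertex ends up shared by six triangles, so the ratio of indices to unique vertices approaches 6 for large grids.

```python
def build_indexed_grid(cols, rows):
    """Return (vertices, indices) for a grid of cols x rows quads.

    Vertices are unique (x, y) positions; indices describe two triangles
    per quad and refer back into the vertex list.
    """
    vertices = [(x, y) for y in range(rows + 1) for x in range(cols + 1)]
    indices = []
    stride = cols + 1                    # vertices per grid row
    for y in range(rows):
        for x in range(cols):
            i0 = y * stride + x          # corner at (x, y)
            i1 = i0 + 1                  # corner at (x + 1, y)
            i2 = i0 + stride             # corner at (x, y + 1)
            i3 = i2 + 1                  # corner at (x + 1, y + 1)
            indices += [i0, i1, i3,  i0, i3, i2]
    return vertices, indices
```

For a 100x100 grid this gives 10201 unique vertices and 60000 indices. Without an index buffer, all 60000 vertices would have to be stored and could each be run through the vertex shader.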

If your fragment shader simply reads a single texture and multiplies together some colors, you’re most likely bottlenecked by the ROPs (which write the result of the shader to the FBO), so don’t be afraid to use more complicated shaders if it can save you work somewhere else. Adding more code to the fragment shader is free up to a certain point.

If you gave us more information on your general use case, then we could give you more specific tips. At this point, the only real tips we can give you are 1. find the bottleneck and 2. fix the bottleneck.