Normal for glClearColor and glClear to be slower than the actual rendering?

I recently profiled my game (using LWJGL) using the NetBeans profiler and to my surprise, the initial pixel clear function was taking LONGER to execute than rendering 10 thousand sprites a frame.

This really baffles me and I wanted to ask if anyone has any idea why?

It’s not just a slight difference either: the glClear and glClearColor methods are taking over 2x the time that doRender() is, and doRender actually renders about 10k-20k sprites per frame, setting up buffers, binding shaders, calculating tex coords and finally calling glDrawElements.

Now I’m just super confused and I’d like to see if anyone has any clues as to why this is.

The function looks like this:

    @Override
    public void begin() {
        GL11.glClearColor(0, 0, 0, 1);
        GL11.glClear(GL11.GL_COLOR_BUFFER_BIT);
        a = r = g = b = (byte)0xff;
        scissorRect = null;
    }

Couple of possibilities spring to mind:

  1. It’s possible that GL is queuing the commands and only flushing when glClear() gets called - unlikely, though I have seen other odd ‘bottle-necks’ when profiling before. Might be worth adding glFlush() at the end of the rendering code? (or would swapping the buffers force the flush anyway?)

  2. Are you using vsync? It might just be waiting for the sync before starting the clear?

  1. Tried that, it makes no difference, and yes I do call Display.update() every single time after I’m done rendering. (Buffer swap)

  2. No, I’m not using vsync. I still get over 200 FPS with this code but I just thought this anomaly was kind of strange, like it shouldn’t be happening.

Agreed, it’s very odd. A bit of googling reveals that a few other people have had vaguely similar issues, but there doesn’t appear to be any consensus.
Maybe some other forumistas can suggest why this is happening?

Throw a glFinish in to see if that does anything to your profiling.

Chances are it’s just queueing commands, and only executing them on glClear. Nothing to worry about if you’re getting 200 FPS.

Also; you don’t need to set the clear color every frame. Since GL is state-based, you can just set it once during initialization.

Thanks for the tip.

Surprisingly, I changed my glClearColor function to this:

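    public static void glClearColor(float cR, float cG, float cB, float cA) {
        // GL11.glClearColor expects (red, green, blue, alpha), so the parameters follow that order.
        // Only forward the call to the driver when the clear color actually changes.
        if (clearR != cR || clearG != cG || clearB != cB || clearA != cA) {
            clearR = cR;
            clearG = cG;
            clearB = cB;
            clearA = cA;
            GL11.glClearColor(cR, cG, cB, cA);
        }
    }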

And now the glClear() function is taking about as long to execute as the rendering step.

I still don’t understand how glClearColor() could be such a bottleneck though? Seems super strange to me.

Did your actual FPS change?

Measuring GPU performance using CPU benchmarking won’t give accurate results. I’ve chased ghosts quite a few times simply because the driver’s command queue got full causing a random OpenGL command to block. You’re calling glClear() right after Display.update(), so it’s very possible that the driver simply blocks on the first OpenGL call(s) because it has buffered enough commands (1-3 frames ahead usually). There’s actually no way to know for sure what’s happening, since it’s completely up to the driver.
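To illustrate the blocking described above, here is a toy model in plain Java. It makes no OpenGL calls and it is not how any particular driver works. The class name, the method name and all parameters are made up for the example, and it assumes the GPU only starts a frame once that frame has been fully submitted. It only shows how a CPU profiler can end up charging the GPU's backlog to whichever call comes first in the frame.

```java
/** Toy model of a driver that lets the CPU run a fixed number of frames ahead of the GPU. */
final class DriverQueueModel {

    /**
     * Returns, per frame, how long (in ms of virtual time) the first GL call of that
     * frame would block while the driver waits for an older frame to finish.
     * All four parameters are hypothetical knobs of this model.
     */
    static double[] firstCallStallMs(int frames, int maxQueuedFrames,
                                     double cpuMsPerFrame, double gpuMsPerFrame) {
        double[] gpuDone = new double[frames]; // virtual time at which the GPU finishes frame i
        double[] stall = new double[frames];   // CPU time spent blocked at the start of frame i
        double cpuClock = 0.0;

        for (int i = 0; i < frames; i++) {
            // The driver refuses to accept frame i until frame (i - maxQueuedFrames) is done.
            int mustBeDone = i - maxQueuedFrames;
            if (mustBeDone >= 0 && gpuDone[mustBeDone] > cpuClock) {
                stall[i] = gpuDone[mustBeDone] - cpuClock; // shows up in the first GL call
                cpuClock = gpuDone[mustBeDone];
            }

            // The CPU records the frame's commands (cheap, no waiting involved).
            cpuClock += cpuMsPerFrame;

            // The GPU starts the frame once it is submitted and the previous frame is finished.
            double gpuStart = (i == 0) ? cpuClock : Math.max(cpuClock, gpuDone[i - 1]);
            gpuDone[i] = gpuStart + gpuMsPerFrame;
        }
        return stall;
    }
}
```

With `firstCallStallMs(6, 2, 1.0, 4.0)` the first two frames don't stall at all, and every later frame spends 3 ms blocked in its first call even though submitting the whole frame only costs 1 ms of CPU time. A CPU profiler would blame that wait on whatever the first call happens to be, which in this thread is glClearColor or glClear.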

You could try to measure the actual GPU time it takes to clear the screen using


int myQuery = glGenQueries();

...

glBeginQuery(GL_TIME_ELAPSED, myQuery);
//Clear screen
glEndQuery(GL_TIME_ELAPSED);

Then you can get the time it took to clear the screen in nanoseconds using glGetQueryObjecti(myQuery, GL_QUERY_RESULT). Note that getting the results right after calling glEndQuery() may cause performance problems since the CPU has to wait until the result is ready. I recommend that you get the results with a whole frame’s delay, meaning you’ll need two sets of query objects.

  1. use query object 1 when rendering frame 1
  2. use query object 2 when rendering frame 2
  3. get result from frame 1
  4. use query object 1 when rendering frame 3
  5. get result from frame 2
    and so on.
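The alternating scheme above could be sketched roughly like this. So that it runs without a GL context, the query calls sit behind a small interface. `TimerQueries` and `DelayedGpuTimer` are names invented for this sketch, not LWJGL classes. In a real program I'd expect the four methods to forward to glGenQueries, glBeginQuery(GL_TIME_ELAPSED, id), glEndQuery(GL_TIME_ELAPSED) and a glGetQueryObject variant with GL_QUERY_RESULT, but treat that mapping as an assumption and check it against your LWJGL version.

```java
/** Hypothetical stand-in for the GL timer-query calls, so the sketch runs without a GL context. */
interface TimerQueries {
    int gen();                 // assumed to map to glGenQueries()
    void begin(int id);        // assumed to map to glBeginQuery(GL_TIME_ELAPSED, id)
    void end();                // assumed to map to glEndQuery(GL_TIME_ELAPSED)
    long resultNanos(int id);  // assumed to map to a glGetQueryObject call with GL_QUERY_RESULT
}

/** Alternates between two query objects and reads each result one frame late. */
final class DelayedGpuTimer {
    private final TimerQueries gl;
    private final int[] ids = new int[2];
    private long frame = 0;

    DelayedGpuTimer(TimerQueries gl) {
        this.gl = gl;
        ids[0] = gl.gen();
        ids[1] = gl.gen();
    }

    /**
     * Wraps the given work (e.g. the clear) in this frame's query, then reads the
     * query issued during the previous frame.
     * Returns the previous frame's time in nanoseconds, or -1 on the very first frame.
     */
    long time(Runnable work) {
        int current = ids[(int) (frame % 2)];
        gl.begin(current);
        work.run();
        gl.end();

        long previousNanos = -1;
        if (frame > 0) {
            // The other query object was used one frame ago, so its result should
            // be ready (or nearly ready) by now and the CPU rarely has to wait.
            int previous = ids[(int) ((frame + 1) % 2)];
            previousNanos = gl.resultNanos(previous);
        }
        frame++;
        return previousNanos;
    }
}
```

Per frame you would call something like `timer.time(() -> GL11.glClear(GL11.GL_COLOR_BUFFER_BIT))` and ignore the -1 from the first frame. Each query object is read before it gets reused, which matches the order in the list above.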

Gxu1994 - Well, it’s not a “bottleneck” since you’re at 200 FPS. It might just be slower in comparison to everything else (especially if the other stuff is working on the GPU).

As others have already hinted, most OpenGL calls are asynchronous: the driver queues the command and returns before the GPU has executed it. Some calls may block under certain circumstances, which may also differ from implementation to implementation.

Ah yeah, I’ll give all that a try.

Thanks everyone! :slight_smile: