Performance problems with glTexImage2D

In JEmu2, the following call is made only once per frame. It uploads the emulated backbuffer to an OpenGL texture, which is later displayed.


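		// buffer is a ByteBuffer holding the emulated frame as RGBA bytes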
		GL11.glTexImage2D(
			GL11.GL_TEXTURE_2D,
			0,
			GL11.GL_RGBA,
			texWidth,
			texHeight,
			0,
			GL11.GL_RGBA,
			GL11.GL_UNSIGNED_BYTE,
			buffer);

The emulator spends almost 40% of its time in this single call (this is on an IBM laptop with an ATI video card), so this technique can be dog slow. I can imagine it not being particularly fast, but 40% is just unacceptable and sure looks like a performance bug.
Is there some trick I don’t know about that’s bound to be faster than this?

I have a few options:

  • Revert to Java2D - not preferable given performance problems on some platforms, fullscreen instability on some platforms, no control over VSync, no hardware-accelerated filtering, timing problems on pre-Java 5, etc.
  • Try SDL - I don’t have any experience with SDL, but it’s the default cross-platform 2D rendering API, so it’s worth checking out.
  • Try J3D with the D3D renderer on Windows and the OGL renderer on other platforms - Might work, but adds a lot of overhead.
  • Convert all rendering in JEmu2 to OpenGL (currently everything is software rendered except the final image) - Might not work, because then this call would be made many times per frame in some games, although with less data per texture. It might also lead to more incompatibility because I’d need more OpenGL features.
  • Ignore this performance problem :stuck_out_tongue:

But I hope I don’t have to go to any of these alternatives because I’m just doing something wrong :slight_smile:

Any thoughts?

What’s the width/height of the texture?

As the true resolution of those games is something like 320x240, I guess you are uploading a 512x256 texture? If you are using non-power-of-two (NPOT) dimensions, that could be another performance hit on ATi.
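(If it helps, here is a minimal sketch of rounding a framebuffer size up to power-of-two texture dimensions; the nextPowerOfTwo helper is illustrative, not something from JEmu2:)

		// Rounds n up to the next power of two,
		// e.g. nextPowerOfTwo(320) == 512, nextPowerOfTwo(240) == 256
		static int nextPowerOfTwo(int n) {
			int p = 1;
			while (p < n) {
				p <<= 1;
			}
			return p;
		}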

Uploading a 1024x1024 RGBA texture takes 10ms on my system. A 512x256 texture holds an eighth as many pixels, so it should take roughly 1-2ms.

Note: don’t resort to RGB instead of RGBA, because that is 20x slower on ATi.
(search this forum for the benchmarks on various systems)

Besides all that:
What’s the use of OpenGL and hefty stuff like hardware acceleration, if all you want to do is blit a custom pixel buffer?

And then I reread your post :slight_smile:

One more thing… 40% of CPU time spent in that method can also mean that everything else in your emulator is lightning fast.

It’s more useful to measure how long the upload actually takes each frame.
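For example (a sketch; the glFinish forces the driver to actually complete the upload before the clock stops, otherwise the work may be deferred):

		long start = System.nanoTime();
		GL11.glTexImage2D(
			GL11.GL_TEXTURE_2D, 0, GL11.GL_RGBA,
			texWidth, texHeight, 0,
			GL11.GL_RGBA, GL11.GL_UNSIGNED_BYTE, buffer);
		GL11.glFinish(); // wait until the upload has really finished
		long millis = (System.nanoTime() - start) / 1000000L;
		System.out.println("Upload took " + millis + " ms");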

You should be using glTexSubImage2D, not glTexImage2D. Bingo, performance problems gone. Unless you get the pixel formats wrong and it has to go through a conversion.
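Something like this, for example (a minimal sketch; textureId stands for a texture object you generate and bind once at startup, and buffer/texWidth/texHeight are as in the original post):

		// Once, at startup: allocate the texture storage
		GL11.glBindTexture(GL11.GL_TEXTURE_2D, textureId);
		GL11.glTexImage2D(
			GL11.GL_TEXTURE_2D, 0, GL11.GL_RGBA,
			texWidth, texHeight, 0,
			GL11.GL_RGBA, GL11.GL_UNSIGNED_BYTE, buffer);

		// Every frame: overwrite the existing storage instead of reallocating it
		GL11.glBindTexture(GL11.GL_TEXTURE_2D, textureId);
		GL11.glTexSubImage2D(
			GL11.GL_TEXTURE_2D, 0,
			0, 0,                 // x/y offset into the texture
			texWidth, texHeight,  // size of the region being updated
			GL11.GL_RGBA, GL11.GL_UNSIGNED_BYTE, buffer);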

Cas :slight_smile:

Thanks, I’ll look into that.

Yay, it went from 40% down to 4% when I changed to glTexSubImage2D! :smiley:

:o Good to know.

Yes, I never would have expected the difference to be this large.

glTexImage2D has to deallocate and then reallocate texture RAM every time it is called. That’s why all the advice plastered all over the internet says to use glTexSubImage2D once you have created your initial texture :stuck_out_tongue: I’d call you a n00b except JEmu is so amazing hehe :wink:

Cas :slight_smile:

But I AM a n00b! :smiley:
(when it comes to OpenGL anyway)