… which is why Nvidia and AMD spend a lot of money supporting game development. Ever seen “The Way It’s Meant To Be Played” and Nvidia’s logo when starting a game? That’s because Nvidia helped them out, and obviously added some optimizations specific to their GeForce cards. Game makers obviously want equal performance from both Nvidia and AMD cards, but those companies are competing and each wants to look faster than the other.
While writing my bachelor’s thesis about CUDA I also learned a lot about how GPUs work.
For anyone who is interested in a more in-depth view of how the GPU works, I really recommend taking a look at the CUDA documentation from NVIDIA, which describes the architecture in a very accessible way.
I have continued my experiments, this time trying to see the relative effects of retained mode versus immediate mode. I built a vertex buffer of 20k vertexes and drew it using either retained or immediate mode. I did this in a cycle, increasing the number of times I drew the vertexes each cycle to the square of the iteration (1, 4, 9, 16, …). To exclude pixel fill rate as an issue I set the camera very far from the scene (thus reducing the number of pixels being drawn).
Anyway, here are my results, if you’ve ever wondered whether there is much difference between these two modes. (Each loop terminates when the frame rate goes below 40; a simplified sketch of the benchmark loop is below the results.)
Draws per frame : Retained FPS  Immediate FPS   (20k vertexes per draw; ‘-’ = loop already terminated)
1 : 274 244
4 : 246 193
9 : 227 166
16 : 179 128
25 : 138 93
36 : 110 65
49 : 93 49
64 : 78 39
81 : 64 -
100 : 55 -
121 : 47 -
144 : 40 -
Interestingly, I saw a much smaller difference when my vertex buffer was only 10k vertexes.
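Roughly, each mode’s loop was structured like this (a simplified sketch, not my actual test code; measureAverageFps() is a stand-in for the frame-timing and rendering code, which draws the buffer the given number of times per frame):

    // For one drawing mode: draw the 20k-vertex buffer i*i times per frame
    // and stop once the average frame rate drops below 40 FPS.
    int iteration = 1;
    while (true) {
        int drawsPerFrame = iteration * iteration;   // 1, 4, 9, 16, ...
        int fps = measureAverageFps(drawsPerFrame);  // stand-in: render some frames, return average FPS
        System.out.println(drawsPerFrame + " : " + fps);
        if (fps < 40) {
            break;                                   // this mode's loop terminates here
        }
        iteration++;
    }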
The overhead of immediate mode can be offset by having a good CPU. A less balanced computer (or one better balanced for gaming?) might have a better GPU and a weaker CPU, so it might suffer a lot more from immediate mode (assuming you are comparing glBegin()/glEnd() to glDrawArrays()). Filling a vertex buffer can also be spread across multiple cores, something that is impossible with immediate mode. All in all, there’s no reason to use more CPU power than you need, since your game can most likely use any spare cycles for AI, physics, more entities, etc.
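For anyone new to this, the difference between the two call styles looks roughly like this (a minimal LWJGL 2-style sketch; the class and method names are just for illustration):

    import java.nio.FloatBuffer;
    import org.lwjgl.opengl.GL11;

    class DrawStyles {
        // Immediate mode: one JNI call per vertex, rebuilt on the CPU every single frame.
        static void drawImmediate(float[] verts) {
            GL11.glBegin(GL11.GL_TRIANGLES);
            for (int i = 0; i < verts.length; i += 3) {
                GL11.glVertex3f(verts[i], verts[i + 1], verts[i + 2]);
            }
            GL11.glEnd();
        }

        // Vertex arrays: hand the driver the whole buffer and issue a single draw call.
        static void drawVertexArray(FloatBuffer verts, int vertexCount) {
            GL11.glEnableClientState(GL11.GL_VERTEX_ARRAY);
            GL11.glVertexPointer(3, 0, verts); // 3 floats per vertex, tightly packed
            GL11.glDrawArrays(GL11.GL_TRIANGLES, 0, vertexCount);
            GL11.glDisableClientState(GL11.GL_VERTEX_ARRAY);
        }
    }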
Heh, I had an AMD Athlon dual core paired with a GTX 295 for a few months… ;D
Sorry, I’ve got my terminology a bit mixed up - I was using glDrawArrays for both modes. The only difference was whether the buffer was a VBO located on the graphics card or a FloatBuffer located in the PC’s RAM. I’m sure that if I did true immediate mode, with lots of calls to glBegin/glEnd, there would have been an even larger difference. It would be way too much bother to change my current test code to compare VBOs to glBegin/glEnd code, although I will update the thread when I have a new test showing PC RAM buffers vs VBOs vs display lists vs glBegin/glEnd. I’ve heard display lists are even faster than VBOs, so it would be interesting to test this out, and if I’m building display lists then I can easily test raw immediate mode too.
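Concretely, the “Retained” column was a VBO path along these lines (again just a rough LWJGL 2-style sketch, not my actual test code), while the “Immediate” column was the plain vertex-array draw from the reply above, passing the heap FloatBuffer straight to glVertexPointer:

    import java.nio.FloatBuffer;
    import org.lwjgl.opengl.GL11;
    import org.lwjgl.opengl.GL15;

    class VboPath {
        // Upload once: after this the vertex data lives in driver/GPU-side memory.
        static int createVbo(FloatBuffer verts) {
            int vbo = GL15.glGenBuffers();
            GL15.glBindBuffer(GL15.GL_ARRAY_BUFFER, vbo);
            GL15.glBufferData(GL15.GL_ARRAY_BUFFER, verts, GL15.GL_STATIC_DRAW);
            GL15.glBindBuffer(GL15.GL_ARRAY_BUFFER, 0);
            return vbo;
        }

        // Per draw: bind the VBO and point at offset 0; the vertex data is not re-sent from the PC's RAM.
        static void drawVbo(int vbo, int vertexCount) {
            GL15.glBindBuffer(GL15.GL_ARRAY_BUFFER, vbo);
            GL11.glEnableClientState(GL11.GL_VERTEX_ARRAY);
            GL11.glVertexPointer(3, GL11.GL_FLOAT, 0, 0L); // byte offset into the bound VBO
            GL11.glDrawArrays(GL11.GL_TRIANGLES, 0, vertexCount);
            GL11.glDisableClientState(GL11.GL_VERTEX_ARRAY);
            GL15.glBindBuffer(GL15.GL_ARRAY_BUFFER, 0);
        }
    }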
My PC is definitely CPU-limited - a Pentium Dual Core at 1.8 GHz matched with a GT 430. The PC might be a bit slow, but I got it for 15 euros! I had to buy the graphics card separately, and it cost three times as much as the PC.
I’m writing these tests because it is not immediately apparent whether this or that change in code results in a performance improvement. As a noob it is really easy to write something that looks OK but actually degrades performance.
Ah, okay. There’s a pretty good reason why immediate mode and the gl*Pointer functions that take a client-side Buffer object were removed in OpenGL 3.2 (or was it 3.1?). Even if you use VBOs and update them every frame, it should be faster if you use glMapBuffer(). You should try that, it’s pretty simple. It should pretty much be the fastest way of doing it. Theoretically your program shouldn’t be bottlenecked by the memory transfer, since it can happen in parallel with rendering, though such an optimal case where the copy is 100% hidden is of course rare…
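Something like this, assuming LWJGL 2’s GL15 bindings (the method name and the STREAM_DRAW usage hint are just illustrative):

    import java.nio.ByteBuffer;
    import org.lwjgl.opengl.GL15;

    class MappedUpdate {
        // Write the new vertex data straight into driver-owned memory instead of
        // filling a separate FloatBuffer and letting glBufferData copy it again.
        static void updateVbo(int vbo, float[] newVerts) {
            GL15.glBindBuffer(GL15.GL_ARRAY_BUFFER, vbo);
            // Re-specify (orphan) the store so the driver needn't wait for in-flight draws.
            GL15.glBufferData(GL15.GL_ARRAY_BUFFER, newVerts.length * 4L, GL15.GL_STREAM_DRAW);
            ByteBuffer mapped = GL15.glMapBuffer(GL15.GL_ARRAY_BUFFER, GL15.GL_WRITE_ONLY, null);
            mapped.asFloatBuffer().put(newVerts);
            GL15.glUnmapBuffer(GL15.GL_ARRAY_BUFFER);
            GL15.glBindBuffer(GL15.GL_ARRAY_BUFFER, 0);
        }
    }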
Display lists were deprecated too! VBOs should be just as fast for rendering, but I do agree that display lists have some use since you can also store state-change commands in them - though that doesn’t mean those state changes are free or should be issued more often than they would be without display lists. Try it out; the driver is usually able to optimize the data a lot for display lists. One thing to note is that texture binds are NOT stored in the display list, though this might be a driver bug (not likely though) since the OpenGL specs say they should be stored.
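If you do test it, building and calling a display list is only a few calls - a minimal sketch with LWJGL 2’s GL11 bindings, where recordGeometry() is a stand-in for whatever drawing you want the driver to capture:

    import org.lwjgl.opengl.GL11;

    class DisplayListPath {
        // Compile once: the driver records (and can heavily optimize) the calls made here.
        static int buildList() {
            int list = GL11.glGenLists(1);
            GL11.glNewList(list, GL11.GL_COMPILE);
            recordGeometry(); // stand-in for the glBegin/glEnd or vertex-array drawing to capture
            GL11.glEndList();
            return list;
        }

        // Per frame: replay the whole recorded command stream with one call.
        static void draw(int list) {
            GL11.glCallList(list);
        }

        static void recordGeometry() {
            // e.g. glBegin()/glVertex3f()/glEnd() calls go here
        }
    }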
Technically there is no such thing as “retained mode” anymore. That was a feature of IRIS GL; DirectX stopped maintaining it after its first public release (and dropped it entirely in DX10), and OpenGL never had it. Obviously, using vertex arrays blurs the distinction somewhat, so most people know what you mean, but if you use the term on a DX forum you’ll get a lot of sideways looks from people wondering why you’re using such an ancient, deprecated API.