HyperThreading vs OpenGL

Hi,

Another microbenchmark I’ve done:

class HeavyTask implements Runnable {
    public void run() {
        // a task of at least 10 ms, crunching floats and arrays
    }
}

After warming up the HotSpot server VM, I ran this task 10 times using:

  • a direct loop with a call to the run() method
  • ThreadPoolExecutor (java.util.concurrent) with 2 threads or more.
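A minimal sketch of how the two variants in the list above could be timed. The original task body was not posted, so FloatCrunch is a hypothetical stand-in for HeavyTask, and the array size, pass count, warm-up count, and pool size are all assumptions. The pooled timing includes thread start-up, because the pool creates its threads lazily on the first submits.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class HeavyTaskBench {

    // Hypothetical stand-in for HeavyTask: crunches floats and arrays.
    static class FloatCrunch implements Runnable {
        volatile float result; // stored so the JIT cannot discard the loop

        public void run() {
            float[] data = new float[4096];
            float sum = 0f;
            for (int pass = 0; pass < 200; pass++) {
                for (int i = 0; i < data.length; i++) {
                    data[i] = data[i] * 0.5f + (float) Math.sqrt(i + pass);
                    sum += data[i];
                }
            }
            result = sum;
        }
    }

    static List<FloatCrunch> newTasks(int count) {
        List<FloatCrunch> tasks = new ArrayList<>();
        for (int i = 0; i < count; i++) tasks.add(new FloatCrunch());
        return tasks;
    }

    // Variant 1: a direct loop calling run(); returns elapsed nanoseconds.
    static long timeDirect(List<? extends Runnable> tasks) {
        long start = System.nanoTime();
        for (Runnable task : tasks) task.run();
        return System.nanoTime() - start;
    }

    // Variant 2: a fixed thread pool; returns elapsed nanoseconds
    // from the first submit until every task has finished.
    static long timePooled(List<? extends Runnable> tasks, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            long start = System.nanoTime();
            List<Future<?>> pending = new ArrayList<>();
            for (Runnable task : tasks) pending.add(pool.submit(task));
            for (Future<?> f : pending) f.get(); // block until done
            return System.nanoTime() - start;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < 50; i++) { // warm up HotSpot before measuring
            timeDirect(newTasks(10));
            timePooled(newTasks(10), 2);
        }
        long direct = timeDirect(newTasks(10));
        long pooled = timePooled(newTasks(10), 2);
        System.out.printf("direct: %d us, pooled(2): %d us%n",
                direct / 1000, pooled / 1000);
    }
}
```

A single measured run like this is noisy; repeating the measurement and comparing medians would give a fairer picture.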

Results:
On my P4 with HT: the 10 HeavyTasks execute between 120% and 200% faster than the single-threaded model, depending on the type of computation.
On my old P3: from 80% to 110% of the single-threaded model.

Some thoughts:

It would be nice to tap this extra power in OpenGL applications (while respecting OpenGL’s single-threaded model), maybe by writing something like this:

ExecutorService exec = Executors.newFixedThreadPool(2);

Callable callable1 = …// a heavy non gl task that returns a result for opengl
Callable callable2 = …// another task
Future f1 = exec.submit(callable1);
Future f2 = exec.submit(callable2);

// now inject results in opengl
Object computed1 = f1.get();
someGLProcessing(computed1); // callable2 is running in parallel
Object computed2 = f2.get();
anotherGLProcessing(computed2);

We could wrap up this method in a utility package to provide compatibility with non HT/multicore processors.
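A rough sketch of what such a utility might look like. TaskRunner and its methods are hypothetical names, not an existing API. It assumes Runtime.getRuntime().availableProcessors() is a good enough test for HT/multicore hardware (it reports logical processors), and it falls back to running tasks inline on the calling thread otherwise, so the calling code stays the same in both cases.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;

// Hypothetical helper: same submit()/get() code path with or without
// a second hardware thread.
class TaskRunner {
    private final ExecutorService pool; // null means "run inline"

    TaskRunner(int hardwareThreads) {
        pool = hardwareThreads > 1
                ? Executors.newFixedThreadPool(hardwareThreads)
                : null;
    }

    // Assumption: logical processor count decides whether threads pay off.
    static TaskRunner forThisMachine() {
        return new TaskRunner(Runtime.getRuntime().availableProcessors());
    }

    <T> Future<T> submit(Callable<T> task) {
        if (pool != null) {
            return pool.submit(task);
        }
        // Single hardware thread: run now on the caller's thread,
        // so a later get() returns immediately.
        FutureTask<T> inline = new FutureTask<>(task);
        inline.run();
        return inline;
    }

    boolean isParallel() {
        return pool != null;
    }

    void shutdown() {
        if (pool != null) pool.shutdown();
    }
}
```

With this in place, the snippet above would call runner.submit(callable1) instead of exec.submit(callable1), and the GL calls would still happen only on the thread that calls get().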

Wouldn’t it be nice if we could double the speed of non-GL work? I am sure more and more time will be spent on the CPU (AI, ODE, …).

Any comments?

Lilian

Wouldn’t it be nice if 98% of all the computers out there had more than one CPU, or an OS with decent realtime scheduling too though!

Or wouldn’t it be nice if the code was CPU bound in the first place :slight_smile:

And wouldn’t it be nice if it was worth all the extra hassle of writing correct multithreaded code just to get a bit more performance for some lousy game!

Ok I’m being sarcastic now :stuck_out_tongue: But we all know that two threads are better than one if you’ve got two CPUs - the question is, is it worth it?

Cas :slight_smile:

According to the survey on the other thread, 23% of their users have HT, and dual-core Pentiums have been out since last week… and if no game uses this extra power today, what will be the norm tomorrow?

The java.util.concurrent package makes multithreading very easy, at least for scenegraph-based games (for spaghetti games, I agree with you).

I’ll try to add it to my bubble racer prototype (very spaghetti, but hopefully small) to see what it looks like on a real CPU-bound case.

I’ll drop a test case later on today. Stay tuned!

Lilian

[edited]

I won’t drop a test case; my code is too crappy to spend time on… (lots of static variables, which are not thread-happy)

I shouldn’t have started it as a 4k game :frowning:

There is at least one game engine in Java that uses multiple threads and OpenGL already…

http://www.java-gaming.org/cgi-bin/JGNetForums/YaBB.cgi?board=Announcements;action=display;num=1112491742

It shows the FPS of the two main threads, rendering and engine. The engine thread is capped at 62 FPS and does the majority of the processing. The renderer thread is free to run as fast as it can (unless vsync is forced by your video card).
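The engine's capping mechanism is not described in this thread, so the following is only a guess at one way an engine thread could be held to a fixed rate while the renderer runs freely. TickLimiter is a hypothetical class; the engine thread would call ticksDue(System.nanoTime()) in its loop and run that many logic updates, sleeping or yielding when the answer is 0.

```java
// Hypothetical fixed-rate limiter for a logic ("engine") thread.
class TickLimiter {
    private final long periodNanos; // time between two logic ticks
    private long nextTick;          // timestamp at which the next tick is due

    TickLimiter(double ticksPerSecond, long startNanos) {
        periodNanos = (long) (1_000_000_000L / ticksPerSecond);
        nextTick = startNanos + periodNanos;
    }

    // Returns how many logic ticks are due at nowNanos and advances
    // the schedule past them, so the rate never exceeds the cap.
    int ticksDue(long nowNanos) {
        int due = 0;
        while (nowNanos - nextTick >= 0) { // overflow-safe comparison
            nextTick += periodNanos;
            due++;
        }
        return due;
    }
}
```

Taking the time as a parameter instead of reading the clock inside the class keeps the limiter easy to test with made-up timestamps.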

The engine also supports running as a single thread so performance improvements can be measured. When running in the standard threaded mode, on a single CPU system (non HT), I consistently get 10-15% better performance than non-threaded.

I would love to see it run on an HT or dual processor system. My guess is it would absolutely smoke :slight_smile:

Why are games running faster on multiple threads vs a single one on a single CPU system?
The extra overhead to switch between threads and pass objects between them would definitely lower the overall framerate.

I can understand getting faster video framerate, but what about uncapped logic?
I honestly don’t see a gain on a single CPU.

Uncap the logic, benchmark it properly, and then I bet you a single CPU will be faster overall on a single thread.
Unless there is something more going on that I just can’t seem to see.

Hyperthreading is a weird old beast and not at all like having 2 CPUs… have a read of this Ars Technica article on the matter. The important thing that this article, er, articulates is that trying to predict whether it’s going to help or hinder your application is probably impossible :confused: The only guaranteed thing is that your code will become rather more complicated, and the number of actual systems with SMT or SMP boards is going to be relatively small for some time to come. And of course there’s the small issue that most of us can’t even max the CPU out anyway as they’re so fast these days; the bottleneck is still the graphics card.

Cas :slight_smile:

[quote]Why are games running faster on multiple threads vs a single one on a single CPU system?
[/quote]
The reason is that there is a great deal of bus activity when using OpenGL. Multiple threads are more efficient because you can gain some cycles while data is being pushed around.

CPUs can switch thread contexts extremely fast. So fast you would probably never see an FPS drop even if the other thread wasn’t doing anything for you.

It is benchmarked properly or the game would not run properly when threading was turned off. It’s too complicated to explain how the engine works in this message, but no additional waits are incurred in non-threaded mode. The game logic and timing run identically with threads on or off; it just runs 10-15% faster with them on.

The thing you are missing is the raw amount of IO that has to happen in a 3D engine. Textures generated at run-time (the consoles and holo-display), thousands of verts, thousands of normals and colors being manipulated, all lead to MBs of data going to the video card every frame.
The biggest thing to keep in mind as to why multithreading can be beneficial even on a single CPU system is that the graphics pipeline, no matter what amount of parallelism exists on the GPU, is still serial for state changes and submission. No amount of NIO will change that, so there are always wasted cycles with just a single thread. The more you are sending to the card, the bigger the waste becomes.

As Cas said above, the bottleneck is the video card. This is why most games are better off with a high-end video card and a mid-range processor as opposed to a mid-range video card and a high-end processor. With multiple threads you can take advantage of the lost CPU time to do something useful while waiting for the video card. The thing is, unless your engine is designed from the ground up to do this, there is no advantage to the strategy.

Developing a multithreaded engine is far more complex than a single-threaded engine. The idea is to partition the logic between rendering and things like animation, AI, physics etc. The CPU-intensive things (the latter) can be done in another thread, so when data is being serialized down to the video card, the CPU can work on that other stuff for the next frame. This is the 10-15% gain in a single processor environment that I get. In an SMP or HT computer the gains will be even bigger…substantially bigger.
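A minimal sketch of that partitioning, not the engine's actual code. simulate() and render() are hypothetical stand-ins for the CPU-side work and for GL submission, and the hand-off through a Future on a single logic thread is just one assumed way to overlap the next frame's logic with the current frame's rendering.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PipelinedLoop {

    // Hypothetical CPU-side work (animation, AI, physics) for one frame.
    static float[] simulate(int frame) {
        float[] verts = new float[1024];
        for (int i = 0; i < verts.length; i++) {
            verts[i] = frame * 0.5f + i * 0.25f;
        }
        return verts;
    }

    // Hypothetical stand-in for GL submission. Always called from the
    // render thread; returns a checksum so the output can be compared.
    static double render(float[] verts) {
        double sum = 0;
        for (float v : verts) sum += v;
        return sum;
    }

    // Render frame N on the calling thread while the logic thread
    // already prepares frame N + 1.
    static double runPipelined(int frames) {
        if (frames <= 0) return 0;
        ExecutorService logic = Executors.newSingleThreadExecutor();
        try {
            double total = 0;
            Future<float[]> next = logic.submit(() -> simulate(0));
            for (int frame = 0; frame < frames; frame++) {
                float[] current = next.get(); // wait for this frame's data
                if (frame + 1 < frames) {
                    final int upcoming = frame + 1;
                    next = logic.submit(() -> simulate(upcoming));
                }
                total += render(current);     // overlaps with simulate(upcoming)
            }
            return total;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            logic.shutdown();
        }
    }

    // Single-threaded reference: same work, no overlap.
    static double runSequential(int frames) {
        double total = 0;
        for (int frame = 0; frame < frames; frame++) {
            total += render(simulate(frame));
        }
        return total;
    }
}
```

Each frame's data is a fresh array that the logic thread never touches again after handing it over, which avoids locking in this sketch; a real engine reusing buffers would need double buffering or similar.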

The whole starting premise here is likely faulty.

Intel hyperthreading is great at switching between threads that do nothing, which is what your benchmark is.

It sucks at switching between threads doing real work because it basically dumps state at the drop of a hat.

HT is one of those over-promised/under-delivered sorts of things that shows very little value, from what we’ve seen, in the real world.

It works in the unlikely event that you have one thread crunching floats and another one crunching ints that are both using the same bit of memory :slight_smile:

Cas :slight_smile: