All these new n-core CPU systems...

Of course Java can, and will, use more than one core. Java has been running on SMP and (many, maaaany) more systems for many years. If you have two threads that do not max out CPU usage on a single core, it is downright stupid to put them on different cores or processors. Threads share memory space, which would have to be synchronized between different cores or even processors if they were to run separately. It would take up extra L1 and L2 cache space without even being able to use it successfully.

It’s true you may miss some form of control with a JVM compared to hand-coded assembly, but if you read some of the comments here :wink: you see why that might not be a bad idea. Luckily, recent versions of Sun’s VM contain all kinds of clever tricks to optimize for threading, including some that would be very hard to do transparently in C or C++ (such as thread-local memory space for the exact problem described at the top of this post). Effective threading in Java, IMHO, is a lot easier than in some other languages.

It would be nice if you could discover the actual number of cores (virtual and non-virtual) through Java, but I’m sure there will be some JNI libs that will support this.

Regarding this synchronization problem, isn’t it possible to use something like double buffering for the logic update?

So, if you have 2 threads, one doing render() as fast as possible, and 1 doing update() as fast as possible, then there is a chance that render might retrieve information from objects (to display them) that are being updated at that moment by the update() method.
Just as double buffering is used in graphics, can’t you have two buffers that contain the objects’ state, and when each update() is complete, the pointer to the buffer gets exchanged?

Of course, there is a risk that 1 render() takes longer than 2 update() calls, meaning that the buffer render() is using would be overwritten while it is still being read. Unless… render() can say that bufferA is locked, which means update() can only use bufferB. When render() is complete, it releases the lock, and update() can now change bufferA. render() then locks bufferB (which contains the newest information) and renders accordingly.

But, hm, that only solves part of the problem. When render() is complete, and wants to read from the next buffer, update() is using it. So, does render() continue to use the old buffer, or start using the incomplete buffer the update() is using?

So, does this require TRIPLE-BUFFERING? Let’s say you have BufferA, BufferB, BufferC

So that update() can lock-and-secure its own buffer, and render() can lock-and-secure its own buffer.
That means that, when update() is done with BufferA and sees that render() has BufferB locked, update() locks BufferC and releases the lock on BufferA. When render() is complete, it sees that BufferA is available and fresh, locks it, and renders from it… until BufferC becomes open.
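A minimal sketch of that three-buffer idea, assuming exactly one update thread and one render thread. Instead of explicit locks it swaps slot indices through an `AtomicInteger`, which is one common way to get the same effect; the class name `TripleBuffer`, the `FRESH` flag, and the method names are all made up for illustration. Note that update() still has to write the complete state into its buffer every time, because the buffer it gets back is stale.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical triple buffer for game state; every name here is illustrative. */
final class TripleBuffer<T> {
    private static final int FRESH = 4;  // flag bit: the spare slot holds an unread update
    private final Object[] slots;
    private int writeIndex = 0;          // used only by the update thread
    private int readIndex = 1;           // used only by the render thread
    // Index of the slot neither thread owns, optionally OR-ed with FRESH.
    private final AtomicInteger spare = new AtomicInteger(2);

    TripleBuffer(T a, T b, T c) {
        slots = new Object[] { a, b, c };
    }

    /** Update thread: the buffer it may modify freely. */
    @SuppressWarnings("unchecked")
    T writeBuffer() {
        return (T) slots[writeIndex];
    }

    /** Update thread: hand over the finished buffer and take the spare one. */
    void publish() {
        int previous = spare.getAndSet(writeIndex | FRESH);
        writeIndex = previous & 3;
    }

    /** Render thread: switch to the newest finished buffer if there is one. */
    @SuppressWarnings("unchecked")
    T readBuffer() {
        if ((spare.get() & FRESH) != 0) {
            int previous = spare.getAndSet(readIndex);
            readIndex = previous & 3;
        }
        return (T) slots[readIndex];
    }

    public static void main(String[] args) {
        TripleBuffer<int[]> state = new TripleBuffer<>(new int[1], new int[1], new int[1]);
        state.writeBuffer()[0] = 42;   // update() fills its private buffer...
        state.publish();               // ...and publishes it in one atomic swap
        System.out.println(state.readBuffer()[0]);
    }
}
```

With this layout neither thread ever waits for the other: render() keeps its old buffer until a complete newer one exists, and update() always has a free buffer to write into.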

A pointer-swap flip is only possible with graphics because the buffers get completely redrawn every time. With object state you’d have to copy the whole lot each time, which is going to be a lot of data.

Also, since you’ll effectively be rendering a frame behind, you’ll be adding latency to your game. That may or may not be acceptable.

True. I did think of this, but I have no solution in mind.

[quote]It would be nice if you could discover the actual number of cores (virtual and non-virtual) through Java, but I’m sure there will be some JNI libs that will support this.
[/quote]
In my testing on an Intel Centrino Duo:


Runtime run = Runtime.getRuntime();
System.out.println("Available Processors: " + run.availableProcessors());
--
Available Processors: 2

On endolf’s point:

[quote]Why not let the OS do this?
[/quote]
I saw quite a few posts on the Java Technology forums that said this was the way to go: change the priority of your game’s threads to be higher than normal and, if the OS sees fit, you’ll get one core per thread (e.g. 4 threads, 4 cores).
I’d like to assume it’ll work this way.
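For reference, a small sketch of what raising a thread’s priority looks like. `Thread.setPriority` is standard Java, but it is only a hint to the scheduler; standard Java has no API for pinning a thread to a core, so whether you really end up with one core per thread stays up to the OS. The class and method names (`PriorityDemo`, `startGameThread`) are hypothetical.

```java
final class PriorityDemo {
    /** Starts a worker slightly above normal priority. The OS may ignore the hint. */
    static Thread startGameThread(String name, Runnable work) {
        Thread t = new Thread(work, name);
        t.setPriority(Thread.NORM_PRIORITY + 1); // valid range is MIN_PRIORITY..MAX_PRIORITY
        t.start();
        return t;
    }

    public static void main(String[] args) {
        startGameThread("update", () -> System.out.println("update running"));
        startGameThread("render", () -> System.out.println("render running"));
    }
}
```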

In some far distant land called Utopia, if you had 4 cores, it would be nice to dedicate 3 cores to 3 threads of your game, and leave the 1 remaining core for whatever other applications need to compute.

core 1 - update
core 2 - render
core 3 - networking/keyinput/physics
core 4 - anything else

That is, instead of creating 3 threads and hoping they get run on multiple cores, rather than on just 1 core that context-switches between them, or on 2 cores. This is, IMO, too big a question to just ignore. I admit I don’t know much about this and haven’t done any testing, so I cannot claim or state anything. I’m just worried about the unknown factor. I don’t know!

Different OSes have different algorithms for prioritizing threads and context-switching between them. I wonder how the Java VM gets around this, if it does.

My advice is to try and do everything inside some sort of Runnable class; then have N worker threads that work on those runnables until the queue is flushed, at which point you render.
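Roughly, that could look like the sketch below. It assumes `java.util.concurrent` (Java 5 and later) and uses a fixed pool sized by `availableProcessors()`; the class name `FrameWorkers` and its methods are invented for the example, not taken from any engine.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical per-frame worker pool: N threads drain a batch of runnables. */
final class FrameWorkers {
    private final ExecutorService pool =
            Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());

    /** Runs every task on the workers and returns only when all of them are done. */
    void runAll(List<Runnable> tasks) {
        List<Callable<Object>> jobs = new ArrayList<>();
        for (Runnable r : tasks) {
            jobs.add(Executors.callable(r));
        }
        try {
            pool.invokeAll(jobs); // blocks until this frame's queue is flushed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt visible to the caller
        }
    }

    void shutdown() {
        pool.shutdown();
    }

    public static void main(String[] args) {
        FrameWorkers workers = new FrameWorkers();
        List<Runnable> frame = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            final int id = i;
            frame.add(() -> System.out.println("task " + id + " done"));
        }
        workers.runAll(frame);   // all logic for the frame finishes here...
        System.out.println("render"); // ...so rendering sees a consistent state
        workers.shutdown();
    }
}
```

Exceptions thrown inside a task end up in the futures that `invokeAll` returns; this sketch ignores them, which real code probably should not.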

Also compartmentalise! And then compartmentalise some more… and multithread those damn for loops…

if you have a classic:


for (int i = 0, n = someArray.length; i < n; i++) {
  MyObject obj = someArray[i];
  ...
}

Do it like so:


for (int i = 0, n = someArray.length/2; i < n; i++) {
  MyObject obj = someArray[i];
  ...
}
for (int i = someArray.length/2, n = someArray.length; i < n; i++) {
  MyObject obj = someArray[i];
  ...
}

And put those two for loops in different runnables and send them down that nice queue of yours… Some useful situations:

a) Silhouette detection in shadow volumes
b) Broad phase collision detection
c) CPU based skeletal animation on async models

and the list can go on and on; the most important thing to remember is to minimise the interactions between those runnables. For example, pathfinding would probably be a bad choice to multithread since some paths might need to go around another object if you take time and velocity of movement into consideration when calculating your paths (i.e. future collisions down the path).
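A hedged sketch of the split-loop idea, generalised from two halves to any number of parts. The per-element work (`process`, which just doubles each value) is a stand-in for whatever your loop body does, and creating a pool per call is only for brevity; in a game you would reuse one pool. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class SplitLoop {
    /** Stand-in loop body: touches only indices in [from, to), so ranges never interact. */
    static void process(double[] data, int from, int to) {
        for (int i = from; i < to; i++) {
            data[i] *= 2.0;
        }
    }

    /** Cuts the index range into 'parts' slices and runs each slice as its own task. */
    static void processInParallel(double[] data, int parts) {
        ExecutorService pool = Executors.newFixedThreadPool(parts);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (int p = 0; p < parts; p++) {
                final int from = (int) ((long) data.length * p / parts);
                final int to = (int) ((long) data.length * (p + 1) / parts);
                pending.add(pool.submit(() -> process(data, from, to)));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for every slice before returning
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        double[] data = { 1, 2, 3, 4, 5 };
        processInParallel(data, 2);
        System.out.println(java.util.Arrays.toString(data));
    }
}
```

This only works because each slice writes to its own elements; as soon as one slice needs results from another (the pathfinding case above), the slices have to be synchronised and most of the gain disappears.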

That’s what I do anyways :slight_smile:

DP

The garbage collector is more complex than you’re picturing it to be.

http://java.sun.com/performance/reference/whitepapers/5.0_performance.html#4.1.2
http://java.sun.com/docs/hotspot/gc5.0/gc_tuning_5.html
I couldn’t find it in 1…2…3, but I thought the GC was altogether pluggable these days; I could be confusing it with other stuff.
http://www.nljug.org/pages/events/content/jspring_2006/sessions/00021/
was a session I attended earlier this year. It wasn’t really related to why I was there, but it was interesting to watch. Real-time and games aren’t really one-to-one, but glitches from the GC doing its job are something people want to avoid in both real-time apps and games.

that was kinda (really) offtopic.

anyways,
I don’t know what extra use knowing the number of logical processors would be. The mechanism that optimises for those x cores could probably be made generic by adding some metadata to your thread, if such an optimisation would be effective at all (that and some other ifs).

What is the point of having more than 2 cores?

Having 2 cores enables you to have all your background stuff running on one core and whatever your main application is on the other. That’s useful. Having more than that seems like a waste. What if I’m running one really computation-intensive thread? Then all the cores but 1 are wasted.

I would much rather have a 4 GHz processor than a processor with four 1 GHz cores. A processor with two 2 GHz cores would be all right though.

Apparently, the Intel folks don’t agree with you :slight_smile: They’re betting all their money on multicore systems. Unicore systems are, as of the year 2006, apparently a thing of the past. It’s a lot easier to sell customers a machine that is multicore than one that is unicore: “You’re getting 4x the product compared to that 1x old product.”
Besides, 4 GHz seems to be the top speed they’re able to get the CPUs to run at.

I guess we programmers need to change. We need to learn to analyze our problems in more ways now, so that we can split them into more independent units that can be run in parallel.

Does Java not provide any method to retrieve some information about the computer’s hardware, like video card, CPU load, etc.?

Dedicating a core to a single type of work is basically bad design, and I am pretty sure that Alan Wake doesn’t do that either, despite the presentation. They’re more likely dedicating a thread and letting the OS decide which core it runs on.

The only reason to let a thread run on only one core (setting affinity) is that you want to avoid core switches, which might incur a (relatively) high cost. However, scheduling a thread on 1 core might mean that some of the integer or floating-point operations stall because the single core isn’t fast enough.

If I ever get near maxing out a CPU with game logic I’ll be sure to investigate the horrible complexities of multicore game programming :slight_smile: Until then it’s all just pipe dreams!

Cas :slight_smile:

GHz, or better, frequency: as frequency goes up, power consumption goes up much faster than linearly. Multicore allows for more performance while keeping power consumption modest. Now you might say “I don’t care too much about my power bill”, but the heat that comes with it also has to be moved, hence the huge fans you see today. Of course, performance-wise there is also the whole how-much-gets-done-in-a-clock-cycle bit.

Multicore scales better than GHz.

I don’t really deal with hardware, so this is all second-hand, but I think it’s pretty accurate.

Say you’re doing collision detection (not collision response) in really complex, high-tri-count scenes. Why limit yourself to 1 thread handling the game logic when you have 4 available? There is so much to gain, with so little effort, that it would be a waste not to take advantage of it.

We’d all rather have 64 GHz cores, just like we’d rather have flying cars and holographic hard disks, but hey, it’s not going to happen anytime soon, so let’s enjoy the 32x 2 GHz, AI-controlled cars and hybrid hard disks for a while :wink:

Basically any Java program will be able to run faster on a multi-core processor. (Whether it actually will is a question of how well the JVM is implemented to take advantage of the CPU cores.) This is because concurrency is inherent in Java. Some of Java’s apparent slowness is because it makes use of concurrency that hasn’t been well executed on a single-core processor. But once multi-core is in place the obstacle is removed and many programs will run much faster.

So in summary multicore is good for Java. It’s good even if you don’t do anything, but it’s even better if you adapt for concurrency. Any programmer interested in this should buy a copy of Java Concurrency in Practice by Brian Goetz and others. It’s already a classic.

Ok, I agree, maybe it’s time to read a book on this subject before babbling more.

Just because you can run something broken up across multiple threads doesn’t always mean it’s a good idea, or even faster. In fact, a lot of basic things are a lot slower when broken out to be parallel, and a lot of parallel things can run slower than a plain linear approach at times; sorts are famous for this.

Yes, in general multiple cores are great for Java, and yes, Java supports them pretty well. But just be careful with your design and learn how to deal with multithreading & concurrency before going nuts trying to use it all over the place. Bad things can happen. :slight_smile:

Why would sorting the same data on 2 threads be bad? First you do a rough sort, dividing your dataset in low/high, and then process the 2 segments on 2 threads.

Or you could use merge sort, which is basically the same idea and can be massively parallel.

Unfortunately for most uses the overhead of thread creation and context switching is likely to eat up any gain you’d get (and possibly more).
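To make both sides of this concrete, here is a sketch of a parallel merge sort that tries to limit that overhead: it runs on a shared thread pool rather than creating threads, and falls back to a plain `Arrays.sort` below a cutoff. It uses the fork/join framework, which only arrived in Java 7, so treat it as an illustration rather than something available when this thread was written. The class name and the `THRESHOLD` value are assumptions; the right cutoff has to be measured.

```java
import java.util.Arrays;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

/** Hypothetical parallel merge sort over the half-open range [lo, hi). */
final class ParallelMergeSort extends RecursiveAction {
    private static final int THRESHOLD = 8_192; // assumed cutoff, tune by measuring
    private final int[] a;
    private final int[] tmp; // scratch space; tasks only touch their own range
    private final int lo;
    private final int hi;

    ParallelMergeSort(int[] a, int[] tmp, int lo, int hi) {
        this.a = a;
        this.tmp = tmp;
        this.lo = lo;
        this.hi = hi;
    }

    @Override
    protected void compute() {
        if (hi - lo <= THRESHOLD) {
            Arrays.sort(a, lo, hi); // small ranges: sequential sort, no task overhead
            return;
        }
        int mid = (lo + hi) >>> 1;
        // Sort both halves as separate tasks and wait for both to finish.
        invokeAll(new ParallelMergeSort(a, tmp, lo, mid),
                  new ParallelMergeSort(a, tmp, mid, hi));
        // Merge the two sorted halves back into a[lo..hi).
        System.arraycopy(a, lo, tmp, lo, hi - lo);
        int i = lo, j = mid, k = lo;
        while (i < mid && j < hi) {
            a[k++] = (tmp[i] <= tmp[j]) ? tmp[i++] : tmp[j++];
        }
        while (i < mid) a[k++] = tmp[i++];
        while (j < hi) a[k++] = tmp[j++];
    }

    static void sort(int[] a) {
        ForkJoinPool.commonPool().invoke(new ParallelMergeSort(a, new int[a.length], 0, a.length));
    }

    public static void main(String[] args) {
        int[] data = { 5, 3, 9, 1, 7 };
        sort(data);
        System.out.println(Arrays.toString(data));
    }
}
```

Since Java 8 the standard library also offers `Arrays.parallelSort`, which is usually the simpler choice. Whether either version beats a sequential sort for your data sizes is exactly the question raised above, so measure before committing.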

Tim Sweeney of Unreal fame did a pretty interesting presentation about the whole multithreaded game engine issue, separating the engine into sections of increasing inherent parallelism (Simulation, Numeric computation (e.g. physics), and Shading (rendering)) and suggesting how to do each in parallel with reasonable speedup. I was especially intrigued by the atomic threading primitive, which would replace regular and hard-to-get-right locking primitives in game logic code.

  • elias