Multi-threading and Collision Detection

philfrei · March 9, 2012, 4:41am

I’ve been learning about concurrency and Actors, and just got a test program working.

http://www.hexara.com/pond.html

http://www.hexara.com/Images/LotsOfThreads.JPG

On this program, there are 576 threads (one for each of the squares) running independently. Each with the following basic structure:

While(true)
{  
    
   move();
   Thread.sleep(sleepAmt);
}

I made the sleep amounts different for the different threads. Instead of a game loop, there is a timer() that triggers a redraw every 25 msec, taking a “snapshot” of the current state.

I’m using an array that matches the display, one element per pixel, for collision detection. I’ve never done collision detection this way before, by locking and consulting elements that correspond to the pixels. (These squares are 8x8 pixels, so there are 64 “drips” in the “pond” that are locked and inspected with each move.) But it seems to make a certain philosophical sense. Space itself is the medium by which we determine what is close and what we can occupy or not.

This structure can also be thought of as a sort of “message passing” medium for the “Actors”. When they probe a position, they can find out whether and what other Actor is there, and response behavior can be defined for the situation. For example, in the case of collision, compute a bounce. Or maybe reds can eat oranges or blob together with greens. (One could certainly animate sprites in this context.)

I’m used to reading about things like QuadTrees or Bins. In those cases, one is consults and checks against lists of other objects. Here, one is consulting the space itself. Is this sort of collision detection done in games? (I’m okay with recreating wheels, as a learning experience.) Examples?

I’m also intrigued about the possibilities for designing games or controls over hundreds of independently threaded objects, and am wondering if there are other games to look at as models. I don’t know. It just seemed kind of cool and I wanted to show it off and see what others thought.

StumpyStrust · March 9, 2012, 4:48am

576 threads? that just sounds bad. From what I understand, one thread per core is best. 576 means lots of context switches which eats at performance.

Roquen · March 9, 2012, 7:13am

For actors you need to use pico-threads…not real OS threads.

StumpyStrust · March 9, 2012, 8:11am

very interesting…just looked up pico-threads and well…

Roquen · March 9, 2012, 8:38am

Better yet look at one or both of these frameworks:

http://www.malhar.net/sriram/kilim/
http://akka.io/

For real threads: I use the rough rule of thumb that I want somewhere between 1.5-2.5 logically active threads per core. By logically active I mean that within some time window that number threads are attempting to run. If you only have one per core, then if it gets stalled then the scheduler has nothing to do and that core sits idle until the stall is resolved.

theagentd · March 9, 2012, 8:49am

Hyperthreading improves this even further. It’s a myth that hyperthreading doesn’t help games at all, since properly multithreaded games can sometime benefit more from Hyperthreading than from additional cores in memory bottlenecked situations. This is true for my multithreaded CPU particle simulation as well as for DICE’s newer Battlefield games.

As Roquen said, it’s better to have more than one thread per core, and I can squeeze slightly more performance out of my particle simulation if I set it to 8 threads instead of 4 (I have a dual core CPU with HT, so 4 logical cores).

philfrei · March 9, 2012, 7:53pm

I’d be interested in seeing the links, @theagentd!

Thanks to all for the various references.

I have to admit, while the discussion about optimum threads per core has its benefits, I’m not particularly interested at this stage. I’m embarking on learning about how to effectively use “Actors” or other concurrency-related patterns, and am more interested in what might be derived from them. Things like particle-systems, or molecules glomming together or splitting, or ant/bee colonies, or crystallization modelling–these are some of the things that come to mind. And I wonder about how some of this might be put to game use. Maybe that will lead to something fresh and new, as opposed to remaking [fill in the blank].

Having hundreds more threads than cores is not going to be optimal, but neither is a single game loop thread. It seems interesting to me to get out from under the “tyranny” of game loops, and look at other possible forms of organization.

Running AI opponents seems like another fruitful avenue to explore. I’m just exploring.

Having independent threads seems to demand having a “different” sort of basis for collision detection. Traditional 'all-at-once" use of bins/quadtrees/etc. doesn’t seem as effective. Am I wrong? I’d love to hear about it.

sproingie · March 9, 2012, 9:32pm

Actors communicate by exchanging messages with each other in a system where each Actor has its own private message queue. If Thread A talking to thread B will erase the message of thread C talking to thread D, then it’s not an actor system. It would be interesting to see how well Akka STM would scale up for the shared map though.

As for actor performance vs “omniscient” spatial tree algorithms, I suspect the tree is going to win every time. It’s a wonderful organizational tool, but there’s always going to be a lot more overhead than an algorithm that understands the entire system at once. This isn’t to say you couldn’t take a hybrid approach, however and use both.

theagentd · March 10, 2012, 11:21am

That’s more about particle stuff, but it does contain some screenshots and lots of info… xd

Thread to thread communication is inherently expensive, so it’s a very bad idea to have lots of it, ESPECIALLY if the amount of synchronization increases with the amount of processing needed. Having one synchronization per object is an incredibly bad idea, since as the number of objects increases parallelism decreases, which is the exact opposite of what it should be. I’ve found that the simplest way of doing things is to split up the game into individual tasks. With some dependency information you can easily find out which tasks can be run in parallel to each other and simply run things that don’t affect each other on different threads. This all sounds easier than it is of course. It’s very easy to end up with one huge task and a few small tasks, which obviously runs single-threaded 99% of the time. Due to this it’s therefore a good idea to split up large tasks into smaller independent tasks. This sounds harder than it is, but the thing is that large tasks are usually large because they do the same procedure repeatedly on several objects, for example updating objects, updating particles, e.t.c. These tasks are VERY easy to split up as long as you make the update functions order-independent and thread safe, but this is how it should be in the first place! It doesn’t matter which particle I move first since they only collide with static geometry. For more advanced things like collision you can split it up into two passes. The first pass checks for collisions between objects and calculates a force to apply to the object. The second pass applies this force to each object. Both passes can be split up into smaller tasks.

For the game I’m making now (an RTS) I’ve prepared everything for multithreading. The heaviest parts of my game are visibility testing (a ridiculous amount of 2D ray-casting) and collision detection / “physics”. Visibility testing just queries a grid for the objects in the vicinity of the unit and ray-casts to them to check for obstacles, and then adds the unit to a list of visible objects from the perspective of that specific unit. This can be done independent of the order of testing, and is inherently thread safe since it doesn’t write to any shared resources. It’s therefore extremely simple to just update every other unit in 2 or more threads.

Physics is a lot more tricky to get working, but my game only has very simple physics, so that helps a bit. In the first pass I query the immediate vicinity of each unit for other objects and then check for collisions with these objects. If a collision is found a force is calculated and added to the total force that will be applied to the unit. Note that the force isn’t actually applied to the unit yet. This is done in a second pass, where the force is applied and then the unit is updated based on velocity. I then needed a third pass to update the grid with the new unit positions. This was actually done single-threaded since the order of units per grid tile needs to be deterministic, but I try to keep all threads busy with other tasks during this pass, which also happens to be very fast (something like 1000 variable comparisons with something like 1% actually needing to be updated).

In the end I made a small library for this kind of parallelism. It can be found here: http://www.java-gaming.org/topics/small-java-threading-library/25143/view.html If you have any questions about it just ask! =S

DrZoidberg · March 13, 2012, 12:26am

An actor system is actually similar to a game loop with a few differences. The actors are automatically moved to one of several worker threads. Actors are always designed to be completely thread safe. So they don’t modify any global state. And they don’t modify the state of other actors directly. This is also called “side effect free”. If all your methods are side effect free, then your code is automatically thread safe and that allows the actor scheduler to move your actors to whatever thread it wants. You could look into functional programming to learn more about side effect free multithreading. Functional programming and actors go well together.
http://akka.io/docs/akka/snapshot/index.html

philfrei · March 13, 2012, 10:01pm

@DrZoidberg - AKKA looks very interesting. Thanks for the link to the documentation. This link might work better than the one posted in the email above which failed for me:
http://doc.akka.io/docs/akka/snapshot/

I am working to make the individual threads “thread-safe”. Interesting, when trying to add things like “bouncing” or “temperature”. I have clumsy implementations of each now. Maybe will make a “beehive” and “flower bed” and a couple “nose holes” to check the smell of each based on Actor state. Also want to try a “fireflies” version, and eventually work out a way for them to sense each other and get into synch.

Mostly, though, this is to kind of make it real–reading about concurrency is really difficult in the abstract. I really have to have a hands-on project in order to understand Doug Lea’s or Brian Goetz’s books.

A bunch of Actors messaging into a “Disruptor” is another idea I’m thinking of trying to understand better, too, rather than contending (using synchronized) for the array that maps to the display area.

ra4king · March 13, 2012, 10:38pm

I can confirm this here as well, I’ve got an i7 2600k (hyperthreaded quadcore) and your multi-threaded particle test ran much faster with 8 threads rather than 4. It even ran faster with 16 threads!