Floating Point Determinism in Java

So I just discovered yesterday that JBox2d turns out to be wildly non-deterministic: if you run the same physics simulation 10 times in a row (whether or not you restart the entire program between runs), you can very easily end up with 10 completely different results, even though locally all the physics looks perfectly normal. This seems to show up more often with very complex simulations, which is not terribly surprising - chaos and all that, meaning that small changes get magnified more in more complex systems.

Now, this was surprising to me, to say the least, because every line of code in the engine seems like it should be deterministic as long as you’re on the same machine with the same setup - there’s not a Math.random() call in the entire thing (apart from testing code, at least), and I’ve even tested with everything set so that all Math.* calls are handled through lookup tables rather than the Math library. Every other operation is straightforward pure Java: almost no library calls, just plain old arithmetic. Every container iteration (to the extent that we use containers at all, which is very little; most things are arrays) is indexed, so there should be no question about iteration order changing. Single-threaded operation, of course.

I’m aware that you can never rely on floating point consistency across processors or OSes, but differences between runs on the same machine surprised me. The results keep changing even after substantial warmup, once all method compilations have stopped.

Any ideas what might be going on? Has anyone ever noticed any non-obvious non-deterministic behavior in Java? I realize I could probably tag everything with strictfp and ensure floating point consistency, but it seems like I must be missing something: if no methods are being recompiled and I’m initializing everything the same way, then running the same simulation, how can I possibly be getting different floating point results?

Is it possible that the compiled code the JVM is producing actually leaves some operation orderings up to the CPU or something like that? In C++, typically you can at least rely on the same program to produce the same output on the same computer, but maybe the JVM is being more clever in some way?

I would suspect that the “problem” lies in non-deterministic delta times in the simulation stepping. System.currentTimeMillis() “hops around” a bit…

I don’t use JBox2D, but there has to be some sort of numerical integration going on, e.g. Euler integration. As cylab mentioned, fix the time step, or record the deltas and play them back the second time if you want the same result.
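In case it helps, here’s a minimal fixed-timestep loop sketch; World and step() are just stand-ins for whatever the engine exposes, and the 60 Hz step is an arbitrary choice:

```java
// Minimal fixed-timestep game loop sketch.
public class FixedStepLoop {
    static final float DT = 1f / 60f;   // one fixed physics step (60 Hz)

    public static void main(String[] args) {
        World world = new World();
        long previous = System.nanoTime();
        double accumulator = 0.0;

        while (true) {
            long now = System.nanoTime();
            accumulator += (now - previous) / 1e9;  // seconds of real time elapsed
            previous = now;

            // Consume real time in fixed-size chunks so the physics always
            // advances by the same delta, regardless of frame timing.
            while (accumulator >= DT) {
                world.step(DT);
                accumulator -= DT;
            }
            // ...render (optionally with interpolation) and/or sleep here...
        }
    }

    // Stand-in for the physics world; jbox2d's own API differs.
    static class World {
        void step(float dt) { /* integrate bodies forward by dt */ }
    }
}
```

The key point is that the simulation only ever advances in identical fixed-size steps, so frame timing jitter never feeds into the physics.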

I’ve been bitten with unexpectedly-different results from what (I thought) should have been deterministic code before - it turned out to be down to iterating over the contents of a hashmap. I hadn’t overridden hashCode(), so the order of iteration was determined by the object’s location in memory, which obviously changes from run to run. Using a LinkedHashMap enforced the iteration order and everything was rosy.
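For anyone who hasn’t run into this, a small sketch of the effect (class names are made up): keys without a hashCode() override fall back to identity hash codes, which generally differ between runs, so a plain HashMap’s iteration order can change while a LinkedHashMap always iterates in insertion order.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class IterationOrderDemo {
    // No equals()/hashCode() override, so identity hash codes are used.
    static class Body { }

    public static void main(String[] args) {
        Map<Body, Integer> hashed = new HashMap<>();
        Map<Body, Integer> linked = new LinkedHashMap<>();
        for (int i = 0; i < 5; i++) {
            Body b = new Body();
            hashed.put(b, i);
            linked.put(b, i);
        }
        System.out.println("HashMap order:       " + hashed.values()); // may differ per run
        System.out.println("LinkedHashMap order: " + linked.values()); // always [0, 1, 2, 3, 4]
    }
}
```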

It doesn’t sound like this would apply to you though. If you’re already using a fixed timestep then there’s something very strange going on…

Not really related to non-determinism, but I found that using look-up tables for Math operations (e.g. sin, cos, tan) can throw the numbers out by a huge amount, even though the errors in the calculations start out very small.

Also, doesn’t the engine use slop factors in some calculations? Maybe they’re random?

Table-based methods are very, very bad for many simulation techniques, which frequently depend on the properties of the function(s) in question. As an example, sinusoids are assumed to be continuous everywhere, while a table-based version is discontinuous at each grid position. Also note that table-based methods have huge relative error to begin with.
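To make that concrete, here’s a throwaway nearest-entry lookup table compared against Math.sin; the table size and indexing scheme are arbitrary and not taken from any particular engine:

```java
// Tiny lookup-table sine vs Math.sin: the table is piecewise constant, so it
// is discontinuous at every grid position, and the relative error near the
// zeros of sin(x) is enormous even though the absolute error stays small.
public class TableSinError {
    static final int SIZE = 4096;
    static final float[] TABLE = new float[SIZE];
    static {
        for (int i = 0; i < SIZE; i++) {
            TABLE[i] = (float) Math.sin(i * 2.0 * Math.PI / SIZE);
        }
    }

    static float tableSin(double x) {
        // Assumes x in [0, 2*PI); masking just guards the edge case.
        int index = (int) (x / (2.0 * Math.PI) * SIZE) & (SIZE - 1);
        return TABLE[index];
    }

    public static void main(String[] args) {
        double worstAbs = 0, worstRel = 0;
        for (double x = 0; x < 2 * Math.PI; x += 1e-4) {
            double exact = Math.sin(x);
            double approx = tableSin(x);
            worstAbs = Math.max(worstAbs, Math.abs(approx - exact));
            if (Math.abs(exact) > 1e-12) {
                worstRel = Math.max(worstRel, Math.abs((approx - exact) / exact));
            }
        }
        System.out.printf("worst absolute error: %g, worst relative error: %g%n",
                          worstAbs, worstRel);
    }
}
```

The absolute error stays around 2π/SIZE, but the relative error blows up near the zeros of the function, which matters for any code that depends on the function’s behavior near zero.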

See my old thread:

http://www.java-gaming.org/index.php/topic,18278.0.html

Doing exactly the same thing a few times can result in different values (or my CPU is b0rked).

Riven, is it still like that? I didn’t want to bump an old thread, but on all my systems I get exactly the same answer - bit perfect. I am on 64-bit Linux and I tried with JVMs from 1.5 and up (including 1.7).

At first it didn’t, because I didn’t use a new instance of the Runnable for each thread… but that was it.

Couldn’t reproduce on my brand new work PC. Will try @ home later.

Nope, it uses fixed time steps (always recommended for physics, FWIW). Also, no hash-map iteration in sight; I’ve been bitten by that one before, but the one hash map we use is manually managed and completely deterministic.

The slop factors should always be deterministic; they don’t randomize at all, what they do is allow for overlap tolerances.

Tried Riven’s code, seems to do fine on my computer.

Yes, I usually don’t like using lookups because the speed improvements are not great anyway; as it turns out, trig functions aren’t that important in the physics engine - they’re used so infrequently that it’s kind of pointless to optimize them.

Still, though, that shouldn’t be causing non-determinism (also, I’ve tried using Math.* instead, and the results are the same).

I’m going to try to dig into this deeper, see if I can track down the first place that things go wrong. Maybe I’m wrong, and I actually do have some indeterminate code somewhere in there that I just don’t remember…

Try running in interpreted mode (-Xint). If you still see the same bad behavior, it’ll help eliminate a few possibilities: mixed-mode execution (interpreted vs. compiled code rounding intermediates differently) would be ruled out, and the lack of strictfp becomes a very unlikely culprit.

I remember this guy having set up a deterministic jbox2d simulation:
http://zetsgames.com/deterministic-physics/

Basically he added strictfp to all classes.
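For reference, a minimal sketch of what that looks like; StrictVec is a made-up class, not jbox2d’s actual code:

```java
// strictfp on the class makes every float/double expression in the class
// (including nested types and initializers) FP-strict: each operation is
// rounded exactly per IEEE 754, with no extended-precision intermediates
// the JIT might otherwise keep in registers.
public strictfp class StrictVec {
    public float x, y;

    public StrictVec(float x, float y) {
        this.x = x;
        this.y = y;
    }

    // Guaranteed to produce bit-identical results on every compliant JVM.
    public float lengthSquared() {
        return x * x + y * y;
    }

    public static void main(String[] args) {
        System.out.println(new StrictVec(3f, 4f).lengthSquared()); // 25.0 everywhere
    }
}
```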

strictfp should be required. But traditionally Java has defaulted to strict FP semantics despite the performance penalty. IIRC, on x86 CPUs the internal x87 FP accumulator/registers are quite a bit bigger than 64 bits (or is that only for intermediate results in the FPU?). This changes the rounding behavior from IEEE 754 quite a bit, but most importantly, if a result is “pushed” back to RAM and reloaded, it is then rounded like a 64-bit IEEE number. Thus the JIT could really change things here.

One thing I never did work out is what happens at a task switch.

On newer processors you also have things like fused multiply-add (FMA) instructions.
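A minimal illustration of why FMA matters for reproducibility (Math.fma needs Java 9+): Java semantics require two separate roundings for a * b + c, while a fused multiply-add rounds once, and the two answers can differ.

```java
// a*b rounds to exactly 1.0 in double, so a*b + c gives 0.0, while
// Math.fma(a, b, c) keeps the exact product and rounds only once.
// A compiler or hardware path that silently fused the multiply and add
// would therefore produce different results from one that doesn't.
public class FmaDemo {
    public static void main(String[] args) {
        double a = 0.1, b = 10.0, c = -1.0;
        System.out.println(a * b + c);          // prints 0.0 (two roundings)
        System.out.println(Math.fma(a, b, c));  // prints 5.551115123125783E-17 (one rounding)
    }
}
```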

A context switch (re)stores the full register set state (see FXSAVE and FXRSTOR).

I’m pretty sure that the JRE has always set x87 to 64-bit (double-precision) mode, so you should never hit the 80-bit problem, and doubles would effectively always be strictfp (not so for floats). This is pretty much moot on SSE processors (except for the noted FMA).

I understand these points, but what I don’t really understand is: can any of this behavior change without recompiling the code? I would have thought that decisions like what’s kept in registers and what gets spilled to RAM are made at the assembly level, not lower, right?

As suspected, it sounds like strictfp probably is the answer; based on thijs’s link it has actually worked with JBox2d before, which I didn’t know anyone had tried. The interesting bit here is that if this works, it’s actually a lot easier to get cross-platform consistency in Java than in any other language (including C++, where it’s a whole lot of work, and AS3, where as far as I can tell it’s absolutely impossible without fixed-point math), thanks to that keyword. I may try it out just to see for sure that it works - the question comes up a lot (though not enough to use strictfp in the library by default).

You’d have to add this modifier to every field and method… you’ll probably be better off with an ASM ClassAdapter to produce a new JAR.
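Something along these lines might do it; this is only a sketch against the ASM 9 API (class and argument names made up), setting on every method the ACC_STRICT flag that the strictfp keyword compiles down to:

```java
import org.objectweb.asm.ClassReader;
import org.objectweb.asm.ClassVisitor;
import org.objectweb.asm.ClassWriter;
import org.objectweb.asm.MethodVisitor;
import org.objectweb.asm.Opcodes;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Rewrites one .class file so every method carries ACC_STRICT
// (in the class file format the flag lives on methods, not classes).
public class StrictfpInjector {
    public static void main(String[] args) throws IOException {
        ClassReader reader = new ClassReader(Files.readAllBytes(Paths.get(args[0])));
        ClassWriter writer = new ClassWriter(reader, 0);

        ClassVisitor addStrict = new ClassVisitor(Opcodes.ASM9, writer) {
            @Override
            public MethodVisitor visitMethod(int access, String name, String desc,
                                             String signature, String[] exceptions) {
                // OR in the strictfp flag and pass everything else through unchanged.
                return super.visitMethod(access | Opcodes.ACC_STRICT, name, desc,
                                         signature, exceptions);
            }
        };

        reader.accept(addStrict, 0);
        Files.write(Paths.get(args[1]), writer.toByteArray());
    }
}
```

Run something like this over every class in the jar and repackage it, and the bytecode behaves as if everything had been declared strictfp.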

Yeah. That’s why I’m suggesting running in interpreted mode first to verify that the problem ‘might’ be intermediate value related.

As I understand it, you should only have to add strictfp to the class, not the fields or methods etc…

Yes, I thought so too.

It would be cool if jbox2d supported deterministic simulation as an option, e.g. as Riven suggests, through a bytecode manipulation framework like ASM (http://asm.ow2.org/).
Please keep us updated on your findings… I’m particularly interested as I’ll be having a go at a p2p networked physics simulation in the near future.