[JOODE] Contribution

Since everybody seems to be adding little additions to JOODE, here’s mine:


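public static final float inverseSqrt( float value ) {
	float xhalf = 0.5f * value;
	// reinterpret the float's IEEE 754 bits as an int
	int i = Float.floatToIntBits( value );
	// initial guess: magic constant minus half the bit pattern
	i = 0x5f375a86 - ( i >> 1 );
	value = Float.intBitsToFloat( i );
	// one Newton-Raphson step to refine the guess
	value = value * ( 1.5f - xhalf * value * value );
	return value;
}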

That code, as you probably know, is from the Quake3 source code. However, my implementation uses a different initial guess, which is slightly more accurate after only one iteration. Use it instead of 1/Math.sqrt(…) in Real.normalize() and Vector3.normalize().
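For anyone wondering what the swap looks like in a normalize() method, here is a minimal sketch. Vec3Sketch is a made-up stand-in, since JOODE's actual Real and Vector3 classes are not shown in this thread, and the zero-length check is my own assumption about what you would want there.

```java
// Hypothetical stand-in for a vector class; not JOODE's real Vector3.
final class Vec3Sketch {
    float x, y, z;

    Vec3Sketch(float x, float y, float z) {
        this.x = x;
        this.y = y;
        this.z = z;
    }

    // Same routine as posted above: bit-level initial guess, then one Newton-Raphson step.
    static float inverseSqrt(float value) {
        float xhalf = 0.5f * value;
        int i = Float.floatToIntBits(value);
        i = 0x5f375a86 - (i >> 1);
        value = Float.intBitsToFloat(i);
        return value * (1.5f - xhalf * value * value);
    }

    // Scales the vector to approximately unit length; a zero vector is left untouched.
    void normalize() {
        float lengthSquared = x * x + y * y + z * z;
        if (lengthSquared == 0f) {
            return;
        }
        float inv = inverseSqrt(lengthSquared); // replaces 1f / (float) Math.sqrt(lengthSquared)
        x *= inv;
        y *= inv;
        z *= inv;
    }

    // Exact length, used here only to check the result.
    float length() {
        return (float) Math.sqrt(x * x + y * y + z * z);
    }

    public static void main(String[] args) {
        Vec3Sketch v = new Vec3Sketch(3f, 4f, 12f);
        v.normalize();
        System.out.println("length after normalize: " + v.length());
    }
}
```

The result is only approximately unit length, so code that compares lengths against exactly 1 would need a tolerance.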

Edit: forgot to say, this only works on x86.

Enjoy :slight_smile:

What’s the advantage? I guess it’s faster, right?

Yup, faster by a long way.

DP

What is platform specific about it? The docs for Float don’t mention anything platform specific about the bit representations. It’s all IEEE 754.
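To illustrate the point about bit representations, here is a small sketch that pulls a float apart with Float.floatToIntBits. The class and method names are made up for the example. The layout (1 sign bit, 8 exponent bits with a bias of 127, 23 mantissa bits) is the IEEE 754 single format that the Float docs describe.

```java
// Hypothetical demo class; only Float.floatToIntBits comes from the JDK.
final class FloatBitsSketch {
    // Unbiased exponent of a normalized float, read straight from its bit pattern.
    static int exponentOf(float f) {
        int bits = Float.floatToIntBits(f);
        return ((bits >>> 23) & 0xff) - 127; // remove the IEEE 754 bias
    }

    // The 23 explicit mantissa bits.
    static int mantissaOf(float f) {
        return Float.floatToIntBits(f) & 0x7fffff;
    }

    public static void main(String[] args) {
        float[] samples = { 0.25f, 1.0f, 2.0f, 100.0f };
        for (float f : samples) {
            System.out.printf("%s -> bits=0x%08x exponent=%d mantissa=0x%06x%n",
                    f, Float.floatToIntBits(f), exponentOf(f), mantissaOf(f));
        }
    }
}
```

Since the trick only manipulates this bit pattern through floatToIntBits and intBitsToFloat, the integer part of the computation should not depend on the CPU.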

I’m not sure, to be honest. I’ll try running it on a 64-bit machine and see what the outcome is. Hopefully it’s the same.

DP

Yeah, I am not so sure about that style of optimization. Although great for gaming purposes, it does somewhat undermine any sense of fidelity. Maybe it could be used as an option.

Works on x64, swpalmer.

You could have a FastMath class and put optimisations such as this in there (alongside sin/cos lookup tables).
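A rough sketch of what such a class might look like. Everything here is an assumption for illustration: the class name, the 4096-entry table size and the truncating index lookup are my own choices, not anything that exists in JOODE or Xith3D.

```java
// Hypothetical helper class; table size and lookup scheme are illustrative choices.
final class FastMathSketch {
    private static final int TABLE_SIZE = 1 << 12;           // 4096 entries per full circle
    private static final int TABLE_MASK = TABLE_SIZE - 1;
    private static final float RAD_TO_INDEX = (float) (TABLE_SIZE / (2.0 * Math.PI));
    private static final float[] SIN = new float[TABLE_SIZE];

    static {
        for (int i = 0; i < TABLE_SIZE; i++) {
            SIN[i] = (float) Math.sin(i * 2.0 * Math.PI / TABLE_SIZE);
        }
    }

    // Table lookup without interpolation; the mask wraps any angle into the table.
    static float sin(float radians) {
        return SIN[(int) (radians * RAD_TO_INDEX) & TABLE_MASK];
    }

    // cos(x) = sin(x + pi/2), which is a quarter of the table further on.
    static float cos(float radians) {
        return SIN[((int) (radians * RAD_TO_INDEX) + TABLE_SIZE / 4) & TABLE_MASK];
    }

    // The inverse square root from the first post, kept next to the other shortcuts.
    static float inverseSqrt(float value) {
        float xhalf = 0.5f * value;
        int i = 0x5f375a86 - (Float.floatToIntBits(value) >> 1);
        value = Float.intBitsToFloat(i);
        return value * (1.5f - xhalf * value * value);
    }

    public static void main(String[] args) {
        float angle = 1.0f;
        System.out.println("table sin: " + sin(angle) + ", Math.sin: " + Math.sin(angle));
        System.out.println("table cos: " + cos(angle) + ", Math.cos: " + Math.cos(angle));
    }
}
```

With a table this size the lookup error is bounded by roughly one table step (2π/4096 radians), so a bigger table or interpolation would be needed for anything more precise.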

DP

This code is in Real, but I left it commented out because it’s no longer as fast to do it in integer as in floating point instructions. Esp. for Java, it’s probably slower.

@t_larkworthy: I sent you a PM a while back! Check your PMs! :wink:

Not according to my benchmarks. 1f/Math.sqrt(…); is multiple times slower than invSqrt. It has also resulted in a substantial FPS increase with CPU based skeletal character animation.

DP

Results:

1/Math.sqrt(…); is nearly twice as slow as invSqrt. That’s definitely not something to be laughed at, since sqrt is one of the more expensive math operations on the CPU (along with trig).

Benchmark:


public class TestInvSqrt {

	public static void main( String[] args ) {
		// accumulate every result so the JIT cannot discard the loop bodies as dead code
		float sink = 0f;

		// warm up loops (start at 1 to avoid dividing by zero)
		int loops = 10000000;
		for ( int i = 1; i <= loops; i++ ) {
			float v1 = 1f / (float)Math.sqrt( i );
			float v2 = invSqrt( i );

			sink += v2 * v1;
		}

		// do the proper loops now
		long nano = System.nanoTime();
		for ( int i = 1; i <= loops; i++ ) {
			sink += 1f / (float)Math.sqrt( i );
		}
		long after = System.nanoTime();
		System.out.println( "Math.sqrt: " + (double)( after - nano ) / (double)loops + " ns per call" );

		nano = System.nanoTime();
		for ( int i = 1; i <= loops; i++ ) {
			sink += invSqrt( i );
		}
		after = System.nanoTime();
		System.out.println( "invsqrt: " + (double)( after - nano ) / (double)loops + " ns per call" );

		// print the accumulated value so the results above are actually used
		System.out.println( "sink: " + sink );
	}

	private static float invSqrt( float value ) {
		float xhalf = 0.5f * value;
		int i = Float.floatToIntBits( value );
		i = 0x5f3759df - ( i >> 1 );
		// i = 0x5f375a86 - ( i >> 1 );
		value = Float.intBitsToFloat( i );
		value = value * ( 1.5f - xhalf * value * value );
		return value;
	}

}

Edit: Clarified benchmark

DP

Ok, it checks out for pure speed. But two hurdles remain:

  1. Correctness of computed values
  2. Portability to other platforms (and correctness there too)

Also, if it is decided that JOODE should use javax.vecmath throughout, we would rely on the vecmath implementation for vector length (Math.sqrt()):

https://vecmath.dev.java.net/source/browse/vecmath/src/javax/vecmath/Vector3f.java?rev=1.3&view=auto&content-type=text/vnd.viewcvs-markup

This is because, as referenced here ( http://www.java-gaming.org/forums/index.php?topic=15677.msg125477#msg125477 ), the biggest slow-down that JOODE currently experiences is in the implementation of the Real class and its derivatives.

Well, we are using (a slight modification of) Kenji Hiranabe’s vecmath implementation in Xith3D. The difference from Sun’s vecmath is that it is as GC-cheap as possible, though not thread safe. And we can modify it if needed. If you used this lib too, it would be no problem to make use of the above optimization.

Marvin

It is accurate to 4 decimal places. Repeating the last line of the algorithm produces more accurate results. If you look at the optimisation, one line has been commented out, as I have replaced the initial guess with one that yields more accurate results.
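If anyone wants to check the accuracy for themselves, here is a sketch that measures the worst relative error against Math.sqrt for both constants and for a configurable number of refinement steps. The class name, the sampled range (0.01 to 10000) and the step factor are arbitrary assumptions of mine; only the two constants and the Newton-Raphson line come from the code in this thread.

```java
// Hypothetical accuracy check; the sampling range and step are arbitrary choices.
final class InvSqrtAccuracySketch {
    // Same algorithm as above, with the magic constant and iteration count as parameters.
    static float invSqrt(float value, int magic, int iterations) {
        float xhalf = 0.5f * value;
        float y = Float.intBitsToFloat(magic - (Float.floatToIntBits(value) >> 1));
        for (int n = 0; n < iterations; n++) {
            y = y * (1.5f - xhalf * y * y); // one Newton-Raphson step
        }
        return y;
    }

    // Largest relative error over the sampled inputs, compared with 1 / Math.sqrt.
    static double maxRelativeError(int magic, int iterations) {
        double worst = 0.0;
        for (float x = 0.01f; x < 10000f; x *= 1.001f) {
            double exact = 1.0 / Math.sqrt(x);
            double error = Math.abs(invSqrt(x, magic, iterations) - exact) / exact;
            worst = Math.max(worst, error);
        }
        return worst;
    }

    public static void main(String[] args) {
        int[] constants = { 0x5f3759df, 0x5f375a86 };
        for (int magic : constants) {
            for (int iterations = 1; iterations <= 3; iterations++) {
                System.out.printf("0x%08x, %d iteration(s): max relative error %.3e%n",
                        magic, iterations, maxRelativeError(magic, iterations));
            }
        }
    }
}
```

Each extra iteration costs a few more multiplications, so timing the same variants alongside Math.sqrt would show where the speed advantage runs out.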

You do realise this is Java, right?

This is a contribution. Take it or leave it, it’s up to you.

DP

I agree both with Marvin and darkprophet.

I’d find it cool if you, biggeruniverse, would use the same version of vecmath (Kenji Hiranabe modified) as us (in Xith3D) so that we can add these optimisations (if you don’t want to change the regular sin()/cos()/sqrt() methods, we could have fastXXX() ones).

Well, what accuracy is JOODE guaranteeing to the user? Is accuracy v. speed configurable? At what point of accuracy is this slower than Math.sqrt?

JOODE is Java, that code is converted C code. It was written to be close to the hardware, which of course makes me suspicious of it. Has this Java code been shown to work on any non-x86 environment? (It should of course, but better to test these things before they are relied upon to be always correct)

I hope we take it, as it is faster, but first it has to be proven out.

Well, first things first, let me finish the conversion to pure vecmath. I’ll branch it for easy testing, and then it can be decided what to do about optimizations.

Non-x86 platform? So that implies PPC, right? Which means older Macs and the PS3.

If you read above, the accuracy vs. speed trade-off is configurable through the number of loops over the last line. As for the accuracy of JOODE, you’re depending on LCP in the first place, which isn’t highly accurate, but achieves visually convincing results and is stable. If you want accuracy, I’m afraid you’re going to have to change all of JOODE.

If you want to re-prove what Quake3 and I (and countless others) have already proved, namely that it is a worthwhile optimisation, then by all means go ahead :slight_smile:

DP

myeah, fairly convincing argument. 4dp is pretty nice. Probably should go for it after a good test

@biggeruniverse
your PM inbox is full! I don’t know your sourceforge UNIX username

o noes! Well, it’s biggeruniverse. I know, I ought to be more creative…

OK, you’re now a developer, biggeruniverse.