Ok! I’ve done this quick C++ to Java conversion… 8)
Here are two test functions:
private static void testInvSqrt()
{
    Vector3f v = new Vector3f(1, 2, 3);
    float sum = 0;
    long start = System.currentTimeMillis();
    for (int i = 0; i < 1000000; ++i)
    {
        v.x += i * 0.001f;
        v.y += i * 0.001f;
        v.z += i * 0.001f;
        sum += (float)(1.0 / Math.sqrt(v.x * v.x + v.y * v.y + v.z * v.z));
    }
    long end = System.currentTimeMillis();
    System.out.println("1.0 / Math.sqrt(): " + (end - start) +
                       "ms, sum = " + sum);
}
private static void testInvSqrtNEW()
{
    Vector3f v = new Vector3f(1, 2, 3);
    float sum = 0;
    long start = System.currentTimeMillis();
    for (int i = 0; i < 1000000; ++i)
    {
        v.x += i * 0.001f;
        v.y += i * 0.001f;
        v.z += i * 0.001f;
        sum += newInvSqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    }
    long end = System.currentTimeMillis();
    System.out.println("newInvSqrt(): " + (end - start) +
                       "ms, sum = " + sum);
}
Here is the newInvSqrt() implementation:
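// Imports needed at the top of the file:
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;
import java.nio.IntBuffer;

// One shared 4-byte scratch buffer, viewed both as an int and as a float.
// Note: it is shared and unsynchronized, so f2i()/i2f() are not thread-safe.
static ByteBuffer byteBuf = ByteBuffer.allocateDirect(4)
                                      .order(ByteOrder.nativeOrder());
static IntBuffer intBuf = byteBuf.asIntBuffer();
static FloatBuffer floatBuf = byteBuf.asFloatBuffer();

// Reinterpret the bits of a float as an int (no numeric conversion).
private static int f2i(float f)
{
    floatBuf.put(0, f);
    return intBuf.get(0);
}

// Reinterpret the bits of an int as a float (no numeric conversion).
private static float i2f(int i)
{
    intBuf.put(0, i);
    return floatBuf.get(0);
}

private static float newInvSqrt(float x)
{
    float xhalf = 0.5f * x;
    int i = f2i(x);                     // C original: *(int*)&x;
    i = 0x5f3759df - (i >> 1);          // initial guess from the magic constant
    x = i2f(i);                         // C original: *(float*)&i;
    x = x * (1.5f - xhalf * x * x);     // one Newton-Raphson refinement step
    return x;
}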
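For comparison, the same bit reinterpretation can also be written with Float.floatToRawIntBits() and Float.intBitsToFloat() from java.lang, without any NIO buffers. The following is only a minimal sketch of that variant: the class name FastInvSqrtBits is a placeholder that does not appear in the code above, and this version is not part of the timings in this post.

```java
// Sketch only: FastInvSqrtBits is a hypothetical name, not from the original code.
final class FastInvSqrtBits
{
    private FastInvSqrtBits() {}

    // Same algorithm as newInvSqrt(), but using java.lang.Float for the bit casts.
    static float invSqrt(float x)
    {
        float xhalf = 0.5f * x;
        int i = Float.floatToRawIntBits(x);   // bits of x as an int, like *(int*)&x
        i = 0x5f3759df - (i >> 1);            // initial guess from the magic constant
        x = Float.intBitsToFloat(i);          // bits of i as a float, like *(float*)&i
        x = x * (1.5f - xhalf * x * x);       // one Newton-Raphson refinement step
        return x;
    }

    public static void main(String[] args)
    {
        float x = 14.0f; // 1*1 + 2*2 + 3*3, the squared length of (1, 2, 3)
        System.out.println("approx = " + invSqrt(x)
                           + ", exact = " + (float)(1.0 / Math.sqrt(x)));
    }
}
```

Since this variant has no shared static buffer, it also has no shared state between threads.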
Here is the test invocation code:
for (int i = 0; i < 10; ++i)
{
    testInvSqrt();
    testInvSqrtNEW();
}
Here is the result:
1.0 / Math.sqrt(): 271ms, sum = 27.828238
newInvSqrt(): 280ms, sum = 27.803202
1.0 / Math.sqrt(): 261ms, sum = 27.828238
newInvSqrt(): 160ms, sum = 27.803202
1.0 / Math.sqrt(): 270ms, sum = 27.828238
newInvSqrt(): 160ms, sum = 27.803202
1.0 / Math.sqrt(): 261ms, sum = 27.828238
newInvSqrt(): 160ms, sum = 27.803202
1.0 / Math.sqrt(): 260ms, sum = 27.828238
newInvSqrt(): 161ms, sum = 27.803202
1.0 / Math.sqrt(): 270ms, sum = 27.828238
newInvSqrt(): 160ms, sum = 27.803202
1.0 / Math.sqrt(): 261ms, sum = 27.828238
newInvSqrt(): 160ms, sum = 27.803202
1.0 / Math.sqrt(): 280ms, sum = 27.828238
newInvSqrt(): 171ms, sum = 27.803202
1.0 / Math.sqrt(): 260ms, sum = 27.828238
newInvSqrt(): 160ms, sum = 27.803202
1.0 / Math.sqrt(): 271ms, sum = 27.828238
newInvSqrt(): 160ms, sum = 27.803202
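To put those numbers in perspective: the two sums differ by roughly 0.09%, and taking 260ms against 160ms as typical timings, newInvSqrt() is about 1.6 times faster. The little helper below just redoes that arithmetic from the printed results. The class and method names are made up for this illustration, and the 260/160 pair is my pick of representative values from the list above.

```java
// Sketch only: InvSqrtResultSummary and its methods are hypothetical helper names.
final class InvSqrtResultSummary
{
    private InvSqrtResultSummary() {}

    // Relative difference of an approximate value against a reference value.
    static double relativeDifference(double reference, double approx)
    {
        return Math.abs(reference - approx) / Math.abs(reference);
    }

    // How many times faster the candidate is than the baseline.
    static double speedup(double baselineMs, double candidateMs)
    {
        return baselineMs / candidateMs;
    }

    public static void main(String[] args)
    {
        double sumExact = 27.828238; // sum printed by the 1.0 / Math.sqrt() test
        double sumFast  = 27.803202; // sum printed by the newInvSqrt() test
        System.out.printf("relative difference of sums: %.4f%%%n",
                          100.0 * relativeDifference(sumExact, sumFast));
        // 260ms and 160ms are assumed here as representative timings.
        System.out.printf("speedup: %.3fx%n", speedup(260.0, 160.0));
    }
}
```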
I think the code is clear and doesn’t need my explanation…