Newton-Raphson sqrt for the LWJGL math package??

Would it be possible to add this to the standard LWJGL math package?

This is the C++ way:


float InvSqrt (float x)
{
    float xhalf = 0.5f*x;
    int i = *(int*)&x;          // reinterpret the float's bits as an int
    i = 0x5f3759df - (i >> 1);  // magic constant gives the initial guess
    x = *(float*)&i;            // reinterpret back to float
    x = x*(1.5f - xhalf*x*x);   // one Newton-Raphson refinement step
    return x;
}

I was recently asking about this.

And here is the Lomont version, which is supposedly more accurate. I'm just picking these up from


float InvSqrt_Lomont(float x)
{
    float xhalf = 0.5f*x;
    int i = *(int*)&x;
    i = 0x5f375a86 - (i >> 1);   // Lomont's constant
    x = *(float*)&i;
    x = x*(1.5f - xhalf*x*x);
    // x = x*(1.5f - xhalf*x*x); // repeat for added precision
    return x;
} // InvSqrt_Lomont


I'm not sure whether this would really boost anything, given the cost of native calls, or even whether it can be converted to Java, but I think it's worth a try.

EDIT: This is obviously in the wrong place; if an admin sees this, please move it to the LWJGL forum.

Seems like a fine addition to me. If you've got the time, some performance comparisons in real-world situations would be nice (e.g. normalising an array of 1,000,000 Vector3fs).

Cas :slight_smile:

My problem is that I don't have a decent C++ compiler available on my home computer, so it will take at least two weeks to get back home and then get a C++ compiler.

Is there anyone here who could try this??

I didn’t see any reason for it not to be implemented in Java…

Cas :slight_smile:

Umm, what? I thought you couldn't do "binary logic" like this in Java? What exactly have I missed?

For example, I have absolutely no idea how to express

int i = *(int*)&x;

If this is achievable then of course I will do the tests!

IMO you can do this via NIO (direct ByteBuffer with IntBuffer and FloatBuffer attached)… 8)

hmm… I’m not sure I see the point in involving buffers - it’s a simple float-in, float-out method call ???..

Depending on the implementation of FloatToIntBits and IntBitsToFloat, a pure java implementation might also be viable.

— modified:

Sorry, didn't read all the posts :-X. The code in Java would be:

float f = 2;
int x = Float.floatToIntBits(f);
f = Float.intBitsToFloat(x);

[quote]hmm… I’m not sure I see the point in involving buffers - it’s a simple float-in, float-out method call ???..

Depending on the implementation of FloatToIntBits and IntBitsToFloat, a pure java implementation might also be viable.

— modified:

sorry, didn’t read all posts :-X, the code in java would be:

float f = 2;
int x = Float.floatToIntBits(f);
f = Float.intBitsToFloat(x);
[/quote]
I don’t think this works… when used in the method I get .33295277 from 9…

This is well beyond my knowledge. My initial thought was to call it through JNI, but if someone can do a pure, or "somewhat" pure, Java implementation, that would be better, wouldn't it?

Ok! I’ve done this quick C++ to Java conversion… 8)

Here are two test functions:


    private static void testInvSqrt()
    {
        Vector3f v = new Vector3f(1, 2, 3);
        float sum = 0;

        long start = System.currentTimeMillis();
        for (int i = 0; i < 1000000; ++i)
        {
            v.x += i * 0.001f;
            v.y += i * 0.001f;
            v.z += i * 0.001f;
            sum += (float)(1.0 / Math.sqrt(v.x * v.x + v.y * v.y + v.z * v.z));
        }
        long end = System.currentTimeMillis();
        System.out.println("1.0 / Math.sqrt(): " + (end - start) +
                           "ms, sum = " + sum);
    }

    private static void testInvSqrtNEW()
    {
        Vector3f v = new Vector3f(1, 2, 3);
        float sum = 0;

        long start = System.currentTimeMillis();
        for (int i = 0; i < 1000000; ++i)
        {
            v.x += i * 0.001f;
            v.y += i * 0.001f;
            v.z += i * 0.001f;
            sum += newInvSqrt(v.x * v.x + v.y * v.y + v.z * v.z);
        }
        long end = System.currentTimeMillis();
        System.out.println("newInvSqrt(): " + (end - start) +
                           "ms, sum = " + sum);
    }

Here is the newInvSqrt() implementation:


    // Shared scratch buffer used to reinterpret float bits as int and back
    // (note: static shared buffers like these are not thread-safe)
    static ByteBuffer byteBuf = ByteBuffer.allocateDirect(4)
                                   .order(ByteOrder.nativeOrder());
    static IntBuffer intBuf = byteBuf.asIntBuffer();
    static FloatBuffer floatBuf = byteBuf.asFloatBuffer();

    private static int f2i(float f)
    {
        floatBuf.put(0, f);
        return intBuf.get(0);
    }

    private static float i2f(int i)
    {
        intBuf.put(0, i);
        return floatBuf.get(0);
    }

    private static float newInvSqrt(float x)
    {
        float xhalf = 0.5f*x;
        int i = f2i(x); //*(int*)&x;
        i = 0x5f3759df - (i >> 1);
        x = i2f(i); //*(float*)&i;
        x = x*(1.5f - xhalf*x*x);
        return x;
    }

Here is the test invocation code:


        for (int i = 0; i < 10; ++i)
        {
            testInvSqrt();
            testInvSqrtNEW();
        }

Here is the result:


1.0 / Math.sqrt(): 271ms, sum = 27.828238
newInvSqrt(): 280ms, sum = 27.803202
1.0 / Math.sqrt(): 261ms, sum = 27.828238
newInvSqrt(): 160ms, sum = 27.803202
1.0 / Math.sqrt(): 270ms, sum = 27.828238
newInvSqrt(): 160ms, sum = 27.803202
1.0 / Math.sqrt(): 261ms, sum = 27.828238
newInvSqrt(): 160ms, sum = 27.803202
1.0 / Math.sqrt(): 260ms, sum = 27.828238
newInvSqrt(): 161ms, sum = 27.803202
1.0 / Math.sqrt(): 270ms, sum = 27.828238
newInvSqrt(): 160ms, sum = 27.803202
1.0 / Math.sqrt(): 261ms, sum = 27.828238
newInvSqrt(): 160ms, sum = 27.803202
1.0 / Math.sqrt(): 280ms, sum = 27.828238
newInvSqrt(): 171ms, sum = 27.803202
1.0 / Math.sqrt(): 260ms, sum = 27.828238
newInvSqrt(): 160ms, sum = 27.803202
1.0 / Math.sqrt(): 271ms, sum = 27.828238
newInvSqrt(): 160ms, sum = 27.803202

I think the code is clear and doesn't need my explanation… :slight_smile:

The code tag skewed my code line indents… >:(

The result of using native Float.floatToIntBits() and Float.intBitsToFloat() instead of NIO:


1.0 / Math.sqrt(): 270ms, sum = 27.828238
newInvSqrt(): 441ms, sum = 27.803202
1.0 / Math.sqrt(): 250ms, sum = 27.828238
newInvSqrt(): 391ms, sum = 27.803202
1.0 / Math.sqrt(): 250ms, sum = 27.828238
newInvSqrt(): 381ms, sum = 27.803202
1.0 / Math.sqrt(): 260ms, sum = 27.828238
newInvSqrt(): 381ms, sum = 27.803202
1.0 / Math.sqrt(): 260ms, sum = 27.828238
newInvSqrt(): 381ms, sum = 27.803202
1.0 / Math.sqrt(): 250ms, sum = 27.828238
newInvSqrt(): 391ms, sum = 27.803202
1.0 / Math.sqrt(): 250ms, sum = 27.828238
newInvSqrt(): 391ms, sum = 27.803202
1.0 / Math.sqrt(): 250ms, sum = 27.828238
newInvSqrt(): 391ms, sum = 27.803202
1.0 / Math.sqrt(): 250ms, sum = 27.828238
newInvSqrt(): 391ms, sum = 27.803202
1.0 / Math.sqrt(): 250ms, sum = 27.828238
newInvSqrt(): 380ms, sum = 27.803202

Hrm, I thought Newton-Raphson required guessing a possible result, then looping to refine your guess until the required precision is reached?

The results I get with the following conversion of the above code are pretty good: accurate to two decimal places initially, but slipping thereafter:

public static float sqrt(float x)
{
    float xhalf = 0.5f * x;
    int i = Float.floatToIntBits(x);
    i = 0x5f375a86 - (i >> 1);
    x = Float.intBitsToFloat(i);
    x *= (1.5f - xhalf * x * x);
    return 1 / x; // invert the inverse sqrt to get the sqrt
}
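For anyone who wants to check the accuracy themselves, here's a small self-contained sketch (the class name and test values are just ones I picked, not from the original code):

```java
public class SqrtCheck {
    // Same conversion as above, using the Lomont constant.
    public static float sqrt(float x) {
        float xhalf = 0.5f * x;
        int i = Float.floatToIntBits(x);
        i = 0x5f375a86 - (i >> 1);
        x = Float.intBitsToFloat(i);
        x *= (1.5f - xhalf * x * x); // one Newton-Raphson step
        return 1 / x;                // invert to get the square root
    }

    public static void main(String[] args) {
        float[] tests = {2f, 9f, 100f, 12345f};
        for (float t : tests) {
            float approx = sqrt(t);
            double exact = Math.sqrt(t);
            System.out.println(t + ": approx=" + approx
                + " exact=" + exact
                + " relErr=" + Math.abs(approx - exact) / exact);
        }
    }
}
```

With one Newton-Raphson step the relative error should stay well under 1% across the range, which matches the "two dp" observation above.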

What’s the name of this algorithm, and where did you find it?

Amazing performance from the Float.xxxBits methods, eh?

How can it possibly be slower than NIO? It should at the very least be the same.

Good job alexz. I'm getting vastly different performance; in a loop I get:
java.lang.Math.sqrt(): 1862
fos3d.util.Math.invSqrt(): 7331

With code like this


long startTime, endTime;

startTime = System.currentTimeMillis();
for (int i = 0; i < 100000000; i++) {
    float f = 1 / (float) java.lang.Math.sqrt(9d);
}
endTime = System.currentTimeMillis();
System.out.println("JAVA FUNC: " + (endTime - startTime));

float f = 0;
long startTime2, endTime2;
startTime2 = System.currentTimeMillis();
for (int i = 0; i < 100000000; i++) {
    f = Math.sqrt(9f); // resolves to fos3d.util.Math here, not java.lang.Math
}
endTime2 = System.currentTimeMillis();
System.out.println("My SHIT FUNC: " + (endTime2 - startTime2));


And the results are not what they are supposed to resemble. From 9f I get 2.996579, which is pretty good.
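One thing that might skew numbers like these (just a guess on my part, not verified): the first loop never uses its result, so the JIT is free to throw the whole computation away. A fairer sketch accumulates both results so neither loop can be eliminated as dead code (the class and method names here are my own):

```java
public class InvSqrtBench {
    // Fast inverse square root, as in the posts above (Lomont constant).
    static float invSqrt(float x) {
        float xhalf = 0.5f * x;
        int i = Float.floatToIntBits(x);
        i = 0x5f375a86 - (i >> 1);
        x = Float.intBitsToFloat(i);
        return x * (1.5f - xhalf * x * x);
    }

    public static void main(String[] args) {
        int n = 10000000;
        float sum1 = 0, sum2 = 0;

        long t0 = System.currentTimeMillis();
        for (int i = 1; i <= n; i++) {
            sum1 += 1f / (float) Math.sqrt(i); // baseline
        }
        long t1 = System.currentTimeMillis();
        for (int i = 1; i <= n; i++) {
            sum2 += invSqrt(i);                // approximation
        }
        long t2 = System.currentTimeMillis();

        // Printing the sums forces both loops to actually run.
        System.out.println("1/Math.sqrt(): " + (t1 - t0) + "ms, sum = " + sum1);
        System.out.println("invSqrt():     " + (t2 - t1) + "ms, sum = " + sum2);
    }
}
```

Feeding a varying input (the loop counter) instead of a constant also stops the JIT from hoisting the whole call out of the loop.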

I like this new sqrt method so much I’ve added it to the LWJGL Math class. It’s accurate to at least 3dp and the accuracy can be tuned by simply repeating the last line of the equation over and over as many times as you like.

What’s the theory behind it?

Cas :slight_smile:

[quote]I like this new sqrt method so much I’ve added it to the LWJGL Math class. It’s accurate to at least 3dp and the accuracy can be tuned by simply repeating the last line of the equation over and over as many times as you like.

What’s the theory behind it?

Cas :slight_smile:
[/quote]
Awesome, I wish I could explain.

There's a discussion on the subject here, methinks :slight_smile:
http://gamedev.net/community/forums/topic.asp?whichpage=1&pagesize=20&topic_id=139956
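In short (my summary of that discussion, not anything official): the last line is one step of Newton-Raphson applied to f(y) = 1/y² − x, whose positive root is y = 1/√x:

```latex
% One Newton-Raphson step for f(y) = y^{-2} - x, root y = 1/\sqrt{x}:
y_{n+1} = y_n - \frac{f(y_n)}{f'(y_n)}
        = y_n - \frac{y_n^{-2} - x}{-2\,y_n^{-3}}
        = y_n\,\frac{3 - x\,y_n^2}{2}
        = y_n\,(1.5 - 0.5\,x\,y_n^2)
```

That is exactly the `x*(1.5f - xhalf*x*x)` line, and repeating it just runs more iterations. The magic constant supplies the initial guess: a positive float's bit pattern, read as an integer, is roughly a scaled-and-shifted log2 of its value, so `0x5f3759df - (i >> 1)` computes an approximation of x^(-1/2) directly in the bit domain.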

Ooops… :o

Do you remember my previous result?

1.0 / Math.sqrt(): 261ms, sum = 27.828238
newInvSqrt(): 160ms, sum = 27.803202

This was done on P3 600 MHz (JDK 1.4.1)…

And here is the result on P4 1.6 GHz (JDK 1.4.1):

1.0 / Math.sqrt(): 70ms, sum = 27.828238
newInvSqrt(): 110ms, sum = 27.803202

Magic! Isn’t it? 8)

I get the same kind of result: my Athlon 1.33 GHz gets ~1100 ms from Math.sqrt() and about ~7400 ms from the inv sqrt.