# Float Division vs Multiplication

Hi all,

I was having a look at the source code to Quake 2 yesterday and was nosing around the math functions defined there. I came across the vector normalisation function:

``````
vec_t VectorNormalize (vec3_t v)
{
	float	length, ilength;

	length = v[0]*v[0] + v[1]*v[1] + v[2]*v[2];
	length = sqrt (length);		// FIXME

	if (length)
	{
		ilength = 1/length;
		v[0] *= ilength;
		v[1] *= ilength;
		v[2] *= ilength;
	}

	return length;
}
``````

I was initially confused by the single division followed by three multiplications, rather than three straightforward divisions. I suspected that multiplication with floats is significantly faster than division, so 1 division and 3 multiplications would be faster than 3 divides. Looks like I was right:

So, I remembered that JOML is currently doing 3 divisions (since I wrote it before I knew about this) and thought it might be a small performance enhancement to go back and precompute the reciprocal in every function that can benefit from it. However, when I wrote my own benchmark, I found the opposite to be true: the version with 3 divides was around 10x faster than the one with multiplications!

Here are the two functions I tested:

``````
public void normaliseFast() {
    float length, ilength;

    length = x * x + y * y + z * z;
    length = (float) Math.sqrt(length);

    if (length != 0) {
        ilength = 1.0f / length;
        x *= ilength;
        y *= ilength;
        z *= ilength;
    }
}

public void normaliseSlow() {
    float length;

    length = x * x + y * y + z * z;
    length = (float) Math.sqrt(length);

    if (length != 0) {
        x /= length;
        y /= length;
        z /= length;
    }
}
``````

And here is my benchmark:

``````
Vector3f start = new Vector3f(7.f, 10.25f, 3.f);
Vector3f end = new Vector3f(12.0f, 0.5f, 1.5f);
Vector3f fastdir = new Vector3f();

Vector3f.sub(end, start, fastdir);

Vector3f slowdir = new Vector3f(fastdir);

long fastStart = System.nanoTime();

fastdir.normaliseFast();

long fastEnd = System.nanoTime();

long fastTime = fastEnd - fastStart;

long slowStart = System.nanoTime();

slowdir.normaliseSlow();

long slowEnd = System.nanoTime();

long slowTime = slowEnd - slowStart;

System.out.println("Slow: " + slowTime + ", Fast: " + fastTime);

``````

My results are as follows (5 tests, times in nanoseconds as reported by System.nanoTime()):

Slow: 6462, Fast: 225802 (computer was still booting up…)
Slow: 2661, Fast: 25849
Slow: 2661, Fast: 27370
Slow: 4562, Fast: 38774
Slow: 2661, Fast: 26610

Am I missing something here? I was expecting it to be the other way around! Obviously if some enhancements have been made to the JVM then I won’t implement this in JOML.

Hi Neoptolemus,
you seem not to have looked at JOML for some time.
All such normalizations were changed to 3 multiplications.
Cheers, Kai

Factor 10 difference…? Something else is amiss.

Cas
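One possible explanation, offered as an assumption rather than a confirmed diagnosis: the benchmark above times a single call to each method, so the numbers are likely dominated by one-time costs (class loading, the first call to System.nanoTime(), interpreted rather than JIT-compiled code) instead of the arithmetic. Since normaliseFast() is timed first, it may be absorbing most of that overhead. Below is a minimal sketch of a fairer comparison that warms up first and averages over many calls. The NormaliseBench and Vec3 names are hypothetical stand-ins, not JOML's classes, and the iteration counts are arbitrary.

```java
public class NormaliseBench {

    // Hypothetical stand-in for the vector class in the original post (not JOML's Vector3f).
    static final class Vec3 {
        float x, y, z;

        Vec3(float x, float y, float z) {
            this.x = x;
            this.y = y;
            this.z = z;
        }

        // One division, then three multiplications by the reciprocal.
        void normaliseFast() {
            float length = (float) Math.sqrt(x * x + y * y + z * z);
            if (length != 0) {
                float ilength = 1.0f / length;
                x *= ilength;
                y *= ilength;
                z *= ilength;
            }
        }

        // Three divisions.
        void normaliseSlow() {
            float length = (float) Math.sqrt(x * x + y * y + z * z);
            if (length != 0) {
                x /= length;
                y /= length;
                z /= length;
            }
        }
    }

    // Results are written here so the JIT cannot discard the loop as dead code.
    static volatile float sink;

    // Returns the average time per call in nanoseconds over the given number of calls.
    static double nanosPerCall(boolean fast, int iterations) {
        float acc = 0;
        long t0 = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            // Vary the input each iteration so the result cannot be computed once and reused.
            Vec3 v = new Vec3(5.0f + i, -9.75f, -1.5f);
            if (fast) {
                v.normaliseFast();
            } else {
                v.normaliseSlow();
            }
            acc += v.x;
        }
        long t1 = System.nanoTime();
        sink = acc;
        return (t1 - t0) / (double) iterations;
    }

    public static void main(String[] args) {
        // Warm-up: give the JIT a chance to compile both code paths before measuring.
        nanosPerCall(true, 1_000_000);
        nanosPerCall(false, 1_000_000);

        // Alternate the two variants so neither one always runs first.
        for (int run = 0; run < 5; run++) {
            double slow = nanosPerCall(false, 10_000_000);
            double fast = nanosPerCall(true, 10_000_000);
            System.out.println("Slow: " + slow + " ns/call, Fast: " + fast + " ns/call");
        }
    }
}
```

Even with warm-up, a hand-rolled loop like this is only a rough guide, because the timing includes the vector allocation and the loop itself. A dedicated harness such as JMH is probably the more trustworthy way to settle it.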

Ah! I realise I was looking at the wrong repository, as one of my laptops has an outdated shortcut. Oops!

Still weird how I get completely wrong results though, not sure what I’ve done wrong…

I am still subscribed to JOML so I get email notifications. It’s amazing how far you’ve taken it. I think I had pushed it as far as I could when you took over, so it was definitely the right decision.

You are always welcome to join me again on our journey to full Java-math-library-world-domination!