Math.abs() discussion

CaptainJester · February 21, 2006, 12:01am

Here is what I get on my home machine:

Windows XP Pro SP2
AMD Athlon 64 3000+
512MB RAM


0: 15.610084 ns (combined result: 1818930407)
1: 10.661131 ns (combined result: 1818930407)
0: 15.745297 ns (combined result: -409766598)
1: 10.656383 ns (combined result: -409766598)
0: 15.857882 ns (combined result: -1521671903)
1: 10.618668 ns (combined result: -1521671903)
0: 15.650313 ns (combined result: 1713054016)
1: 10.628446 ns (combined result: 1713054016)
0: 15.670707 ns (combined result: 372041738)
1: 10.599672 ns (combined result: 372041738)

Orangy_Tang · February 21, 2006, 12:08am

Second page already and no word from Jeff about the perils of microbenchmarking? Shock! Horror!

Markus_Persson · February 21, 2006, 8:47am

Foo.abs() obviously is a lot faster than Math.abs() on certain hardware. I’m positive(*) there are no serious flaws in the benchmark.

BUT; it’s still totally pointless.

I’m willing to bet my left lung that Math.abs() has never in the history of mankind even been close to being a bottleneck in any kind of program.
Optimisations such as these are dangerous, pointless and a huge waste of time.

Like the 4k competition, I’m in this thread because it’s fun, not because it’s good programming.

(* Intentionally pushing fate. I want to be proven wrong. :D)

Jeff · February 21, 2006, 9:21am

Well… Ive never seen ti be a bottleneck…

But Id be hesitant to state unequivocably you couldnt coem up with a situation where it was…

And anyway its a neat as heck little hack 8)

Markus_Persson · February 21, 2006, 9:38am

I added a new test:

“res += vals[i]/(i+1);”

Addition and (integer) division.
It’s SLOWER than Math.abs().

Test results with vals[100000]: (Average of 100 runs)
 Math.abs(): 14.751644458770752 ns (combined result: 1669388634)
  Foo.abs(): 5.717770094871521 ns (combined result: 1669388634)
   division: 18.024079723358156 ns (combined result: -2079842138)
just lookup: 6.081118493080139 ns (combined result: 1416599130)

So it’s not exactly slow anyway.

swpalmer · February 21, 2006, 2:24pm

I got some interesting results on a PowerPC G4 1GHz (Powerbook)

0: 34.847 ns (combined result: -803147295)
1: 47.583 ns (combined result: -803147295)
0: 41.461 ns (combined result: 67463555)
1: 25.959 ns (combined result: 67463555)
0: 63.427 ns (combined result: -2075964233)
1: 26.964 ns (combined result: -2075964233)
0: 48.791 ns (combined result: -306157959)
1: 61.932 ns (combined result: -306157959)
0: 34.877 ns (combined result: -1075941364)
1: 43.152 ns (combined result: -1075941364)
0: 41.096 ns (combined result: -509972597)
1: 27.036 ns (combined result: -509972597)
0: 41.007 ns (combined result: -1027229940)
1: 26.213 ns (combined result: -1027229940)
0: 34.649 ns (combined result: -1241157218)
1: 26.231 ns (combined result: -1241157218)
0: 34.967 ns (combined result: -1301210477)
1: 25.982 ns (combined result: -1301210477)
0: 35.26 ns (combined result: -686611777)
1: 26.503 ns (combined result: -686611777)
0: 34.647 ns (combined result: 924630738)
1: 43.598 ns (combined result: 924630738)

It seems that on average Math.abs() was significantly faster. but there are still a notable number of iterations where the reverse seems to be true. I wonder if that has to do with the distribution of negative values in the array and therefore how often the branch was taken?

Markus_Persson · February 21, 2006, 2:47pm

Actually, I get the average time for 0 as 40, and the average time for 1 as 35.

Meaning that, on average, Math.abs is slower.
Of course, that’s not a huge sample size.

Try making a version that alternates between all negative values, all positive and random?

swpalmer · February 21, 2006, 3:05pm

Yes… but you threw me off by putting the comments backwards in the source you posted !!!


                    case 0: // New abs
                        for (int i=0; i<vals.length; i++)
                        {
                            res += Math.abs(vals[i]);
                        }
                        break;
                    case 1: // Math.abs
                        for (int i=0; i<vals.length; i++)
                        {
                            res += Foo.abs(vals[i]);
                        }
                        break;

Markus_Persson · February 21, 2006, 3:27pm

Er, oops.

Haha. Let’s pretend it was a demonstration on why documenting what instead of why is bad.

g666 · February 21, 2006, 8:21pm

Another fun question for u oldbies:

would:

return (x > 0) ? x : -x;

be faster in java than:

if(x > 0) return x;
else return -x;

?

PS>OT how do u test the accuracy of a trig function like java apparently without cause extra trig functions that are known to be accurate?

Mr_Light · February 22, 2006, 3:55am

I know damn little about bytecode but I’m guessing in that example both compile to the same.

Raghar · February 23, 2006, 2:17am

Video encoder / decoder.

…

Of course the div and the idiv will be slower than abs. These are, with the sqrt, the slowest instructions on CPU. 80 cycles each (in full precision).

If you’d like to see the reason, you could look into my 256 bit library (It alows 256 bit integer register based computation on 32/64+ bit CPUs.) FFT based division algorithms are better at aprox 1024 bits, and hardware based FFT like implementations could be unnecessary complex for only 64 bit registers.

Of course assembly programmers could use aproximate algorithm at cost of aprox 3 multiplies. In addition situation for small divisors isn’t that bad, because recent CPUs have division table, so small divisors are resolved more easily.

Raghar · February 23, 2006, 2:18am

I would guess that answer on you question would be Taylor series/aproximations.

I remember how I used words “Taylor series” and “that number with exclamation mark” in one sentence.

Anon666 · February 24, 2006, 6:38pm

A related function to Math.abs() - How often have you wanted a value capped at zero


	/**
	 * 
	 * performs the equivalent of Math.max(a,0),
	 * though alot faster.
	 * 
	 */
	public static final int max0(int a) {
		return a&~(a>>31);
	}

chaosdeathfish · February 27, 2006, 4:43pm

OK, slightly off-topic, but while we’re talking about wierd optimisations… Anyone know a decent capping algorithm? My current code is:


int cap( int value ) {
    return value > 0 ? value < 0xFF ? value : 0xFF : 0;
}

Any wierd bitshifting things that can do this?

Markus_Persson · February 27, 2006, 6:29pm

This is about 10-20% faster for me:

return value > 255 ? 255 : value&~(value>>31);

edit:
Ooh, a much better cap to zero was already posted.
I really need to start reading the posts one of these days.

Markus_Persson · February 27, 2006, 6:35pm

There we go:

return value & ~(value >> 31) & ~((255-value)>>31) | (((255-value)>>31)&0xff);

Original method with two ifs (ives?): 10.237750000000005 ns
This method: 4.809813399999991 ns

g666 · February 27, 2006, 6:44pm

Markus_Persson:

There we go:
return value & ~(value >> 31) & ~((255-value)>>31) | (((255-value)>>31)&0xff);
Original method with two ifs (ives?): 10.237750000000005 ns
This method: 4.809813399999991 ns

Does an if compile to the same instructions as ?: i think i read somewherer that it doesnt, and that :? is faster.

Markus_Persson · February 27, 2006, 6:54pm

    private static int cap(int value)
    {
        return value > 0 ? value < 0xFF ? value : 0xFF : 0;
    }

10.895770499999994 ns

    private static int cap2(int value)
    {
        if (value>0)
            if (value<255)
                return value;
            else
                return 255;
        else return 0;
    }

9.652298100000003 ns

chaosdeathfish · February 27, 2006, 8:31pm

Ah! Thanks for that! I’ve tried that out in my code, and… I’m afraid it was slower Even using if instead of ?: was slightly slower.

I guess that proves the maxim that you should always benchmark in your own working code…