for Jeff.... microbenchmarking.

This is for Jeff, since I know how much he !!!adores!!! microbenchmarking…

http://www.osnews.com/story.php?news_id=5602

;D

Actually it is great to see Java doing so well, and his tests seemed reasonably fair given that they are micro-benchmarks.

And his ultimate conclusion is what we all know… that speed is no longer an issue for choosing C or Java.

Now to find a way to improve Java’s trig score…

Not a bad article, he clearly understands the issues…

"Designing good, helpful benchmarks is fiendishly difficult. This fact led me to keep the scope of this benchmark quite limited. … I didn’t test string manipulation, graphics, object creation and management (for object oriented languages), complex data structures, network access, database access, or any of the countless other things that go on in any non-trivial program. "

So he knows that his benchmark is far from the whole performance story.

It’s not that I hate microbenchmarks, it’s that I hate their misuse, which is prevalent. I’ve used microbenchmarks to help ferret out performance issues; you just need to understand what you are and aren’t measuring. And in a complex run-time optimizing environment like the VM, that often means needing access to the VM authors to interpret what you see. (An advantage I had writing the Performance book that I realize most people don’t :confused: )
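To make "what you are and aren't measuring" concrete, here is a minimal sketch of a timing loop that at least separates warm-up from measurement. The class name `TrigTiming`, the round counts, and the iteration count are all made up for illustration, and `System.nanoTime` assumes Java 5 or later. It is not a rigorous harness; it only shows the warm-up and dead-code issues that tend to trip people up.

```java
public final class TrigTiming {
    // Written after each run so the JIT cannot discard the loop as dead code.
    static double sink;

    // Times 'iterations' calls to Math.sin and returns elapsed nanoseconds.
    static long timeSinLoopNanos(int iterations) {
        double sum = 0.0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sum += Math.sin(i);
        }
        long elapsed = System.nanoTime() - start;
        sink = sum;
        return elapsed;
    }

    public static void main(String[] args) {
        int iterations = 1000000; // arbitrary size, chosen for illustration

        // Warm-up rounds: give the VM a chance to compile the hot loop
        // before anything is recorded.
        for (int round = 0; round < 5; round++) {
            timeSinLoopNanos(iterations);
        }

        // Measured rounds: report each one so run-to-run variation is visible.
        for (int round = 0; round < 5; round++) {
            long nanos = timeSinLoopNanos(iterations);
            System.out.println("round " + round + ": " + (nanos / 1000000.0) + " ms");
        }
    }
}
```

Even with warm-up, the numbers only describe this loop on this VM and this machine, which is the point being made above.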

any clues on the bad trig score?

Haven’t dug into the benchmark. My first wild guess would be that there is some built in trig support on the intel chip that doesn’t support Java’s rigorous FP definition.

Just out of curiosity, who writes games using the supplied trig functions?

Maybe this is dating me, and processors have gotten so good this isn’t an issue, but back when I was doing 2D games we always used a trig table.
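For anyone who hasn't seen one, a trig table can be sketched roughly like this. The class name `SinTable`, the table size of 4096, and the nearest-entry lookup are all choices made for illustration, not anything taken from the benchmark or from a particular game. It trades accuracy for speed: with 4096 entries the result is only good to about three decimal places.

```java
public final class SinTable {
    private static final int SIZE = 4096;        // power of two so we can mask
    private static final int MASK = SIZE - 1;
    private static final double ENTRIES_PER_RADIAN = SIZE / (2.0 * Math.PI);
    private static final float[] TABLE = new float[SIZE];

    static {
        // Fill the table once using the accurate library function.
        for (int i = 0; i < SIZE; i++) {
            TABLE[i] = (float) Math.sin(i * 2.0 * Math.PI / SIZE);
        }
    }

    private SinTable() {
    }

    // Approximate sine: round the angle to the nearest table entry.
    // Masking wraps negative and large indices back into the table.
    // Intended for game-sized angles; huge arguments overflow the int cast.
    public static float sin(double radians) {
        int index = (int) Math.floor(radians * ENTRIES_PER_RADIAN + 0.5) & MASK;
        return TABLE[index];
    }

    // Cosine is sine shifted by a quarter turn.
    public static float cos(double radians) {
        return sin(radians + Math.PI / 2.0);
    }

    public static void main(String[] args) {
        double angle = 1.0;
        System.out.println("table:   " + sin(angle));
        System.out.println("library: " + Math.sin(angle));
    }
}
```

Whether this is still faster than `Math.sin` on a modern VM is exactly the kind of thing that has to be measured rather than assumed.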

[quote]My first wild guess would be that there is some built in trig support on the intel chip that doesn’t support Java’s rigorous FP definition.
[/quote]
No prizes, but that is the problem. When reducing ‘large’ arguments Intel use a 66-bit version of pi, which is inadequate to maintain the precision required by Java. They also restrict the domain of the trig functions to rather less than the full range of double. Java requires that Math.sin(1.0e300) produce a result accurate to within 1 lsb.

http://developer.java.sun.com/developer/bugParade/bugs/4345903.html

RFE for a FastMath class perhaps? :wink:
(unless of course that’s what the bug report linked above is… still haven’t got my java.sun.com account sorted :()

Looking at the previous thread “amazing java.lang.math performance” at http://www.JavaGaming.org/cgi-bin/JGNetForums/YaBB.cgi?board=Tuning;action=display;num=1063822912, I see that even with table lookups and fast versions of the functions there was a 3x improvement in performance. That doesn’t explain the more than 10x difference between Microsoft’s implementation of trig functions and Java 1.4.2, does it?

Another one (this time on AMD Athlon XP/MP):

http://fails.org/benchmark.html

Wasn’t there some sort of switch in the Sun JRE to use a less exact but much faster variant of Math?

I’m using lang.Math trig functions in my software renderer, but I’m not using many trig ops in the main loop, just in the matrix class. I used both the lang.Math functions and a lookup table in a mode7 rendering applet and I didn’t see any performance difference between the two. Both methods gave the same frame rate.

[quote]Wasn’t there some sort of switch in the Sun JRE to use a less exact but much faster variant of Math?
[/quote]
You’re thinking of the “strictfp” loophole. It’s a very, very tiny loophole. Just enough to allow the Intel fused multiply and add.

Intel x86 processors don’t have a fused multiply and add (the IA64/Itanium does). These fused instructions maintain additional precision which is not permitted by non-strictfp Java. The relaxation which accompanied strictfp only allows additional exponent range during intermediate calculations. This permits the use of the Intel 80-bit reals for temporaries (provided the FPU is running in the mode which restricts the precision to that of normal double/float as appropriate).

There was a JSR (84) which proposed further relaxation to allow the use of fused multiply/add instructions as found in PowerPC for example. Unfortunately this was withdrawn apparently due to problems setting up the expert group.
http://www.jcp.org/en/jsr/detail?id=84

Okie I’ll cop to being confused by all the various register level processor issues. Not my area of expertise :slight_smile:

Thanks for the clarification on what’s what.

If you look at his Benchmark.java source code, you’ll notice he simply goes from 1 to 10 million and computes Math.sin, etc. for each increment of 1. If we clamped the range so it fell between 0 and 2PI, would that not return the same values? I.e., alter his loop to look like:

        double clampedI = 0.0;
        while (i < trigMax)
        {
            clampedI = i%(2*Math.PI);
            sine = Math.sin(clampedI);
            cosine = Math.cos(clampedI);
            tangent = Math.tan(clampedI);
            logarithm = Math.log(i);
            squareRoot = Math.sqrt(i);
            i++;
        }

Perhaps I’m off, (was up very late last night)… But, if the math is accurate, this gives a very nice performance boost. (from 65s -> 13s on my P4 2.0G machine)
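On the "if the math is accurate" question, a quick check along these lines shows how far the clamped result drifts from the direct one. The class name `ClampCheck` and the helper `maxSineDifference` are made up for this sketch. The two results cannot be bit-identical in general, because `2*Math.PI` is a double and therefore not exactly 2π, so every full turn removed by the `%` adds a tiny error to the reduced argument. A back-of-envelope estimate puts the drift somewhere around 1e-10 for arguments near ten million, but run it rather than trusting that figure.

```java
public final class ClampCheck {
    private ClampCheck() {
    }

    // Largest absolute difference between Math.sin(n) and
    // Math.sin(n % (2*Math.PI)) for n = 1 .. limit.
    static double maxSineDifference(int limit) {
        final double twoPi = 2.0 * Math.PI;
        double maxDiff = 0.0;
        for (int n = 1; n <= limit; n++) {
            double direct = Math.sin(n);           // full-range argument reduction
            double clamped = Math.sin(n % twoPi);  // reduced with an inexact 2*pi
            maxDiff = Math.max(maxDiff, Math.abs(direct - clamped));
        }
        return maxDiff;
    }

    public static void main(String[] args) {
        // Same range as the benchmark loop described above.
        System.out.println("max |direct - clamped| = " + maxSineDifference(10000000));
    }
}
```

So the clamped loop is fine if that level of error is acceptable, but it is not computing the same thing to the last bit, which is the accuracy Java's spec is paying for in the unclamped case.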

Fire away. :slight_smile:

Yeah, but that’s not really relevant, is it? You’re changing the program for just java, which is cheating. So you’d have to change it for all languages and measure again on all languages.

It’s a good program optimization… the kind of algorithmic optimization that in the real world often does the most good for your program.

But as a comparison, I agree, you would need to do the same to the C code.

Just points again to the difference between benchmarking and writing real code.

Yep, I agree it’d have to be done across the board. But still, if the optimization brought C++'s trig score from 3.5 secs -> 0 secs (giving best case) and Java’s trig score from 57 secs -> 15 secs (the speedup I saw), you’d still end up with C++'s total score being ~45 and Java’s about 60 (or slightly better than, or at least comparable to C#)

[quote](or slightly better than, or at least comparable to C#)
[/quote]
If you didn’t update the C# benchmark too, maybe, maybe not. But you have to. Again, apples to apples and all… Either way, you’d have to measure again instead of just speculating.
As I see it, Java’s bad trig score is because Java’s spec is more demanding about trig precision than the other languages’ specs.
Although the benchmark is just a benchmark and not real-life code, it still demonstrates this difference.
You can’t change that by changing the benchmark, although I agree that if changing the benchmark to more real-life code brings the results closer together, that demonstrates you must never draw too many conclusions from benchmark results.
For real performance, you might have to use look-up tables anyway, although a less precise but fast alternative would of course be very welcome.

The most worrying aspect of the benchmark that’s circulating the net at the moment is the C# to Java performance, not the C/C++ comparison. I’d like to know how M$ have gotten some operations so fast compared to Java.

Cas :slight_smile: