Yet another speed comparison, weird server results

[quote]Secondly, it’s not easy to just copy and paste the code.
[/quote]
I didn’t seriously expect it to be that easy :slight_smile:
Well, at least we know what the situation is regarding this, although it’s a little bit disappointing:

  • The server’s non-SSE2 FP code is (and I quote) ‘ineffeciently coded’.
  • It won’t be improved because Sun thinks SSE2-capable CPUs are the future, regardless of what the majority of people use right now.

I am not really affected by this bug because I rarely use doubles and my games are usually run on the client anyway. And of course this benchmark puts heavy emphasis on the bad double performance, so in real life it’s not as bad as it looks here.

Welcome Ajiva. Nice to see Sun engineers taking part in this forum. :slight_smile: Even nicer that you like games.

PS to the fellow readers: Please have a look at Ajiva’s weblog, which is informative. (Also clickable via his profile)

He’ll be sick to death of us in no time :wink: Eventually he’ll have to change his name and go live under a rock.

Cas :slight_smile:

[quote]Fair enough…
Can you talk about what IS in the 1.5 betas? In other words can you confirm that escape analysis for instance isn’t in the current 1.5 betas… that way at least you aren’t talking about “future” work and we can get an idea of what (not) to expect.

Basically it would be great to get an idea of the performance enhancements that are going into the 1.5 VMs. Hopefully there are some :).
[/quote]
Neither Escape Analysis nor Tiered Compilation is in J2SE 5.0.

These are just off the top of my head:

Server

  • Trig speed up (Solaris SPARC and Solaris x86 only)
  • Instructions similar to this (long = int & 0xFFFFL). This was
    done for the crypto folks who do a lot of unsigned work
  • Inlining/Loop Opts improvements
  • Startup improvements via Class Data Sharing (this is client and server)
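For anyone wondering what the masking item looks like in source, here is a minimal sketch. The class and method names are made up for illustration, and the 32-bit mask is my own addition next to the 16-bit one quoted above; nothing here says which exact patterns the VM recognizes.

```java
// Illustrative only: class and method names are hypothetical.
final class UnsignedMask {
    // Widen an int to a long, treating its 32 bits as an unsigned value.
    static long toUnsignedLong(int value) {
        return value & 0xFFFFFFFFL;
    }

    // Keep only the low 16 bits, using the mask quoted in the post.
    static long low16(int value) {
        return value & 0xFFFFL;
    }

    public static void main(String[] args) {
        System.out.println(toUnsignedLong(-1)); // 4294967295
        System.out.println(low16(-1));          // 65535
    }
}
```

The int is sign-extended to a long first and the mask then clears the unwanted high bits, which is why this idiom shows up so often in checksum and crypto code.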

Tons of bug fixes :slight_smile: J2SE 5.0 is the most stable VM yet…

Progressing 3.5 versions in 1 release, I would hope so :slight_smile:

Hmm… most enhancements are for the Server VM… the VM that is basically missing from the client side :(… where the games run… Or did they fix the fact that the server VM is not installed as part of the JRE. (only in the JDK last I checked)

Nice to see some of those in any case. How come the trig optimizations are specific to Solaris? That is, if it affects Solaris on x86, does the same optimization not apply to x86 elsewhere?

[quote]Hmm… most enhancements are for the Server VM… the VM that is basically missing from the client side :(… where the games run… Or did they fix the fact that the server VM is not installed as part of the JRE. (only in the JDK last I checked)

Nice to see some of those in any case. How come the trig optimizations are specific to Solaris? That is, if it affects Solaris on x86, does the same optimization not apply to x86 elsewhere?
[/quote]
The way the trig functions work is that when you call out to sin (for example), the VM used to execute a JNI call to a C++ library and return the resulting value. This is slow, so to speed it up, I now shortcut the whole thing and have the VM recognize that a direct C++ call can be made instead. So this is great, except that the compilers we use for Windows (VC6++) and Linux (GCC 3.2) have an aliasing problem with the way the C++ code is structured. So we had to turn down the optimizations for those platforms. The Sun compilers do not have this problem, and therefore we can crank up the optimizations.

And no, the server VM does not come with the JRE…
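If you are unsure which VM a given install actually runs, a quick check is to print the standard java.vm.name system property. This is only a sketch with a made-up class name, and the exact string printed depends on the vendor and version.

```java
// Illustrative only: the class name is hypothetical.
final class WhichVm {
    // Returns the name the running VM reports for itself.
    static String vmName() {
        return System.getProperty("java.vm.name");
    }

    public static void main(String[] args) {
        System.out.println(vmName());
    }
}
```

Running it once with `java -client WhichVm` and once with `java -server WhichVm` should show the difference. On an install without the server VM, the second command is expected to fail rather than print a name.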

AFAIK, the C++ library trig functions just do the Taylor series expansion - which could be done in straight Java code & execute almost as fast as the C++ one - probably faster than the JNI call on all platforms.

  • Dom
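To make the pure Java idea concrete, here is a rough sketch of a Taylor series sine. It is not how the VM or its C++ library computes sin; the class name, the naive range reduction and the number of terms are all my own assumptions, and accuracy falls off for very large arguments.

```java
// Illustrative only: a naive Taylor series sine, not the VM's implementation.
final class TaylorSin {
    private static final double TWO_PI = 2.0 * Math.PI;
    private static final double HALF_PI = Math.PI / 2.0;

    static double sin(double x) {
        // Naive range reduction into [-pi, pi].
        x = x % TWO_PI;
        if (x > Math.PI) {
            x -= TWO_PI;
        } else if (x < -Math.PI) {
            x += TWO_PI;
        }
        // Fold into [-pi/2, pi/2] using sin(pi - x) = sin(x).
        if (x > HALF_PI) {
            x = Math.PI - x;
        } else if (x < -HALF_PI) {
            x = -Math.PI - x;
        }
        // Sum x - x^3/3! + x^5/5! - ... by updating each term from the last.
        double x2 = x * x;
        double term = x;
        double sum = x;
        for (int n = 1; n <= 10; n++) {
            term *= -x2 / ((2 * n) * (2 * n + 1));
            sum += term;
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] samples = {0.0, 0.5, 1.0, 3.0, -2.5, 10.0};
        for (int i = 0; i < samples.length; i++) {
            double x = samples[i];
            System.out.println(x + ": " + sin(x) + " vs Math.sin " + Math.sin(x));
        }
    }
}
```

Timing this in a loop against Math.sin on both the client and server VM would be one way to test the claim, with the usual caveats about microbenchmarks and warm-up.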

[quote]…the compilers we use for Windows (VC6++) and Linux (GCC 3.2) have an aliasing problem…
[/quote]
You aren’t using VC7? Is there a reason for that?
Better yet, you should use the Intel compiler - it performs much better than VC6, particularly with floating point operations, though it compiles much slower. I know of at least one commercial project that compiles its release builds with the Intel compiler because they see a significant performance boost. They are making a visual effects (processing movie frames) number crunching application that needs every bit of speed it can get.

But crystal squid has an interesting point. Why not a pure Java implementation? Shouldn’t it get compiled to the same basic instructions as the C library code, or at least close enough that the JNI overhead saved would outweigh the difference?