Escape Analysis?

I’m a bit confused. From what I remember, escape analysis was supposed to automagically solve many of our temporary small object woes (as per http://www.ibm.com/developerworks/java/library/j-jtp09275.html) in the Java 6 release. But I’m not seeing the improvements. I’m working on JBox2d (a physics engine: http://www.jbox2d.org), and we are overwhelmingly bottlenecked by Vec2 creation costs, which was one of the things I was under the impression escape analysis would fix, as these objects are extremely short lived, mainly just used for shuffling pairs of numbers from place to place. Frankly, things seem to be performing just about the same as in 1.5. There still seems to be almost a 2:1 performance difference when I fully inline temp vector creations in realistic benchmarks.
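Concretely, the pattern I'm talking about looks like this (a hypothetical `Vec2` mirroring JBox2D's - just a pair of floats; both methods compute the same thing, but the first allocates two short-lived temporaries per call):

```java
// Hypothetical Vec2 mirroring JBox2D's: just a pair of floats.
final class Vec2 {
    float x, y;
    Vec2(float x, float y) { this.x = x; this.y = y; }
    Vec2 add(Vec2 o) { return new Vec2(x + o.x, y + o.y); } // allocates a temp
}

public class TempVecDemo {
    // Allocation-heavy style: every step creates a short-lived Vec2.
    static float lengthSqViaTemps(float ax, float ay, float bx, float by) {
        Vec2 sum = new Vec2(ax, ay).add(new Vec2(bx, by)); // two temps, both dead immediately
        return sum.x * sum.x + sum.y * sum.y;
    }

    // Hand-inlined style: same math, zero allocations.
    static float lengthSqInlined(float ax, float ay, float bx, float by) {
        float sx = ax + bx, sy = ay + by;
        return sx * sx + sy * sy;
    }

    public static void main(String[] args) {
        System.out.println(lengthSqViaTemps(1f, 2f, 3f, 4f)); // 52.0
        System.out.println(lengthSqInlined(1f, 2f, 3f, 4f));  // 52.0
    }
}
```

It's the first style that's roughly 2:1 slower in my benchmarks, and exactly the style that article says the JIT should handle for free.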

What’s going on here? Was escape analysis cut from the JVM, or am I misinterpreting what it should be doing for me? I’m quite happy to inline all that stuff by hand if need be, or reuse a few static vectors, but that IBM article seemed to strongly advise against that type of stuff (the author all but implied that anyone that would consider it is a freaking idiot).

AFAIK, it’s not been activated by default yet

Apparently it’s in, but it looks like you need to add some VM arguments (and maybe it’s only for the server VM?):
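Something like this, I’d guess (flag name taken from the TS-3412 slides linked later in this thread - unverified, and your main class will obviously differ):

```shell
# Guess at the invocation: server VM plus the EA flag named in Sun's TS-3412 slides.
# Flag availability varies by JDK build - check your JDK's release notes first.
java -server -XX:+UseEscapeAnalysis YourMainClass
```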

Riven did some good benchmarks showing the problems:
http://www.java-gaming.org/forums/index.php?topic=14940.0
He tested on Java 6 (Mustang), but didn’t use the VM args given in the above link (I think…)

By the way, it’s great that you’re working on optimisations to JBox2D

That guy should be really more careful about using silly abbreviated variable names…

Cas :slight_smile:

Interesting (even if it’s not used to speed up object deallocation).
I recently asked in the comments of this blog entry whether the server compiler would provide enough information to recognise “cheap garbage”, and got the answer that this technique [escape analysis] won’t have any performance gain with the current memory model (read the comments).

It seems even Sun VM devs are not up to date with the -XX: flags :-\

It was mostly autogenerated code. :stuck_out_tongue:

Thanks, guys - going through some of those links, I came across http://developers.sun.com/learning/javaoneonline/2006/coreplatform/TS-3412.pdf, where I found the following quick summary:

[quote]Implementation Status

  • Java SE 6 has escape analysis and lock elision in the server compiler.
  • It is off by default, it can be enabled with the -XX:+UseEscapeAnalysis flag.
  • Java SE 7 will have further optimizations.
  • There are currently no plans to release a client compiler with escape analysis.
[/quote]

That last one is a shame - that means that as advanced and helpful as these optimizations are, they will be absolutely useless as far as game programming goes, right?

Well, unless you ship with the server VM, and then suffer horrible startup.

But… where does the tiered compiler fit in?

Cas :slight_smile:

That was a great set of slides, thanks ewjordan.

Cas: the slides have info about escape analysis. Also Clemens (LinuxHippy) posted this, apparently it’s in java 7: http://www.java-gaming.org/forums/index.php?topic=16653.0

Yeah, I know - but if they’re not adding EA to the client VM, yet they’re axing the separate client and server VMs, what’s actually going on in the roadmap? I mean, we all know the ideal situation would be one single VM with EA, but it’s not really clear if or when that’s going to happen.

Cas :slight_smile:

I’m starting to feel like a standard refrain is emerging when it comes to problems of any sort in Java, especially the ones that tend to plague game development:

  1. “There is no problem. Your microbenchmarks are just wrong”
  2. “Okay, there was a problem, but it’s already fixed.”
  3. “No, really, it’s fixed - and your new microbenchmarks suck, too.”
  4. “Alright, it’s not quite fixed, but it’s in the next JVM.”
  5. “On second thought, never mind. It’s not a problem again. Look, we’ve got a great new feature for processors with 64 cores, isn’t that better?”
  6. “Screw you guys, real Java programmers don’t care about this. Go use C# if you want to make games.”

Maybe I’m being a bit cynical and overly harsh (definitely am, sorry!), but looking at the relative performance increases between the Flash player and the JVM over the past few years, I’m starting to cross my fingers and hope that the Flash VM does in fact start supporting Java code as was rumored on another thread, if only because it’s a platform where the desktop consumer is considered worthy of optimizing the VM for. I’m getting ahead of myself, for sure, because the Flash VM is still quite a bit slower than the Java one (and it definitely doesn’t have escape analysis), though the gap is closing (when I get around to running them I’ll post some current test results in another thread).

Sorry to vent, I’m just getting frustrated from the constant choosing between maintainable and fast code, especially when this particular problem (small object overhead) has several different solutions and Sun seems reluctant to implement any of them. Maybe what they say is true, and I keep trying to write Java code to do things that are better done in C++ (math, physics, and finance), but I’m really hoping that’s not the case, because Java is a lot more fun to code and a lot easier to deploy!

Does anyone know much about the internals of EA and why it might be more difficult or expensive to implement on the client VM?

Judging from Bug Parade you’re not too far off the mark; then again, James Gosling said

http://www.parleys.com/display/PARLEYS/The+Closures+Controversy
10. Don’t fix it until it chafes
“Just say no until threatened with bodily harm”

OK, later he said this:
http://blogs.sun.com/jag/entry/closures :wink:

Ken Russell might be able to give us some pointers. By the looks of the slides you posted, though, it seems like the problem is with mixing VM enhancements: with method inlining, it’s harder to figure out whether a variable escapes or not. At least that’s what I thought the slides said…

I thought it’d be easier… after you’ve performed all your inlining to whatever depth, you’d then perform escape analysis on the resulting code. If any methods have already had EA performed on them before inlining, so much the better because then that code doesn’t need analysing.
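Roughly what I mean, with a made-up Vec2-style example (not real HotSpot output, just the idea):

```java
final class Vec2 {
    final float x, y;
    Vec2(float x, float y) { this.x = x; this.y = y; }
}

public class InlineThenEA {
    // Viewed in isolation, the Vec2 escapes this method via the return value,
    // so EA on midpoint() alone can't remove the allocation.
    static Vec2 midpoint(Vec2 a, Vec2 b) {
        return new Vec2((a.x + b.x) * 0.5f, (a.y + b.y) * 0.5f);
    }

    // But once midpoint() is inlined here, the Vec2 never leaves midX():
    // EA run on the inlined code can see that, and could in principle
    // reduce the whole thing to plain float arithmetic.
    static float midX(Vec2 a, Vec2 b) {
        Vec2 m = midpoint(a, b); // after inlining: a non-escaping allocation
        return m.x;
    }

    public static void main(String[] args) {
        System.out.println(midX(new Vec2(0f, 0f), new Vec2(4f, 2f))); // 2.0
    }
}
```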

Cas :slight_smile:

Hi again,

I’ll try to answer some questions to the best of my knowledge … but if I am wrong, I am wrong :wink:

The idea of the client compiler is to do cheap optimizations so that code can be compiled fast - it wouldn’t make much sense to add an expensive (and optimistic) optimization like EA. The server compiler is where it belongs, and with the tiered compilers you should already be able to benefit from it in the JDK7-ea builds.

EA != stack allocation. EA is a step that gathers information, which can then be used to do stack allocation and several other optimizations.
So EA is implemented, along with some simpler optimizations that use the information it gathers, but stack allocation itself isn’t done yet.

Currently there seems to be some work on using EA for scalar replacement, something which should help relieve some of the memory pressure 64-bit systems suffer from (also a win for 32-bit systems) :slight_smile:
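To illustrate what scalar replacement means (my own sketch, not HotSpot internals): the JIT can in principle rewrite the first method below into something like the second, so the object never exists at all:

```java
final class Point {
    int x, y;
    Point(int x, int y) { this.x = x; this.y = y; }
}

public class ScalarReplacementSketch {
    // What the source says: allocate a Point, read its fields, throw it away.
    static int manhattan(int x, int y) {
        Point p = new Point(x, y); // never escapes this method
        return Math.abs(p.x) + Math.abs(p.y);
    }

    // What scalar replacement would (conceptually) turn it into:
    // the fields become plain locals - no allocation, no GC work.
    static int manhattanScalarized(int x, int y) {
        int px = x, py = y;
        return Math.abs(px) + Math.abs(py);
    }

    public static void main(String[] args) {
        System.out.println(manhattan(-3, 4));           // 7
        System.out.println(manhattanScalarized(-3, 4)); // 7
    }
}
```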

lg Clemens

They can be sceptical all they like - it’s already implemented in Excelsior JET and it works.

Cas :slight_smile:

They didn’t say it wouldn’t work. But they are sceptical about whether the “cheap to collect object detection” would provide any gain in overall throughput compared to the compacting technique.

The only advantage I see is the decreased pause time (even if it may run slower or at equal speed), but it may result in longer compile times too (-> pauses).

IMO the main reason there is an attempt to implement EA in the server compiler is concurrency. A lot of tricks become possible if you can detect thread-private resources (e.g. allocation directly into registers, removal of locks…).

But don’t get me wrong, I am also very curious how it would perform.

Well - I seem to recall the JET guys got EA working and stack allocation in linear time, meaning its impact on actual compilation performance should be minimal. I’m rather hoping that EA and stack allocation make it possible to start using the enhanced for loop and so on without creating tons of unnecessary garbage.
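For reference, the garbage I mean in the enhanced for loop is the hidden Iterator the compiler inserts - it never escapes the method, so it’s exactly the kind of thing EA ought to be able to kill:

```java
import java.util.Arrays;
import java.util.List;

public class ForEachGarbage {
    // The enhanced for loop desugars to xs.iterator() / hasNext() / next(),
    // so every call allocates an Iterator (plus unboxing of the Integers).
    // The Iterator never escapes sum(), making it a prime EA candidate.
    static int sum(List<Integer> xs) {
        int total = 0;
        for (int x : xs) {
            total += x;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(Arrays.asList(1, 2, 3))); // 6
    }
}
```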

Cas :slight_smile:

Well, I don’t know how seriously I should take this post.
Have you ever benchmarked Flash 9’s VM?
Sure, there have been GREAT improvements compared to older versions of Flash - but only because older versions of Flash did not have a JIT at all.
So in fact Flash 9 adds to Flash what Java has had since about … well, I guess it was 1.1.7, when Sun shipped it with the Symantec just-in-time compiler.

If you had seriously benchmarked the Flash VM, you would see that it cannot even compete with the client VM, let alone the server VM.
The client JVM does not do too many fancy optimizations, but it has undergone major tuning over the past years - and it also benefits from improvements made to the rest of the runtime. And the server compiler generates better code than the .NET JIT, without any question.

I don’t know whether the tiered compilers will be the default anytime soon, but I think this direction is right.
A focus could be making the client compiler compile even faster (at the expense of the quality of the generated code), because the server compiler will be there anyway to optimize the really hard stuff.

lg Clemens

Sure, the question just seems to be who will implement it in Java.
I see ongoing bashing of Sun for not implementing stack allocation - but on the other hand, nobody else is taking on the work now that Java is open source. It seems it’s just not important enough :wink:

By the way, the team that implemented the initial EA and stack allocation is located in Linz, Austria (40 km from my home) … it seems they already implemented a prototype: http://www.usenix.org/events/vee05/full_papers/p111-kotzmann.pdf

lg Clemens