Will escape-analysis-based optimizations be integrated into Mustang?

Since you can count collection time as part of construction time, this is quite a valid comparison.
Imagine operations like creating some kind of immutable object in a loop, Integers for example.
Very cheap to construct, and there's almost no way around it if the API forces you to go this way - but the masses of small objects add up.
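
For example (just a hypothetical sketch to illustrate, not code from any real project), something as harmless-looking as this creates a new Integer on nearly every iteration once the values leave the small cached range:

java.util.List<Integer> values = new java.util.ArrayList<Integer>();
for (int i = 0; i < 1000000; i++)
{
   values.add(i);   // autoboxing calls Integer.valueOf(i) - a fresh Integer for almost every value
}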

lg Clemens

Pooling of short-lived, easily created objects has been a net loss in HotSpot for quite a while now.

Let the eden-space do it for you. It does it better.

This is probably a very newbie question, but just how long is it before my short-lived object makes it out of eden space? And just how long is 'short'? Should I stop using object pools for things like particles, where say 20-40 might be made each frame and last say 200-1500 ms?

I'm going to have to disagree with you there, Jeff…

We were loading a Doom3 level in our game engine and decided to go the route of adding events to signal the addition of objects to a render bin. We noticed a drop of 10fps, from 40 (without events) on good hardware down to 30. Pooling the objects and using some clever tricks brought the fps back up to 40…

DP

i don’t care. it sounds cool, and if i imagine it working, it feels cool, so i like it. :smiley:

Assuming I have a basic understanding of this from watching the cool jconsole tools and stuff… it works approximately like this:

Your object will sit in the eden space until the eden space is full. Then only objects that are still alive are copied out of the eden space into the survivor space, and the free-memory pointer is reset to the beginning of the eden space. Big objects that don't fit in the eden space are allocated directly in the older generation. The survivor space may not actually exist as a separate area (i.e. it could be part of the eden space to begin with), but conceptually it is simply there to delay promotion into the older generation for objects that were allocated only shortly before eden filled.

When an object survives N collections of the eden space it is promoted. Therefore the older generation tends to fill much more slowly; possibly it never fills and never needs collecting, because only a few objects make it there and they are the ones that are alive forever as far as your application is concerned (most likely a few objects 'die' in the older generation, but a collection is never needed, so they just sit there until your program ends).
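
If you want to actually watch this happening, something along these lines works (a sketch - these are standard HotSpot options of that era, "MyGame.jar" is made up, and you should check the exact flags against your VM version):

# hypothetical jar name; prints every collection and caps how many young collections an object survives
java -verbose:gc -XX:+PrintGCDetails -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=15 -jar MyGame.jar

-XX:SurvivorRatio sets the size of eden relative to each survivor space, and -XX:MaxTenuringThreshold is roughly the N above - the number of young collections an object may survive before it is promoted.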

That’s actually a gross oversimplification… there are all sorts of other ratios and rules to tune the GC and various things depend on the GC algorithm you choose to use, and I certainly don’t know all the details.

The KEY is to use tools to view the memory profile and TUNE things to fit the characteristics of your application. For example, the sizes of the young generation and the old generation will affect collection times and how quickly objects are promoted; you have to strike the best balance you can. The latest GC stuff in HotSpot will actually self-tune to a degree… you just have to tell it what your target collection times are and it can adjust various ratios on its own in an attempt to meet that target. This may mean that collections occur much more frequently, but they take far less time on each run… e.g. if you collect every other frame but the collection only takes 1ms, your game won't likely be affected… but if you collect every 200 frames and the collection takes 100ms… you just got an ugly bump in your frame rate.
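
For what it's worth, the "tell it your target collection time" bit boils down to flags along these lines (again a sketch with made-up numbers - the exact behaviour depends on which collector you pick):

# hypothetical numbers: young-generation size, plus pause-time and throughput goals for the parallel collector
java -Xmn64m -XX:+UseParallelGC -XX:MaxGCPauseMillis=5 -XX:GCTimeRatio=99 -jar MyGame.jar

MaxGCPauseMillis is only a goal, not a guarantee - the collector resizes the generations trying to meet it, which is exactly the "more frequent but shorter collections" trade-off described above.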

I have been trying a few different approaches to keep garbage collection from making the fps of my engine unsteady:

  • Limit the amount of created garbage: this can be thought of as a good solution, but in fact it makes me design my API in an unnatural way. I think it can be kept in mind while writing the code, but it should not be something that drives design choices.
  • Use object pools: at first I thought this was a good solution, but in the end I was wrong and removed them;
    . first, object pools degrade the quality of the API of my engine (it is very important to me to keep things simple and readable),
    . object pools can be slow, since they need some sort of explicit garbage collection such as reference counting (for example, with reference counting, a render frame consisting of 3000 render commands, each holding 1 render state which holds 6 matrices, means decrementing the references on 1 + 3000 + 3000 + 18000 = 24001 reference-counted objects when you collect the render frame! - see the sketch after this list),
    . object pools have a very bad impact on the memory requirements of the application (your pools need to be typed, so you end up with lots of object pools which do not share their free memory),
    . object pools increase the work needed to maintain your code, since they are a source of memory leaks.
  • Tune the garbage collector: it just worked. I get an average of 1 ms or less per frame for garbage collection, which I consider very low (memory management has to have a cost) and far lower than what the object pools were costing me.
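
To show what I mean by the reference-counting overhead, here is the sketch mentioned above (a deliberately minimal, made-up pool, not my real engine code):

class Matrix { int refCount; float[] m = new float[16]; }

class MatrixPool
{
   private final java.util.ArrayList<Matrix> free = new java.util.ArrayList<Matrix>();

   Matrix acquire()
   {
      Matrix mat = free.isEmpty() ? new Matrix() : free.remove(free.size() - 1);
      mat.refCount = 1;
      return mat;
   }

   void release(Matrix mat)
   {
      if (--mat.refCount == 0)
         free.add(mat);   // recycled by hand instead of becoming garbage
   }
}

// Collecting a frame cascades through the whole graph: the frame releases its
// 3000 commands, each command its render state, each state its 6 matrices -
// 24001 release() calls - and every type (Matrix, RenderState, ...) needs its
// own pool with its own free list.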

Object pooling can still be an interesting option for objects whose cost of creation or destruction is high.
I'm still wondering if this is the case with direct buffers (I have some tests where they seem to be the cause of very long 'other full gc' pauses, but I have not finished profiling this).
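
If direct buffers do turn out to be that expensive, that is the one place I would still consider a pool - something along these lines (just a sketch; DirectBufferPool is a made-up name and the capacity handling is deliberately naive):

import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;

// Hypothetical: reuse direct ByteBuffers of one fixed size, since allocateDirect()
// is costly and the native memory is only freed when the GC eventually collects the buffer.
class DirectBufferPool
{
   private final int capacity;
   private final ArrayList<ByteBuffer> free = new ArrayList<ByteBuffer>();

   DirectBufferPool(int capacity) { this.capacity = capacity; }

   ByteBuffer acquire()
   {
      if (free.isEmpty())
         return ByteBuffer.allocateDirect(capacity).order(ByteOrder.nativeOrder());
      ByteBuffer buf = free.remove(free.size() - 1);
      buf.clear();
      return buf;
   }

   void release(ByteBuffer buf)
   {
      free.add(buf);
   }
}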

There is a good online chapter about GC tuning from Killer Game Programming in Java, referenced somewhere in this forum.

To return to the main subject, I can see one big benefit of escape analysis: the new iteration construct (for-each) in Java 5.
I used it everywhere, but it was creating lots of iterators, so I moved back to simple indexed ArrayList iteration.
It is not satisfying, since it constrains my API to explicitly return ArrayList instead of List.
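
To make the trade-off concrete (a trivial sketch - renderables and Renderable are made-up names): the Java 5 for-each form allocates an Iterator per traversal, while the indexed form allocates nothing, which is exactly the kind of garbage escape analysis could make free.

// for-each: clean, works on any List, but each traversal creates an Iterator object
for (Renderable r : renderables)
{
   r.render();
}

// indexed: no allocation per traversal, but it pushes the API towards exposing ArrayList
for (int i = 0; i < renderables.size(); i++)
{
   renderables.get(i).render();
}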

Should work just fine for the old iteration constructs too.

Cas :slight_smile:

if you need the best performance, do it like this


// count down to zero; lengthOfList and doStuff() are placeholders
for (int i = lengthOfList; --i >= 0;)
{
   doStuff(i);
}

comparisons against 0 are fast because only one value has to be loaded into a register (instead of both the loop variable and the list size), AND you've already got it there because of the "--".


int i = 0;
try {
   while (true)
   {
      doStuff(i++);   // assumes doStuff() indexes into the list
   }
} catch (IndexOutOfBoundsException e) {
   // fell off the end of the list - the loop is done
}

may be faster for very long lists, or on IBM's VM, which is a lot faster regarding exceptions.

however, in 97% of all cases, the performance doesn’t matter that much (0.5 seconds or 0.52 seconds after a click on a button, who cares)