Lowering garbage creation

I’m developing a 3D rendering engine. I have started tracking the amount of garbage allocated per frame in order to make the garbage collector (the concurrent one) more predictable. It is giving good results: I can now get a nearly constant frame rate.

Anyway, in the course of this work it appeared that JOGL creates quite a lot of garbage for each frame. LWJGL, in contrast, generates far less (around 600 bytes less per frame).

I have attached a cleaned-up output of the HotSpot option “-Xrunhprof:heap=sites,depth=20” to show the sites which are creating garbage.

Has anyone worked on this?
Is there a way to limit this behavior?
Does the JSR address this problem?

Thanks in advance for your help

          Vincent

A few of these are gone in the JSR-231 branch, specifically the GLContextInitActionPair. However most are still present and are a consequence of JOGL being implemented mostly in Java. The direct byte buffers, JAWT-related Java objects, StructAccessors, etc. are all used and necessary in the JOGL implementation. I don’t anticipate that we will change this because any amount of cleverness in caching these objects is bound to lead to bugs. The amount of garbage is small and is very short-lived, and is cleaned up by the generational scavenger in HotSpot in a fraction of a millisecond. Have you proven that the allocation of these objects is causing problems for your application?

One allocation site is somewhat worrying: the allocation of a WeakReference in java.awt.EventQueue.setCurrentEventAndMostRecentTimeImpl. What version of the JDK are you using?

I use JDK version 5.04 with the options: “-ea -Dsun.java2d.noddraw=true -Xrunhprof:heap=sites,depth=20” (the concurrent garbage collector cannot be used with runhprof).

I don’t think that these objects have heavy consequences on the steadiness of the overall frame rate. This is very difficult to say (as long as the amount of garbage generated stays below the amount collected concurrently, you don’t get any visible consequence).
I do agree that caching objects is a good way to introduce bugs.

On the other hand, the more bytes are allocated by the core engine, the fewer are available to the client application. Therefore I’m trying to reduce the amount of garbage created in the core rendering loop.

            Vincent

One point that should be made is that the concurrent collector operates upon the old generation. Objects that quickly become unreachable are never promoted from the young generation to the old generation during young generation scavenging, so they have no effect on the performance of the concurrent collector.

I understood that the algorithm for the concurrent collector is the following:

1 - Allocate objects in the young generation
2 - If available memory in the young generation < threshold
2.1 - Collect dead objects in the young generation
2.2 - Increment age of still alive objects of the young generation
2.3 - Move old (age > threshold) alive objects from the young to the tenured generation
3 - If available memory in the tenured generation < threshold start a full gc which will be performed concurrently and will cause 2 pauses, one at the start and one at the end of the gc.
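
To make steps 1 to 2.3 concrete, here is a toy model of them in Java. It is only a sketch of the description above, not how HotSpot is implemented: the class `ToyGenerationalHeap`, its fields, and the `alive` flag standing in for reachability are all hypothetical, and step 3 (the concurrent collection of the tenured generation) is not modelled.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Toy model only: HotSpot's real heap layout and promotion policy are more involved.
final class ToyGenerationalHeap {
    static final class Obj {
        int age;               // number of young collections survived so far
        boolean alive = true;  // stand-in for "still reachable"
    }

    private final int youngCapacity;      // hypothetical young generation size, in objects
    private final int tenuringThreshold;  // hypothetical age limit before promotion
    private final List<Obj> young = new ArrayList<>();
    private final List<Obj> tenured = new ArrayList<>();
    private int youngCollections;

    ToyGenerationalHeap(int youngCapacity, int tenuringThreshold) {
        this.youngCapacity = youngCapacity;
        this.tenuringThreshold = tenuringThreshold;
    }

    // Step 1: allocate in the young generation. Step 2: collect first if it is full.
    Obj allocate() {
        if (young.size() >= youngCapacity) {
            collectYoung();
        }
        Obj o = new Obj();
        young.add(o);
        return o;
    }

    private void collectYoung() {
        youngCollections++;
        for (Iterator<Obj> it = young.iterator(); it.hasNext();) {
            Obj o = it.next();
            if (!o.alive) {
                it.remove();                      // 2.1: drop dead objects
            } else if (++o.age > tenuringThreshold) {
                it.remove();                      // 2.3: promote old survivors
                tenured.add(o);
            }                                     // 2.2: otherwise the survivor just got older
        }
    }

    int tenuredCount() { return tenured.size(); }
    int youngCollections() { return youngCollections; }

    public static void main(String[] args) {
        ToyGenerationalHeap heap = new ToyGenerationalHeap(4, 2);
        Obj longLived = heap.allocate();          // stays alive across collections
        for (int i = 0; i < 100; i++) {
            heap.allocate().alive = false;        // per-frame garbage dies immediately
        }
        System.out.println("young collections: " + heap.youngCollections());
        System.out.println("promoted objects:  " + heap.tenuredCount());
    }
}
```

In this model only the object that stays alive is ever promoted; the short-lived garbage makes young collections more frequent but never reaches the tenured list.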

Please let me know if this is correct, since I based part of my optimization process on this behavior and the conclusion it leads to: generating lots of garbage, even short-lived, has two negative consequences:

  • small collections will be more frequent, leading to a small performance penalty,
  • objects age more quickly and are more likely to be moved to the tenured generation, increasing the frequency of full GCs.

It turned out that the occurrence of a full GC was not acceptable for my needs, since it breaks the steadiness of the animation (the frame rate drops for a few frames).
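
Both effects can be watched from inside the application with the standard `java.lang.management` beans (available since JDK 5). The sketch below is only one possible way to do it: the class `GcWatch`, the frame loop, and the allocation sizes are made up for illustration, and the collector names printed depend on which collector the VM was started with.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcWatch {
    // Total number of collections so far, summed over all collectors of this VM.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();   // -1 means "undefined" for this collector
            if (count > 0) total += count;
        }
        return total;
    }

    // Approximate accumulated collection time in milliseconds, summed over all collectors.
    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();   // -1 means "undefined" for this collector
            if (millis > 0) total += millis;
        }
        return total;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println("collector: " + gc.getName());
        }
        long sink = 0;
        long seen = totalCollections();
        for (int frame = 0; frame < 1000; frame++) {
            byte[] garbage = new byte[64 * 1024];   // stand-in for per-frame garbage
            sink += garbage.length;
            long now = totalCollections();
            if (now != seen) {                      // at least one collection ran during this frame
                System.out.println("frame " + frame + ": " + (now - seen) + " collection(s)");
                seen = now;
            }
        }
        System.out.println("allocated " + sink + " bytes, GC time so far: "
                + totalCollectionMillis() + " ms");
    }
}
```

Comparing the frames reported here with the frames where the animation stutters shows whether the drops line up with collections at all.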

From that, two options may be chosen:

  • either I implement my own full memory management system (with a realtime VM or with a Java system like javolution, object pools, …),
  • or I try to lower the occurrence of full GCs, ideally removing any occurrence from my main game phase (i.e. I trigger a full GC during less interactive phases like level loading, … and I lower garbage creation in the interactive part).

I’m investigating the second option, so any hint for reducing garbage collection is always welcome.
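
For reference, the object pool idea mentioned in the first option could look roughly like the sketch below. `Vec3Pool` and `Vec3` are hypothetical names, not part of JOGL, LWJGL or javolution, and the pool is not thread-safe.

```java
import java.util.ArrayDeque;

// Minimal, single-threaded object pool sketch for a small vector type.
final class Vec3Pool {
    static final class Vec3 {
        float x, y, z;
    }

    private final ArrayDeque<Vec3> free = new ArrayDeque<>();
    private int created;   // how many Vec3 instances were ever allocated

    // Preallocate during a non-interactive phase, e.g. level loading.
    Vec3Pool(int preallocate) {
        for (int i = 0; i < preallocate; i++) {
            free.push(new Vec3());
        }
        created = preallocate;
    }

    // Reuse a free instance if there is one; allocate only when the pool is empty.
    Vec3 acquire() {
        Vec3 v = free.poll();
        if (v == null) {
            v = new Vec3();
            created++;
        }
        return v;
    }

    // The caller must not touch v after releasing it.
    void release(Vec3 v) {
        v.x = 0f;
        v.y = 0f;
        v.z = 0f;
        free.push(v);
    }

    int created() { return created; }

    public static void main(String[] args) {
        Vec3Pool pool = new Vec3Pool(16);
        for (int frame = 0; frame < 1000; frame++) {
            Vec3 tmp = pool.acquire();
            tmp.x = frame;                 // use the temporary during the frame
            pool.release(tmp);             // hand it back instead of dropping it
        }
        System.out.println("instances created: " + pool.created());
    }
}
```

The rule that a released object must never be used again is exactly the kind of hidden contract that makes caching a source of bugs, so a pool like this is probably only worth it for a few hot temporary types.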

Thanks

           Vincent

Um, if you’ve already got LWJGL working for your project as well, and that’s creating much less garbage, why not just stick with that and ditch Jogl?

That’s basically correct to the best of my understanding. Note that the survivor spaces (which hold medium-age objects before they are promoted to the old generation) are intended to allow more objects to become unreachable, so that only really long-lived objects are promoted to the old generation.

I don’t understand all the heuristics of promotion from the young to the old generation, but they are more involved than a simple age threshold. As far as I remember, statistics are computed to figure out the average age of objects in the survivor spaces, and some percentage of the oldest objects over a threshold are promoted.

These assumptions are basically correct. Note that young-generation scavenges should typically be unnoticeable. The aging and promotion process again is more involved and you can actually adjust things like the SurvivorRatio to attempt to decrease the frequency of old generation collections. JDK 5 has automatic tuning which attempts to do this for you.
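
To see how a change such as a different SurvivorRatio affects the generations discussed here, the heap memory pools can be listed from inside the VM with `java.lang.management`. This is only a sketch; the class name `HeapPools` is made up, and the pool names it prints (eden, survivor, old/tenured, or something else) depend on the VM version and the collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

final class HeapPools {
    // One line per heap memory pool with its current usage.
    static List<String> describeHeapPools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue;                         // skip non-heap pools such as the code cache
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue;                         // the pool is no longer valid
            }
            lines.add(pool.getName() + ": used=" + usage.getUsed()
                    + " bytes, committed=" + usage.getCommitted() + " bytes");
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describeHeapPools()) {
            System.out.println(line);
        }
    }
}
```

Printing this every few seconds during the interactive phase gives a rough picture of how fast the old generation fills up between collections.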

Reducing excessive garbage production is a good idea, but I think it’s important not to over-optimize too early. Have you tried running any of the Java profilers available in order to see which kinds of objects you’re allocating lots of instances of?