How much 'help' does the gc normally need?

cep21 · July 6, 2004, 6:39pm

When I’m done with a variable, I set it to null to signal that the gc can collect it (that is the correct thing to do, right?). What if the variable is an array of objects? Does it help to set each element to null before I set the array itself to null, or is that just wasted time?

princec · July 6, 2004, 8:04pm

Wasted time. The only time you might want to null something is if you’re dealing with really big objects like images, and need to replace the only reference to the image with another one - allowing the first image to be garbage collected to make way for the new one. (I discovered this fact here on JGO :))

Cas

cep21 · July 7, 2004, 3:18pm

“The only time you might want to null something is if …”

So never mark anything null unless it’s huge? Even if I won’t use it again?

mhale · July 7, 2004, 5:26pm

You will probably find the two articles below very useful. The first one has a section about explicit nulling, and states “But in most cases, it doesn’t help the garbage collector at all, and in some cases, it can actually hurt your program’s performance.”

Garbage collection and performance
http://www-106.ibm.com/developerworks/java/library/j-jtp01274.html

Garbage collection in the 1.4.1 JVM
http://www-106.ibm.com/developerworks/java/library/j-jtp11253/

princec · July 7, 2004, 6:14pm

A quick example of what I meant:

int[] a = new int[10000000];
a = new int[10000000];

At one point there are two 40MB arrays in memory - the first one couldn’t be collected because the new one was constructed while ‘a’ still held a reference to it. This kind of thing seems to occur with images more than anything else as they tend to be large.

Cas

20thCenturyBoy · July 8, 2004, 6:25am

Cas, I’m probably being thick, but where do you do the nulling in your example? And how does it help the gc, when you can’t guarantee when it will kick in ???

20thCB

crystalsquid · July 8, 2004, 6:37am

The example was without nulling.
The second array is created, but the reference count to the first array would not (in theory) be decremented until the ‘a =’ is evaluated. The sequence internally would be like this:

alloc 40MB (array A)
refcount A += 1
a = array A

alloc 40MB (array B)   <- Eek, 80MB alloc’d at the moment
refcount A -= 1        <- A is now free to be GC’d
refcount B += 1
a = array B

Adding the line ‘a = null;’ in between will decrement the refcount of the first array before the second alloc, so that when you new the second array, the GC can say ‘I don’t have the spare memory, let’s garbage collect’, and lo & behold there is 40MB sitting there that it can reclaim.
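As a rough sketch, the variant with the extra line might look like this in Java. The class and method names are made up for illustration, and the array length is passed in as a parameter (an assumption, so that a small value can be used when trying it out):

```java
class NullBeforeRealloc {
    // Hypothetical helper: allocates an array, drops it, then allocates a replacement.
    static int[] reallocate(int length) {
        int[] a = new int[length];
        a[0] = 1;              // touch the first array so it is actually used

        a = null;              // nothing refers to the first array any more
        a = new int[length];   // if memory is tight, the first array may be reclaimed here

        return a;              // this is the second, freshly zeroed array
    }

    public static void main(String[] args) {
        int[] result = reallocate(10_000_000); // roughly 40MB per array, as in Cas's example
        System.out.println(result.length);
    }
}
```

Whether a given VM really collects the first array at that point is up to the VM; the code only makes it possible.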

As to whether modern VMs would actually do this (though I would take Cas’s word that he noticed an improvement), or even whether you would notice anything with objects smaller than a few MB, that is another question.

swpalmer · July 11, 2004, 12:20am

[quote]The second array is created, but the reference count to the first array would not (in theory) be decremented until …
[/quote]
Java doesn’t use reference counting - just thought I would point that out in case some newbies get confused by your explanation.

The same point applies though – the first array is still reachable* when the second array is being created.

*Java decides what is “garbage” based on what is reachable, not reference counts. That’s part of what makes it “safe”. It is impossible to have a reference that points to garbage, because having such a reference is what makes the object not garbage in the first place.
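A small sketch of the reachability rule, using a WeakReference to observe an object without keeping it alive. The class and method names are hypothetical, and System.gc() is only a request, so the sketch checks nothing about when an unreachable object is actually collected:

```java
import java.lang.ref.WeakReference;

class ReachabilityDemo {
    // Returns true if the object is still there after a GC request
    // while a strong reference to it is held.
    static boolean survivesWhileReachable() {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);

        System.gc(); // a hint only; the VM may or may not run a collection

        // 'strong' is still in use here, so the object is reachable and
        // the weak reference cannot have been cleared.
        return weak.get() == strong;
    }

    public static void main(String[] args) {
        System.out.println(survivesWhileReachable());
    }
}
```

Once the last strong reference is gone, the weak reference may be cleared at some later collection, but there is no guarantee about when.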

Technically, for local references the VM could probably optimize the array example above so that it released the first array, knowing in advance that it would become unreachable after executing the next line. However, this is complicated by the fact that the allocation on the next line could still fail for some reason; say an OutOfMemoryError is still thrown because the second array is a bit bigger. The VM has to be sure that the first array wasn’t collected while trying to get memory for the second allocation, because it would not be able to undo the optimization, and the first array might remain reachable depending on how the exception was handled.
So when dealing with very large allocations in a similar context, it is probably best to do the explicit nulling.
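To illustrate the failure case, here is a sketch in which the second allocation can throw and the handler carries on with the first array. The names are hypothetical. It assumes that asking for Integer.MAX_VALUE ints fails with an OutOfMemoryError, which is what HotSpot does; other VMs may behave differently:

```java
class KeepOnFailure {
    // Hypothetical helper: tries to replace a small array with a bigger one.
    static int[] growOrKeep(int oldLength, int newLength) {
        int[] a = new int[oldLength];
        try {
            // 'a' still refers to the first array while this allocation runs.
            a = new int[newLength];
        } catch (OutOfMemoryError e) {
            // The assignment never happened, so 'a' is still the first array.
            // Had the VM freed it early, there would be nothing left to use.
        }
        return a;
    }

    public static void main(String[] args) {
        System.out.println(growOrKeep(1000, 2000).length);
        System.out.println(growOrKeep(1000, Integer.MAX_VALUE).length);
    }
}
```

With an explicit ‘a = null;’ before the second allocation, the program itself has given up the first array, so the VM is free to reclaim it.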