Problem with OutofMemoryError

Well, I was merrily fine-tuning the VM parameters to see if I could extract any more juice out of the VM, when I kept bumping into an OOM error. Would appreciate it if someone could figure out what/where the problem is: hope to Heavens it is somewhere in my code or logic, and not in the VM (kinda unlikely, right?) and definitely not in Java3D (since support for it is a little dry now).

The issue seems to be simply that if I run the GC manually before I create and display an object, then I don’t get an OOM error; if I don’t invoke the GC, however, then an OOM error does get triggered. Is that possible at all? Some more details below:

After I launch my app. and create/delete an object of the same class and run the gc repeatedly, the base level memory for just the app. is about 22M, my Xmx and Xms memory being 128M. Next, I successively create 3 objects and the “used” memory levels are 56M, 91M, and 118M. With the used memory at 118M, I delete 2 of the objects (actually one of them should do, I think). The used memory level still remains at 118M or so. Now, with just one object remaining on the canvas, if I try to create a new one then an OOM error happens. If, before creating the object, I run the gc a few times, then the used memory falls to ~85M, which is more than sufficient for me to create that extra object and, expectedly, the creation subsequently succeeds without an OOM error.

I’m using the MemoryMonitor.java code that comes along with the java2d demo bundled with the Java sdk. The used memory, IIRC, is computed as runtime.totalMemory() - runtime.freeMemory(), the former being synonymous with the Xmx size here, I assume, since my Xms equals my Xmx. I should also add that there is a memory leak in Java3D for the object parameters that I’m using, but that appears to have been taken care of with my own workaround. I say this because after each object creation and deletion, running the gc gets the memory level back to expected base values.
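For reference, the used-memory figure can be reproduced with a couple of lines against the Runtime API (a minimal sketch; note that totalMemory() is the currently committed heap, so it only coincides with the Xmx size when Xms and Xmx are set equal, as they are here):

```java
// Sketch: the "used memory" number as the java2d MemoryMonitor demo computes it.
public class UsedMemory {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // committed heap minus free heap = currently used heap
        long usedBytes = rt.totalMemory() - rt.freeMemory();
        System.out.println("used MB: " + (usedBytes / (1024 * 1024)));
    }
}
```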

TIA

Edit: Forgot to mention that my j2sdk is 1.4.2_03, and Xmx and Xms are the only VM parameters I’m using. The benchmark machine has 256M of RAM, and the virtual memory is handled by Windows.

1.4.2 has a known direct memory size limitation (changed from previous VMs) which can be overridden with the -XX:MaxDirectMemorySize flag. See this bug report:

http://developer.java.sun.com/developer/bugParade/bugs/4879883.html

We have also discovered a race condition in the direct memory Cleaner object that can falsely cause an OOM - that appears to have been fixed in 1.4.2_02 so it should not be affecting you.

Many thanks for the info about Bug 4879883. Looks like it might be worthwhile to make a habit of visiting the Bug database :).

The status of that bug appears to be “closed”, and it looks like the fix probably made it into 1.4.2_03, which is what I’m using. I tried Tiger too, and got a Heap Space error. Actually, my code doesn’t have any NIO-related stuff per se, but chances are that the Java3D API I’m linking to might!

On an absolutely theoretical note: is it possible that the VM may throw an OutOfMemoryError when there is garbage in a “floating” state, i.e., objects that are live-not-referenced-but-not-yet-claimed?

Still baffled.

Supposedly not, since the VM would have tried to do a full gc prior to throwing OOM. Have you tried running jvmstat to watch your heap prior to the OOM? The visualgc tool within jvmstat can give some nice info on how often the gc’s run and how effective they are.

http://developers.sun.com/dev/coolstuff/jvmstat/

I don’t think J3D uses NIO, since it also works on java 1.3. Doesn’t simply raising the max heap size help, or will it eventually throw an OOM anyway? Maybe you’re also being plagued by the same J3D object leak that’s plaguing Flying Guns.

[quote]On an absolutely theoretical note: is it possible that the VM may throw an OutOfMemoryError when there is garbage in a “floating” state, i.e., objects that are live-not-referenced-but-not-yet-claimed?
[/quote]
Do you have some link or something at hand about this? It seems like a possible explanation, but what’s a live object that’s not referenced? ??? I would guess the GC should take care of them. :-/

[quote]On an absolutely theoretical note: is it possible that the VM may throw an OutOfMemoryError when there is garbage in a “floating” state, i.e., objects that are live-not-referenced-but-not-yet-claimed?

Still baffled.
[/quote]
I’ll weaken what was said above slightly and say that it “shouldn’t”. The promise is that the VM makes a “best attempt” to free memory before going OOM. What should *not* be happening is that you get a different result if you call gc(). If it can find the memory in a call to gc(), it should be able to find it when doing an OOM panic gc(). Sounds like a bug to me :frowning:

On a different note, I’m sure Sun’s last release of J3D uses NIO and native ByteBuffers. It may be that they did something clever and actually check for availability and fall back to a non-NIO path if NIO is not available.

Need something to keep my reflexes sharp after 2 all-nighters in a row, a gazillion cups of coffee, etc.

Not that it is related to my problem at hand, but how much Xmx space would one need to run the following snippet without an OOM error?


            float[] array = new float[8 * 1000000]; // 8,000,000 floats * 4 bytes = ~32 MB
            array = new float[8 * 1000000];         // the first array is still referenced here

  • re-allocation without de-allocating an array :stuck_out_tongue:

TestCase below assigns instances of classes One and Two to a reference of class Object successively. Running the testcase with:

javac TestCase.java
java -Xms64m -Xmx64m TestCase

and without nullifying the reference before the second assignment produces an OOM error.

Seriously, is this expected of the gc?


class One {
    float[] array;
    One() {
        array = new float[8 * 1000000];
    }
}

class Two {
    float[] array;
    Two() {
        array = new float[8 * 1000000];
    }
}

public class TestCase {
    static void testIt() {
        Object obj  = null;
        obj = new One();
        // uncommenting this *prevents* an OOM error
        // obj = null; 
        obj = new Two();
    }

    public static void main( String args[] ) {
        testIt();
    }
}

[quote]Seriously, is this expected of the gc?
[/quote]
It’s what I would expect.

When you allocate an array, the VM allocates the memory and puts a reference to that array on the operand stack - look up the documentation for the anewarray opcode. (*)

When you assign the result of the allocation to a variable, the compiler uses an instruction called astore, which stores that reference into a local variable slot. That local variable slot holds the reference until you replace it with something else or the function terminates. The garbage collector cannot collect anything that is referred to by anything on the operand stack or in local variable slots; otherwise it would be collecting objects prematurely. So, when you do your second allocation, the anewarray fails because you still have the first array referred to by a local variable slot. How do you fix this? Change the value of that local variable slot so that it no longer refers to that first allocation - i.e. set the variable to null.

In general, you should know that a single source code statement doesn’t usually equal a single machine instruction.

    • There are actually several kinds of newarray and store instructions, but I’m sticking with the reference types to keep this discussion simple.
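The effect of local variable slots on reachability can be watched directly with a WeakReference (a small sketch, not tied to the Java3D problem; note that System.gc() is only a hint, so the second check could in principle fail):

```java
import java.lang.ref.WeakReference;

// Sketch: a local variable slot keeps its referent strongly reachable
// until the slot is overwritten or nulled, just like 'obj' in TestCase.
public class LocalSlotDemo {
    public static void main(String[] args) {
        Object obj = new Object();
        WeakReference<Object> ref = new WeakReference<>(obj);

        System.gc(); // 'obj' still occupies a local slot: the referent survives
        System.out.println("before null: " + (ref.get() != null));

        obj = null;  // clear the slot, as the commented-out line in TestCase does
        System.gc(); // now the object is only weakly reachable and can be collected
        System.out.println("after null: " + (ref.get() == null));
    }
}
```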

God bless,
-Toby Reyelts

Yep, like Toby said, it is what you would expect at the most basic level. I.e.
the assignment is done AFTER the new; if you are going one instruction at a time, you don’t know at the time of allocation that you can throw away the previous array.

However, I wouldn’t be surprised if HotSpot could tweak that code for you when it compiles it to native code. It’s just not all that common, so why bother to put that sort of optimization into HotSpot.

[quote]However, I wouldn’t be surprised if HotSpot could tweak that code for you when it compiles it to native code. It’s just not all that common, so why bother to put that sort of optimization into HotSpot.
[/quote]
First, it would have to do some sort of analysis to determine that that variable is no longer referenced. Second, it could easily turn into a deoptimization when the VM has to set a local variable slot to null every single time a reference is used for the last time.

I rather hope the VM implementors are spending their time performing real optimizations.

God bless,
-Toby Reyelts

Thanks for the explanation.

Well, I don’t know, but it may not help even if HotSpot does some smart analysis when compiling to native code, because the problem could be triggered even earlier, while the code is still being interpreted, if that’s what the bytecode instructions say.

And with that interlude with those trivial testcases aside, it looks like I’m beginning to zero in on the issue which I described much earlier. It looks like the problem could be not at the branchgroup level but possibly due to the By_Reference mode I’m using at the geometry level of Java3D. The catch is that the not-yet-garbage-collected objects are reported as “live” by the profiler. And when I do run the gc, those “live” ones get collected and disappear from the count… hmm… kind of a catch-22 situation.

And again on a very theoretical note (mind you, neither bytecode analysis nor JNI are my domains of expertise, at least not as of yet!): is it possible that JNI code could be coaxed to release memory when the gc is manually invoked, and, conversely, would retain memory even when there is a gc panic? Kind of a hazy question, I know, but I’m feeling a little suspicious about possible memory-retention scenarios with JNI code (my app. doesn’t have any, but possibly that in Java3D)?

Anyways, I hope to get some time to make a comparison testcase between By_Copy and By_Reference modes, if some more experimentation reveals that the problem is indeed coming from Java3D.

I believe that this particular optimisation would be partially or completely solved by escape analysis. The intentions of the Java programmer are clear; the VM ought to implement the programmer’s design efficiently. That’s what it’s all about after all :slight_smile: Shame EA hasn’t made it into 1.5 but I’m sure it won’t be long after the success seen in Jet.

Cas :slight_smile:

[quote]I believe that this particular optimisation would be partially or completely solved by escape analysis. The intentions of the Java programmer are clear; the VM ought to implement the programmer’s design efficiently. That’s what it’s all about after all :slight_smile: Shame EA hasn’t made it into 1.5 but I’m sure it won’t be long after the success seen in Jet.

Cas :slight_smile:
[/quote]
Escape analysis is about determining whether an object has lifetime beyond the scope of a function. The idea is that you can stack allocate any variable that was a) allocated in the function and b) does not “escape” the function.

This does not solve NVaidya’s problem. Even with a “stack-allocated” object, the object still wouldn’t be released until after the function returns. You can’t have stack-allocated objects that are released at arbitrary points before the end of a function.
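For what it’s worth, a minimal sketch of the distinction (class names made up for illustration): escape analysis only asks whether an allocation can outlive its method, not where the last use of a reference occurs.

```java
// Hypothetical illustration of "escaping" vs "non-escaping" allocations.
class Point {
    final int x, y;
    Point(int x, int y) { this.x = x; this.y = y; }
}

public class EscapeDemo {
    // 'p' never leaves this method, so a VM with escape analysis could
    // stack-allocate it (or scalar-replace it away entirely).
    static int sumNoEscape(int x, int y) {
        Point p = new Point(x, y);
        return p.x + p.y;
    }

    // This Point escapes via the return value and must stay heap-allocated.
    static Point makeEscaping(int x, int y) {
        return new Point(x, y);
    }

    public static void main(String[] args) {
        System.out.println(sumNoEscape(2, 3));    // 5
        System.out.println(makeEscaping(2, 3).x); // 2
    }
}
```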

God bless,
-Toby Reyelts

The way JNI works, you have to explicitly manage references. Many local references are implicitly created when you call certain functions. You have to free them with calls to DeleteLocalRef. People tend to get sloppy with JNI and not make calls to DeleteLocalRef because, when a native function returns, the virtual machine automatically clears all of those local references. This means that, for the duration of a native call, it’s much more likely that the garbage collector won’t be collecting several objects it otherwise could have, because somebody will probably have sloppily left local references lying around.

God bless,
-Toby Reyelts

You can apply escape analysis to any block, not just a complete function. Although Java doesn’t release stack space until function exit you could reuse the space. Actually the JVM could release the stack space within a function because it isn’t required to follow the byte code description internally — no one would ever know apart from the reduced memory use.

It can’t just “release” the stack space because of the very nature of a stack. The entire point of a stack is that you can efficiently allocate and release all of the memory in a single operation. If you can dream up some magic algorithm + data structure that lets you perform N deallocations in the same time as 1, let me know.

My point about escape analysis was that you are actually determining whether something escapes a scope - which is fundamentally different than determining the exact point at which the last reference to an object occurs.

God bless,
-Toby Reyelts

I believe that, at the same time escape analysis is performed, it should be possible to reorder stack allocations where possible to optimise this situation. If you can do it in your head, the compiler can do it too:


{
Object a = new Object();
Object b = new Object();
a = new Object();
...
}

You can figure out how to optimise this case:


Object b = new stack Object();
Object a = new stack Object();
delete a;
a = new stack Object();

and so on. That’s what we’re aiming for. It all comes down to the principle of least surprise. It is indeed surprising that you can get an out-of-memory error doing something as trivial as allocating an image in a memory-managed runtime, without it attempting to be efficient about what you’ve asked it to do. Replace the Objects above with int[100000000]: in the current VM the code fails unexpectedly with OOME unless you tune the VM, but with the analysed code it works correctly and requires half the physical RAM. As it should.

You can argue against me on this all you like but you’re only arguing for an inferior mechanism, which would be a strange thing to do…

Cas :slight_smile:

[quote]fundamentally different than determining the exact point at which the last reference to an object occurs
[/quote]
I understood it that this is exactly what escape analysis does. Within a scope, you determine the points at which an object is referenced; if its reference escapes the scope, you can’t do a stack-based replacement in that scope. It’s a very conveniently recursive algorithm, cheap to perform, and it works excellently with inlining. I hope we get it soon!

Cas :slight_smile:

[quote] Object b = new stack Object();
Object a = new stack Object();
delete a;
a = new stack Object();
[/quote]
That would be quite bad. If the new objects reference or access a static context, and access to that context is order sensitive, you’ve changed the behavior of the program.
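To make that concrete, a small made-up example: if the constructors touch shared static state, the allocation order is observable, so reordering the two allocations would not be behaviour-preserving.

```java
// Hypothetical sketch: constructors with order-sensitive side effects.
class Counter {
    static int next = 0;
}

class A {
    final int id;
    A() { id = Counter.next++; } // reads and bumps shared static state
}

class B {
    final int id;
    B() { id = Counter.next++; }
}

public class ReorderDemo {
    public static void main(String[] args) {
        A a = new A(); // runs first: gets id 0
        B b = new B(); // runs second: gets id 1; swapping the allocations swaps the ids
        System.out.println(a.id + "," + b.id);
    }
}
```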