How do I debug this serialization problem?

Behold a bit of the stacktrace:


java.lang.StackOverflowError
	at java.util.concurrent.ConcurrentHashMap$Segment.get(ConcurrentHashMap.java:338)
	at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:769)
	at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:268)
	at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1106)
	at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1509)
	at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1474)
	at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1392)
	at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1150)
	at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1509)
	at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1474)
	at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1392)
	at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1150)
	at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1509)
	at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1474)
	at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1392)
	at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1150)
	at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1509)
	at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1474)
	at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1392)
	at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1150)
	at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1509)
... etc

The error is spurious and a bit rare and I’ve got no idea how to debug it effectively or efficiently.

Cas :slight_smile:

I’ve actually had that same problem before, and I never figured out a way to get rid of it either. I was guessing it had something to do with dropped packets or something, but I really had no idea.

Maybe you should use synchronized!

(just kidding, that’s a joke on myself)

Did you pass an object that implements java.io.Externalizable? (somewhere along the reference tree)

No, there are no Externalizables. There’s a bit of writeReplacing going on and a lot of readResolving and readObjecting and writeObjecting too.

Cas :slight_smile:

Well, the first thing I have to ask: did you try increasing the stack size, just to check whether it really is endless recursion and not simply a case of not having enough stack?

This same thing has come up in the context of the Google Web Toolkit (GWT) as well: http://groups.google.com/group/google-appengine-java/msg/160d6587199aa92b
and at least in one case http://groups.google.com/group/gwt-google-apis/browse_thread/thread/5f10789507803391/51003fbb3f1c8ac9?lnk=raot it seems that increasing the stack size was the recommended “fix”.

I get a couple of pages more of that exception before it finally barfs - it’s definitely an infinite recursion. What I’m most irritated about is that the standard Java object serialization doesn’t detect cyclic graphs sometimes - but only sometimes. Bah.

Cas :slight_smile:

I think Sun should work on a non-recursive algorithm. It’s pretty easy to work around this whole problem by pushing all found objects onto an explicit stack (as in java.util.Stack), then repeatedly popping one and handling it, which in turn pushes whatever it references. This would allow for nearly infinitely long object graphs, whereas a recursing algorithm will always be very limited.
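To make the explicit-stack idea concrete, here is a minimal sketch. It is not how ObjectOutputStream works internally; the Node and IterativeWalk classes and the countReachable method are made up for illustration, and "handling" an object is reduced to counting it.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical graph node, invented for this sketch.
class Node {
    final List<Node> refs = new ArrayList<>();
}

class IterativeWalk {
    // Visits every node reachable from root without recursing, so the
    // depth is limited by heap size rather than by the thread's stack.
    static int countReachable(Node root) {
        Set<Node> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Node> pending = new ArrayDeque<>();
        seen.add(root);
        pending.push(root);
        while (!pending.isEmpty()) {
            Node n = pending.pop();        // "handle" the object here
            for (Node child : n.refs) {
                if (seen.add(child)) {     // identity set also stops cycles
                    pending.push(child);
                }
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        // A chain of 200,000 nodes whose tail points back at the head.
        Node head = new Node();
        Node tail = head;
        for (int i = 1; i < 200_000; i++) {
            Node next = new Node();
            tail.refs.add(next);
            tail = next;
        }
        tail.refs.add(head);
        System.out.println(countReachable(head)); // prints 200000
    }
}
```

The identity-based "seen" set plays the role of the stream's handle table: an object that has already been visited is never pushed again, so cycles terminate.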

Well, the length of the stacktrace doesn’t indicate whether it is infinite recursion or a really long object graph. The stacktrace is also capped (HotSpot keeps 1024 frames by default), so you never see all of it anyway.

Let’s say you have your own implementation of a LinkedList, with 100,000 elements: it will crash on you with the very same StackOverflowError, without having infinite recursion.
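For what it’s worth, the usual way around the long-list case is to keep the links transient and write the elements in a loop from a custom writeObject, which is roughly what java.util.LinkedList does. A minimal sketch, assuming a hand-rolled list; Chain, Link and roundTrip are hypothetical names invented here:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical singly linked list, invented for this sketch.
class Chain implements Serializable {
    private static final long serialVersionUID = 1L;

    private static final class Link {   // deliberately not Serializable
        final Object value;
        Link next;
        Link(Object value) { this.value = value; }
    }

    // transient: default serialization never follows these references
    private transient Link head;
    private transient Link tail;
    private transient int size;

    void add(Object value) {
        Link l = new Link(value);
        if (head == null) { head = l; } else { tail.next = l; }
        tail = l;
        size++;
    }

    int size() { return size; }

    // Write the elements in a loop instead of letting the stream
    // recurse through next -> next -> next.
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        out.writeInt(size);
        for (Link l = head; l != null; l = l.next) {
            out.writeObject(l.value);
        }
    }

    // Transient fields start out as null/0 here, so add() rebuilds the list.
    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        int n = in.readInt();
        for (int i = 0; i < n; i++) {
            add(in.readObject());
        }
    }

    // Serializes to memory and reads the copy back.
    static Chain roundTrip(Chain c) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(c);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Chain) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this layout the stack depth during serialization stays constant no matter how many elements the list holds. If the links were ordinary non-transient Serializable fields, each element would add another round of writeObject0/defaultWriteFields frames, exactly like the trace at the top of the thread.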

Let’s just say that because I know what my object graph actually looks like, it ain’t very deep :slight_smile: It’s basically a few objects deep at most. And then occasionally, it gets into this infinite loop and recurses itself into oblivion. Unfortunately the stack overflow error seems to obliterate all my System.err.printlns and of course it happens rarely enough that breakpoints are far more nuisance than help.

Cas :slight_smile:

Shove a breakpoint in the StackOverflowError constructor?

Redirect System.err to a file?

@bleb - already doing that - exception seems to clobber everything. I’m even doing a System.err.flush() after the printlns and no output occurs where I’m expecting it.
@pjt33 - interesting idea. Will try it.

Cas :slight_smile:

Maybe try a WeakHashMap. HashMaps hold onto the keys which may confuse the serialisation…although I’m not sure why…just a guess :slight_smile:

When I hit stuff like this, it’s usually because I think my object graph isn’t very deep. Keep an eye out for sneaky references:

Non-transient references to service libraries, or any object that can reach one. Holding a reference to a thread pool, or a sound clip that references the audio engine will serialize a lot more than you expected.

Anonymous and non-static inner classes. These babies hold hidden references to the parent instance, which may not have been what you wanted.

Collections, especially caches and ring buffers. If you forget to remove an object from a map or list when you’re done with it, you’ll still serialize it if you serialize the collection.
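The inner-class point is easy to demonstrate. In this sketch (Engine, InnerTask, NestedTask and HiddenReferenceDemo are all hypothetical names), the non-static inner class drags its enclosing Engine into the stream, while the static nested one does not:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical "service" object that was never meant to be serialized.
class Engine {
    String name = "engine";

    // Non-static inner class: carries a hidden reference to Engine.this.
    class InnerTask implements Serializable {
        private static final long serialVersionUID = 1L;
        String owner() { return name; }   // uses the enclosing instance
    }

    // Static nested class: no hidden reference.
    static class NestedTask implements Serializable {
        private static final long serialVersionUID = 1L;
    }
}

class HiddenReferenceDemo {
    // Returns null on success, otherwise the message of the
    // NotSerializableException (the offending class name).
    static String trySerialize(Object o) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return null;
        } catch (NotSerializableException e) {
            return e.getMessage();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Engine engine = new Engine();
        System.out.println("inner:  " + trySerialize(engine.new InnerTask()));
        System.out.println("nested: " + trySerialize(new Engine.NestedTask()));
    }
}
```

Here the hidden reference fails loudly because Engine is not Serializable. If the enclosing class happens to be Serializable, it gets written silently along with everything it can reach, which is the sneaky case.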

I like pjt33’s idea of getting into the debugger. If your object graph is too large to browse, you can also try overriding ObjectOutputStream, or using an alternate serialization mechanism like XStream (http://xstream.codehaus.org/), to debug what is making it to the stream before everything goes pear shaped. You might find some surprises there.
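One way to do the ObjectOutputStream override, as a rough sketch: enable the replaceObject hook and use it purely for logging. LoggingObjectOutputStream and trace are hypothetical names; enableReplaceObject and replaceObject are the standard protected hooks. The log is kept in memory so it survives even if System.err output gets lost.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical debugging stream: records the class of every object
// the first time it enters the stream, in the order it is visited.
class LoggingObjectOutputStream extends ObjectOutputStream {
    final List<String> log = new ArrayList<>();

    LoggingObjectOutputStream(OutputStream out) throws IOException {
        super(out);
        enableReplaceObject(true);   // makes the stream call replaceObject()
    }

    @Override
    protected Object replaceObject(Object obj) throws IOException {
        log.add(obj.getClass().getName());
        return obj;                  // log only, no actual replacement
    }

    // Serializes root into a throwaway buffer and returns what was seen,
    // even if the write dies with an exception or a StackOverflowError.
    static List<String> trace(Object root) {
        LoggingObjectOutputStream out = null;
        try {
            out = new LoggingObjectOutputStream(new ByteArrayOutputStream());
            out.writeObject(root);
        } catch (IOException | StackOverflowError e) {
            System.err.println("serialization failed: " + e);
        }
        return out == null ? new ArrayList<>() : out.log;
    }

    public static void main(String[] args) {
        List<Object> sample = new ArrayList<>();
        sample.add("hello");
        sample.add(Integer.valueOf(42));
        for (String entry : trace(sample)) {
            System.out.println(entry);
        }
    }
}
```

Objects that are written a second time go out as back-references and do not reach replaceObject again, so a log that keeps growing with the same few class names suggests new instances are being created during the write rather than a plain cycle.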

Good luck, you poor bastard. :stuck_out_tongue:

You might try adding a private writeObject(ObjectOutputStream) method to various classes along your graph (have it do nothing except maybe log a message), just to get something in place to cut off the recursion. You won’t be serializing properly, but you might be able to figure out which of your classes is the source of the problem.
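A minimal sketch of such a stub, assuming a made-up Suspect class (StubDemo and serializedSize are likewise hypothetical). Because the stub never calls defaultWriteObject(), nothing below the object is visited:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class standing in for one of the suspects in the graph.
class Suspect implements Serializable {
    private static final long serialVersionUID = 1L;
    static int hits = 0;              // how often the stub was reached

    Suspect next;                     // could be part of a cycle or long chain
    Object payload = new Object();    // not Serializable, would normally fail

    // Debug-only stub: the stream calls this instead of writing the fields.
    private void writeObject(ObjectOutputStream out) throws IOException {
        hits++;
        System.err.println("writeObject reached " + getClass().getName());
        // No out.defaultWriteObject() here, so the recursion stops at this
        // object. The resulting stream cannot be read back properly.
    }
}

class StubDemo {
    // Serializes into memory and returns the number of bytes written.
    static int serializedSize(Object o) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        Suspect a = new Suspect();
        Suspect b = new Suspect();
        a.next = b;
        b.next = a;                   // a cycle, never followed by the stub
        int size = serializedSize(a);
        System.out.println(size + " bytes, stub hit " + Suspect.hits + " time(s)");
    }
}
```

Stub out one class at a time: if the overflow disappears when a particular class is stubbed, the runaway part of the graph hangs off that class.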