Xlib: unexpected async reply, Revisited

I’m currently using JOGL Beta-04 on Fedora Core 4. In the last week I’ve seen “Xlib: unexpected async reply”… about four times. In all cases the problem happened after a JOGL window was up, then covered by some other app’s window, then exposed/uncovered. The problem happens sporadically, i.e. exposing all or part of a running JOGL app’s window(s) usually doesn’t cause any problems, at least not for me.

I see in the forum that some other folks have encountered various incarnations of this. What’s the status of this issue?

Rob

What version of Java are you using? A post on another thread indicated that the default Java is GCJ, which appears to have locking bugs in its AWT implementation. Have you tried the latest 5.0 update of the Sun JRE?

I’ll be away for a week starting tomorrow so will be slow on replies.

Ken Russell wrote:

What version of Java are you using? A post on another thread indicated that the default Java is GCJ, which appears to have locking bugs in its AWT implementation. Have you tried the latest 5.0 update of the Sun JRE?

We’re using Sun’s 1.5.0_06 for sure.

I wish I had more insight into what causes this problem to happen, but I’m afraid I don’t know any more than I mentioned previously. Once the problem has happened the app quits updating and I have to kill it.

Let me know if you think there’s a way I can gather any useful info.

Rob

As far as I know all of the code paths in the JOGL implementation are covered with the appropriate AWT locking, but it’s possible something was missed. Can you provide a test case for this? If so, please file a bug and attach it.

Due to the sporadic nature of this thing it’s probably going to be impossible to come up with something that’s guaranteed to cause the problem quickly. But… I’ll see if I can invent something useful.

Thanks for the great work.

Ken et al.:

I did some more pounding on this issue. I decided that due to the sporadic nature of the problem it might be better to see if it happened with one of the canned JOGL demos to get my code out of the loop. So… I built the “Gears” demo, fired it up, then patiently covered/exposed it using another window and waited for the problem. Usually within five minutes or so of doing this the “Xlib: unexpected async reply” problem happened, and the app hung and had to be killed.

I tried this with JDK 1.4.2_12, 1.5.0_06, and 1.5.0_07. Same result with all three.

What I have not tried yet is backing off to an earlier JOGL version, e.g. Beta03, or building from the repository source. Time permitting, I may try that soon, unless you or somebody else on the dev team expects that if Beta04 fails then Beta03 and the current source will fail too.

This may be a silly question, but could this be a graphics driver issue? I’m using NVIDIA driver 81.78 with JOGL Beta04 on Fedora Core 4 (2.6.14-1.1656_FC4smp). I’ve also seen this problem on machines using earlier drivers.

I’ll investigate more as time permits.

Rob

After some more reading I see that “unexpected async reply” is a result of making X calls from multiple threads, which was forbidden in X11R5 and earlier (of course FC4 uses X11R6; there are still lots of ways to mess up multi-threaded access though…). So, I guess the answer to my prior question about whether it could be a graphics driver problem is almost certainly “no”.
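To illustrate the threading point above, here is a minimal sketch of the general idea of funnelling every call on a non-thread-safe connection through one shared lock. It is not JOGL or AWT code: FakeConnection, SerializedAccessSketch and runWithLock are names made up for this example, and the counter merely stands in for an X connection.

```java
import java.util.concurrent.locks.ReentrantLock;

// Illustration only: none of these names come from JOGL, AWT or Xlib.
class SerializedAccessSketch {

    // Hypothetical stand-in for a connection that must not be used
    // by two threads at the same time.
    static final class FakeConnection {
        private int requests = 0;
        void send() { requests++; }   // read-modify-write, not atomic
        int count() { return requests; }
    }

    // One global lock guards every use of the connection.
    private static final ReentrantLock LOCK = new ReentrantLock();

    // Starts `threads` workers that each call send() `callsPerThread` times,
    // always while holding LOCK, and returns the total number of calls seen.
    static int runWithLock(int threads, final int callsPerThread) {
        final FakeConnection conn = new FakeConnection();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < callsPerThread; j++) {
                    LOCK.lock();
                    try {
                        conn.send();
                    } finally {
                        LOCK.unlock();   // always release, even on error
                    }
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();                // wait for every worker to finish
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return conn.count();
    }

    public static void main(String[] args) {
        System.out.println(runWithLock(4, 10000)); // prints 40000
    }
}
```

With the lock in place no call is lost. Remove the lock()/unlock() pair and the total can come up short, which is the toy equivalent of two threads talking to the X server at once.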

I’d be interested in hearing if anybody else with time on their hands could reproduce this problem using Gears or one of the other canned demos.

During testing of JSR-231 beta 5 (coming soon) we discovered that an optimization which was added in beta 4 to the locking performed by the JOGL implementation on X11 platforms was not robust. This optimization was principally added to support simultaneous rendering to multiple screens on multi-head X11 setups.

We’ve backed off from this optimization and made it an option rather than the default. This is now documented in the JOGL User’s Guide in the X11 section. The current nightly build already has this change and I believe it will address your problem. Please try it and let us know whether it does. If so, the issue will be fixed in the forthcoming beta 5.
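As a rough sketch of what "an option rather than the default" can look like in code, the example below picks between a shared global lock and a per-context lock based on a system property. This is not the JOGL implementation: the property name example.perContextLocking and the class LockPolicySketch are invented for illustration, and the real option name is the one given in the User’s Guide.

```java
import java.util.concurrent.locks.ReentrantLock;

// Illustration only: an opt-in locking optimization, not JOGL's actual code.
class LockPolicySketch {

    // Hypothetical property name, invented for this example.
    static final String PROP = "example.perContextLocking";

    // The conservative default: every context shares this one lock.
    private static final ReentrantLock GLOBAL = new ReentrantLock();

    // Returns the lock a context should hold while it is current.
    // Default: the shared global lock (safe, fully serialized).
    // Opt-in (-Dexample.perContextLocking=true): the context's own lock,
    // which lets several contexts be current at once.
    static ReentrantLock lockFor(ReentrantLock ownLock) {
        return Boolean.getBoolean(PROP) ? ownLock : GLOBAL;
    }

    public static void main(String[] args) {
        ReentrantLock a = new ReentrantLock();
        ReentrantLock b = new ReentrantLock();
        // Without the property set, both contexts get the same lock.
        System.out.println(lockFor(a) == lockFor(b));
    }
}
```

The point of the design is that the risky behaviour has to be asked for explicitly, so anyone who does not need simultaneous rendering to multiple screens gets the robust path.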

I grabbed the 2006_06_03 nightly and gave it a whirl. So far no occurrences of the dreaded “Xlib: unexpected async reply” problem. I’ll have our dev team run with this one for a while and will report if we see this again.