GC Bomb - CMS GC starves high priority system threads

The garbage that’s being generated by the JProgressBar in windows l&f is likely
to be the image cache that components in win l&f use when rendering.

I’m not entirely sure how it works in Swing, but they try to cache some of the
pieces of the components depending on their state. Think of it this way: the L&F
code asks Windows to render a button to an offscreen image, and then uses that
image to display the button. This image is cached. There are several states of the button
(rolled over, disabled, etc.), so there are multiple images per component. But when a
button is resized, some of those images become invalid and have to be thrown away…

Since the progress bar is constantly changing, they’re probably generating and throwing away a
lot of images.

Now those images’ native resources are disposed of using a reference queue, which
is run on a special Disposer thread (you can see it in the stack trace as Java2D Disposer).
This is done to avoid the use of finalizers and to speed up the disposal.
Note that this disposer thread is a high priority thread, which may cause the starvation of
other threads if plenty of garbage is generated and the thread is busy getting rid of it.
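
To make the mechanism concrete, here is a minimal sketch of a reference-queue disposer. It is NOT the real sun.java2d.Disposer: the class name SketchDisposer, the Entry type and the demo() helper are all made up for illustration, and the "native resource" is just a Runnable. The thread priority is a constructor parameter, since that is the knob in question.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical, simplified disposer. NOT the real sun.java2d.Disposer.
final class SketchDisposer {

    // A phantom reference that remembers how to free its owner's native resource.
    static final class Entry extends PhantomReference<Object> {
        final Runnable disposeAction;

        Entry(Object owner, ReferenceQueue<Object> queue, Runnable disposeAction) {
            super(owner, queue);
            this.disposeAction = disposeAction;
        }
    }

    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    // Entries must stay strongly reachable until processed; otherwise they
    // could be collected themselves and never show up on the queue.
    private final Set<Entry> pending = ConcurrentHashMap.newKeySet();

    SketchDisposer(int threadPriority) {
        Thread worker = new Thread(this::drain, "Sketch Disposer");
        worker.setDaemon(true);
        worker.setPriority(threadPriority); // the setting under discussion
        worker.start();
    }

    Entry register(Object owner, Runnable disposeAction) {
        Entry entry = new Entry(owner, queue, disposeAction);
        pending.add(entry);
        return entry;
    }

    private void drain() {
        try {
            while (true) {
                // Blocks until the GC (or an explicit enqueue()) hands over an entry.
                Reference<?> ref = queue.remove();
                Entry entry = (Entry) ref;
                pending.remove(entry);
                entry.disposeAction.run();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Deterministic demo: enqueue by hand instead of waiting for a GC cycle.
    static boolean demo() {
        CountDownLatch disposed = new CountDownLatch(1);
        SketchDisposer disposer = new SketchDisposer(Thread.NORM_PRIORITY);
        Object owner = new Object();
        Entry entry = disposer.register(owner, disposed::countDown);
        entry.enqueue();
        try {
            return disposed.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

With lots of short-lived owners the queue fills quickly, and a worker running at a very high priority will keep preempting everything else until it has drained it.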

[added] Also, if those images were accelerated (that is, cached in VRAM, which they’re likely
to be), then the disposal may involve DirectDraw/Direct3D, which means some global system-wide
locks may be taken by DirectX, which could worsen the situation…

I’ve pointed the Swing folks at this thread; maybe they can shed more light on why the JProgressBar’s
caching scheme is so suboptimal in this case.

OK, I’ve tried several microbenchmarks to hammer the GC, including SoftReferences, WeakReferences, finalizers, and even BufferedImages and calling getGraphics and failing to dispose of it.

In all cases the CPU shoots up to 100% as expected but the jerking does not occur. This means that it’s not a trivial GC issue, it’s the interaction of the GC with some other part of the system which is causing the problem. It might possibly be something to do with the JIT.

All this Swing and AWT stuff is a red herring; as I say, Eclipse makes no use of any of it whatsoever.

Cas :slight_smile:

I think the thread priority of the Java2D Disposer needs to be reworked. It appears to be having a catastrophic effect on the system. I’ve selected the CMS collector to reduce pause times, but I’m guessing that during the sweep phase of the CMS collector the Java2D Disposer is signalled and very nasty stuff happens that brings down the system for 100+ ms.

Why must the Java2D disposer run at a high priority? Better yet, can I hack things to lower the priority? Hmmm.

Cas, I think your experiment might have failed because it did not create work for this Java2D disposer.

I had hoped to - I created tons of RGB BufferedImages and got the graphics out of each one and rendered a black rectangle into it.
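
For reference, a rough reconstruction of that kind of test might look like the sketch below. The class name ImageChurn, the image size and the iteration count are made up for illustration; they are not the values from the original experiment. Whether this gives the Java2D Disposer anything to do is exactly the open question, so treat it as a GC stress loop and nothing more.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

// Hypothetical reconstruction; sizes and counts are invented for illustration.
final class ImageChurn {

    // Creates `count` throwaway RGB images and fills each one with black.
    // If disposeGraphics is false, the Graphics2D is deliberately left for the GC.
    static long churn(int count, int width, int height, boolean disposeGraphics) {
        long pixelsTouched = 0;
        for (int i = 0; i < count; i++) {
            BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
            Graphics2D g = image.createGraphics();
            g.setColor(Color.BLACK);
            g.fillRect(0, 0, width, height);
            if (disposeGraphics) {
                g.dispose();
            }
            pixelsTouched += (long) image.getWidth() * image.getHeight();
            // `image` becomes garbage at the end of each iteration.
        }
        return pixelsTouched;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        long pixels = churn(10_000, 256, 256, false);
        long millis = (System.nanoTime() - start) / 1_000_000;
        System.out.println(pixels + " pixels filled in " + millis + " ms");
    }
}
```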

But it’s still a red herring. SWT doesn’t use the Java2D disposer, yet Eclipse is one of the worst sufferers of the jerking.

Cas :slight_smile:

[quote]Why must the Java2D disposer run at a high priority? Better yet, can I hack things to lower the priority? Hmmm.
[/quote]
Because otherwise a thread which creates tons of images will starve the collector and we’ll get an OutOfMemoryError.
This is the bug that the introduction of the Disposer addressed.

If you can build the Mustang workspace, you can easily experiment with the disposer.
The file is j2se/src/share/classes/sun/java2d/Disposer.java .
You don’t need to rebuild the whole workspace, though: you can just hack it, compile the file
to some directory and prepend that directory to the bootclasspath.

Dmitri

PROBLEM SOLVED

(Would appreciate if you could escalate this for me Trembo/Jeff)

I got suspicious about the whole thread priority issue so I looked at javaw.exe through PView on NT, and lo and behold, there were several time critical priority 15 threads running!!! DUH!!! I’m not sure what nincompoop decided that messing around with the thread priority mappings between JDK major releases was in any way wise at all.

Now it occurs to me the very same problem manifested itself to me when my games stopped responding to mouse and keyboard input on some systems. I used to run the main loop at Thread.MAX_PRIORITY but on Java 5.0 the main loop was mapped into time-critical priority - so DirectX never got to poll the bloody keyboard and mouse!

The fix is simple: remap the offending Java priorities to more reasonable NT ones:

-XX:JavaPriority10_To_OSPriority=10 -XX:JavaPriority9_To_OSPriority=9

Please patch this in the next minor update of 1.5.0 as it has such terrible and wide-ranging mysterious consequences.

Eclipse now runs better than it has ever, ever done - it feels utterly smooth and native now. I’m flabbergasted at the transformation from powerful-but-clunky-and-jerky IDE into butter-smooth ultraIDE.

Cas :slight_smile:

After a little research what I find interesting is that priority 15 is in a different priority class!

On Win32:

NORMAL is priority 8.

For a Win32 process the thread priorities can only be adjusted +/- 2 within a priority class. So theoretically if java.exe is running as a normal priority application the max thread priority would usually be 10 anyway. That would correspond to setting the thread to a relative priority of HIGHEST within a NORMAL priority class.

It seems though that the JRE is using the TIME_CRITICAL setting to rail the priority up to the highest possible for non-realtime processes (15). (You need to have admin rights to go above 15 and run as a realtime process.) Interestingly enough, java at priority 15 was able to starve our driver thread at priority 31… that either says something about the quality of the windows scheduler, or there are icky things happening with driver calls and/or system-wide locks in the java threads.
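
The arithmetic is easy to get lost in, so here is a small sketch of it. It follows the documented Win32 scheme as I understand it (class base plus a relative offset of -2 to +2, with IDLE and TIME_CRITICAL as special cases). The class and method names are my own, it only models the four classic priority classes, and it ignores the extra levels the realtime class allows as well as any dynamic boosting the scheduler does.

```java
// Sketch of Win32 base-priority arithmetic. Names are hypothetical.
final class Win32Priority {

    // Base priority of a THREAD_PRIORITY_NORMAL thread in each class.
    enum PriorityClass {
        IDLE(4), NORMAL(8), HIGH(13), REALTIME(24);

        final int base;

        PriorityClass(int base) {
            this.base = base;
        }
    }

    // These values mirror the Win32 THREAD_PRIORITY_* constants.
    static final int THREAD_PRIORITY_IDLE = -15;
    static final int THREAD_PRIORITY_LOWEST = -2;
    static final int THREAD_PRIORITY_BELOW_NORMAL = -1;
    static final int THREAD_PRIORITY_NORMAL = 0;
    static final int THREAD_PRIORITY_ABOVE_NORMAL = 1;
    static final int THREAD_PRIORITY_HIGHEST = 2;
    static final int THREAD_PRIORITY_TIME_CRITICAL = 15;

    static int basePriority(PriorityClass cls, int relative) {
        boolean realtime = cls == PriorityClass.REALTIME;
        // The two special values ignore the class base entirely.
        if (relative == THREAD_PRIORITY_IDLE) {
            return realtime ? 16 : 1;
        }
        if (relative == THREAD_PRIORITY_TIME_CRITICAL) {
            return realtime ? 31 : 15;
        }
        if (relative < THREAD_PRIORITY_LOWEST || relative > THREAD_PRIORITY_HIGHEST) {
            throw new IllegalArgumentException("unsupported relative priority: " + relative);
        }
        // Ordinary case: class base plus or minus two, capped at the class range.
        return Math.min(cls.base + relative, realtime ? 31 : 15);
    }

    public static void main(String[] args) {
        System.out.println(basePriority(PriorityClass.NORMAL, THREAD_PRIORITY_HIGHEST));
        System.out.println(basePriority(PriorityClass.NORMAL, THREAD_PRIORITY_TIME_CRITICAL));
    }
}
```

So in a normal-priority process HIGHEST lands on 10, while TIME_CRITICAL jumps straight to 15 regardless of the process class.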

The first thing to suspect, of course, whenever an application causes stuff to starve/freeze is the graphics drivers. They are notorious for being very unfriendly to the system. We are having a hell of a time finding systems that don’t lock up after a few hours. Even with noddraw=true the graphics drivers will get stuck in an infinite loop occasionally. What is quite cool is that I got a real error message from Windows the last time this happened. It actually said in plain English (after a reboot) that the nVidia driver (nv_disp) was stuck in an infinite loop.

See http://emea.windowsitpro.com/Windows/Article/ArticleID/302/302.html

Congratulations, Cas.

BTW, how did Eclipse behave when set to a lower priority?

Due to the craziness that is NT’s priorities, if you set java.exe to, say, idle priority, you get threads at priority 2 and 4 etc. right the way up to 15 :slight_smile: It just spreads them out more.

Cas :slight_smile:

[quote](Would appreciate if you could escalate this for me Trembo/Jeff)
[/quote]
Please file a bug and include all this information…
I’ve asked vm folks to look at the thread, but a bug is always better.

Dmitri

HotSpot has a deliberate mapping of Java priorities to Windows thread priorities where Java’s maximum priority maps to THREAD_PRIORITY_TIME_CRITICAL and its minimum priority maps to THREAD_PRIORITY_LOWEST. See src/os/win32/os_win32.cpp, java_to_os_priority, in the Mustang HotSpot sources if you’re curious. It’s a simple change to downgrade the maximum possible setting to THREAD_PRIORITY_HIGHEST, but is that the right thing to do for all applications? It seems to me that the application should simply not be using the maximum priority (or, in fact, adjusting thread priorities at all) unless it’s absolutely necessary. -XX:-UseThreadPriorities is the “big hammer” switch added to HotSpot at one point because of problems like this.
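
If you want to check from inside a running application which of these switches it was started with, something like the sketch below should do. The class and method names are made up for illustration. The only flags it looks for are the ones already mentioned in this thread, and it only sees what was passed on the command line, not HotSpot’s built-in defaults.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists priority-related -XX flags the JVM was started with.
final class PriorityFlags {

    // Pure function, so it can be checked without restarting the JVM.
    static List<String> priorityRelated(List<String> jvmArgs) {
        List<String> hits = new ArrayList<>();
        for (String arg : jvmArgs) {
            boolean isXxFlag = arg.startsWith("-XX:");
            boolean mentionsPriority =
                    arg.contains("UseThreadPriorities") || arg.contains("JavaPriority");
            if (isXxFlag && mentionsPriority) {
                hits.add(arg);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // getInputArguments() returns the JVM options, not the program arguments.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("Priority-related JVM flags: " + priorityRelated(jvmArgs));
    }
}
```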

Are most of the issues seen on this thread caused by exactly this problem (e.g., swpalmer’s)? As far as I understand it, this isn’t the root cause behind the glitches in mouse pointer tracking that have been seen in other applications like Eclipse and NetBeans on some machines, which I consider to be a more serious problem (i.e., possibly not an application-level problem).

This is precisely the root cause of the problem. No Java threads should be able to run as a time-critical thread on Windows; it buggers the whole system up, and there’s not even any security on it. It’ll make a DoS attack trivial. So perhaps this issue should be escalated a bit :slight_smile:

Cas :slight_smile:

As far as I understand, you cannot set a thread’s priority higher than the maximum of its thread group, and you cannot modify it at all without the correct permissions, so that should be enough to avoid DoS in applets.
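
The thread group cap is easy to see in a few lines. This is only a sketch of the clamping behaviour of Thread.setPriority and ThreadGroup.setMaxPriority; the class and method names are made up, and it says nothing about the permission checks, which depend on the security setup.

```java
// Sketch: a thread cannot exceed its thread group's maximum priority.
final class GroupPriorityCap {

    // Asks for MAX_PRIORITY inside a group capped at groupMax and
    // returns the priority the thread actually ends up with.
    static int requestMaxPriorityInCappedGroup(int groupMax) {
        ThreadGroup capped = new ThreadGroup("capped");
        capped.setMaxPriority(groupMax);
        Thread t = new Thread(capped, () -> { }, "wannabe-time-critical");
        t.setPriority(Thread.MAX_PRIORITY); // silently clamped to the group maximum
        return t.getPriority();
    }

    public static void main(String[] args) {
        // Prints 5: the request for 10 is clamped to NORM_PRIORITY.
        System.out.println(requestMaxPriorityInCappedGroup(Thread.NORM_PRIORITY));
    }
}
```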

I dunno about anyone else but I’ve always wanted to attach some actual security to “trusted” code as well, to prevent it monkeying around with my system and rendering it unusable by accident… there are already two great limiters in the JVM right now, which limit its Java heap space and also C heap space, but there are no limits on creating any other OS resources. It seems a little disjointed.

Cas :slight_smile:

This is bug 5101898 in Sun’s bug database. See:

http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=5101898

See also discussion in

http://forum.java.sun.com/thread.jspa?threadID=666691&tstart=0

Hurrah! Fixed in Mustang. And, apparently, 1.5.0_06.

Cas :slight_smile:

I saw that this was fixed in Mustang… but where did you get the info that an _06 build was coming with the fix for Java 5? Heck, I just got 1.5.0_05!

Rumours on teh intarweb.

Cas :slight_smile:

Well, doesn’t Alan Bateman mention that it’s been fixed in _06 in the second posted thread?

http://forum.java.sun.com/thread.jspa?threadID=666691&tstart=0

See the last post.

Dmitri

Until it’s officially closed in Bug Parade, it’s a rumour on teh intarweb :wink:

Cas :slight_smile: