New C1 in Mustang b59

The client compiler has been updated in Mustang b59. Give it a whirl! New things include:

  • Linear Scan Allocator
  • SSE/SSE2 support
  • SSA form

Hey! My old bubbleracer real-time raytraced game is now truly real time!

client vm 1.5 => 17-19 fps
client vm 1.6 => 48-50 fps !!!

this is what I call a nice performance boost :slight_smile:

Now what’s coming next Azeem ?

Lilian

[edit] 1.5 server vm => 35-40 fps !!

As a side note, this RFE was pointed out on JavaLobby:
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6180872

[quote]As part of the tiered compilation work for Mustang, the HotSpot client compiler needs to be enhanced to update the methodDataOop profiling counters used by the server compiler to generate higher-quality code.
[/quote]
This was also part of b59 of Mustang.

So, some parts of tiered compilation are already in?

Any chance you can comment further on this, Azeem…?

OMG, it has SIMD support! I’ve been waiting for that for years. Java is really starting to kick ass; that’s another advantage we now have over natively compiled languages :slight_smile:

It has SSE instruction support, but only in scalar form. There is no SIMD (packed/vectorized) support.

When autovectorization is checked into the jre then you may rejoice.
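To make the distinction concrete: the loop below is the sort of code an auto-vectorizing compiler could in principle turn into packed SSE instructions, handling four floats per instruction. With scalar SSE only, each iteration still processes one element. This is just a sketch; the class and method names are made up for the example, and it says nothing about what any particular VM build actually emits.

```java
// Hypothetical example class; the names are illustrative only.
final class VectorCandidate {
    // Element-wise add: every iteration is independent of the others,
    // which is what makes the loop a candidate for packed (SIMD) code.
    static void add(float[] a, float[] b, float[] out) {
        for (int i = 0; i < out.length; i++) {
            out[i] = a[i] + b[i];
        }
    }

    public static void main(String[] args) {
        float[] a = {1f, 2f, 3f, 4f};
        float[] b = {10f, 20f, 30f, 40f};
        float[] out = new float[4];
        add(a, b, out);
        System.out.println(java.util.Arrays.toString(out));
    }
}
```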

Can we expect SIMD in Java 1.6?

Awesome work :slight_smile:

Some numbers from JEmu2, all with sound enabled and with throttling & vsync disabled:

Solomon’s Key

emulated hardware specs: 2x Z80@4MHz, 3xAY8910 3-channel sound, 3 graphics layers, dynamic palette (4bits per pixel)
1.6 server : ~160 fps
1.6 client : ~115 fps
1.4 server : ~110 fps
1.4 client : ~108 fps

Ms Pacman

emulated hardware specs: 1x Z80@3MHz, 1xNamco 3-channel sound, 2 graphics layers, static palette (2bits per pixel)
1.6 server : ~225 fps
1.6 client : ~205 fps
1.4 server : ~215 fps
1.4 client : ~195 fps

Check Mate

emulated hardware specs: 1x I8080@2MHz, no sound, 1 graphics layer, black & white (1 bit per pixel)
1.6 server : ~253 fps
1.6 client : ~193 fps
1.4 server : ~234 fps
1.4 client : ~194 fps
MAME 0.37 (debug) : ~230 fps
MAME 0.37 (no debug) : ~700 fps

I haven’t tested 16-bit games yet (the game Snow Bros. seemed to benefit much more from 1.6 client in a previous snapshot).
Previously I noted that this game ran about as fast as MAME when using 1.6 client, but I forgot to turn off the debugger in MAME, doh!
If you compare with MAME, you can see Java still has a long way to go before it reaches C speed for this kind of thing…
The CPU emulators in JEmu2 are highly optimized, as is the rendering code, sometimes even more so than in MAME, but I guess Java was not made for things like this…

Nevertheless, good to see these performance gains in 1.6!

One more JEmu2 vs. MAME comparison (getting a little off-topic now, sorry):

Street Fighter 2

Running on an MC68000 @ 12 MHz, sound disabled (so the Z80 and sound hardware are not emulated in either MAME or JEmu2).
java 1.6 server : ~104 fps
java 1.6 client : ~97 fps
java 1.4 client : ~86 fps
MAME : ~153 fps

Note that the SF2 hardware doesn’t really run at 12 MHz, but the version of MAME I used had it wrong and set the clock to 12 MHz instead of 8 MHz. In the online version, SF2 runs at 8 MHz (the correct speed) and runs on my machine at about 135 fps on the server VM. For the sake of this test I temporarily set it to 12 MHz as well.
Also note that the version of MAME I used has an asm MC68000 core, which is much faster than its portable C core.
Last but not least, the video emulation in JEmu2 for this hardware driver is not based on MAME, and I suspect that it’s not very optimized for speed; all graphics are decoded and rendered in real time without any caching at all.
Given all that, Java performs pretty well here!

Finally some improvements of the client vm.

My Quake III viewer runs around 7-8% faster with the 1.6 jre. About as fast as the 1.4 server vm. Good work!

Tried the b60-build. Paradroidz gained around 18% more performance when using the client VM compared to 1.5 (that’s on an Athlon64 3000+). Great work so far!

With 1.6.0-rc-build60, I’m seeing a 25% speed-up for a heavy duty number crunching module in a visualization application - particle tracking in a heterogeneous mesh. Kudos to the VM Team !

Did some SciMark2.0a benchmarking on my own with the following system:

AMD 64 Venice 3000+ @ stock 1.8GHz and oc-ed 2.8GHz
2 x 512 PDP Patriot XBLK PC3200 @ stock DDR400 and oc-ed DDR560
WinXP SP2

The benchmarks were run without the “large” flag and for a minimum period of 10 sec. Neither a longer time nor the -Xcomp flag made any significant difference. The 1.6.0 server benchmark was run without the -Xcomp flag (more on that later):


CPU @ 1.8GHz, RAM @ 2-3-2-5, 1T, DDR400
                                  <- 1.5.0_04 ->     <- 1.6.0-rc-b60 ->
SciMark 2.0a                      client  server       client   server
Composite Score                   221     369          315      391
FFT (1024)                        107     289          276      364
SOR (100x100)                     424     591          433      585
Monte Carlo                        39     79            66      110
Sparse matmult (N=1000, nz=5000)  200     236          361      210
LU (100x100)                      333     650          442      687

CPU @ 2.8GHz, RAM @ 2.5-3-3-8, 1T, DDR560
                                  <- 1.5.0_04 -> 
SciMark 2.0a                      client  server
Composite Score                   344     575
FFT (1024)                        166     451
SOR (100x100)                     660     919
Monte Carlo                        62     123
Sparse matmult (N=1000, nz=5000)  315     368
LU (100x100)                      520     1011

Nice speed-ups with 1.6.0! The sparse matrix multiplication kernel with the server option produces timings that seem off: 1.6.0 server (210) is slower than both 1.5.0_04 server (236) and 1.6.0 client (361). I don’t know what the reason is. 2D arrays? With the system overclocked, client and server (1.5.0_04) show near-identical speedups that match the CPU clock speedup.
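For anyone who wants to check the arithmetic, here is a small sketch that derives the ratios from the scores in the tables above. The class and method names are made up for the example; the numbers are copied straight from the tables. The overclock ratios all come out at roughly 1.56, and the 1.6.0 vs 1.5.0_04 sparse matmult ratio on the server VM comes out below 1.

```java
// Hypothetical helper; the scores are copied from the tables above.
final class Speedup {
    static double ratio(double after, double before) {
        return after / before;
    }

    public static void main(String[] args) {
        // 1.5.0_04 composite scores at 1.8GHz and 2.8GHz.
        System.out.println("clock 2.8GHz vs 1.8GHz       : " + ratio(2.8, 1.8));
        System.out.println("client composite, overclocked: " + ratio(344, 221));
        System.out.println("server composite, overclocked: " + ratio(575, 369));

        // 1.6.0-rc-b60 vs 1.5.0_04 at 1.8GHz.
        System.out.println("client composite, 1.6 vs 1.5 : " + ratio(315, 221));
        System.out.println("server composite, 1.6 vs 1.5 : " + ratio(391, 369));
        System.out.println("server sparse,    1.6 vs 1.5 : " + ratio(210, 236));
    }
}
```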

Azeem ! Is the SSE2 support for AMD64 fully complete in the compiler ? IIRC, not long ago, you said something to the effect that it wasn’t (in the server compiler).

Also, when using the -Xcomp option together with the -server option in 1.6.0, there is a gross degradation in performance, as shown below:


> java -server  -Xcomp -Xms256m -Xmx256m jnt/scimark2/commandline 20

SciMark 2.0a

Composite Score: 299.8784709508011
FFT (1024): 227.93827199159992
SOR (100x100):   586.2128746430392
Monte Carlo : 90.42036412631579
Sparse matmult (N=1000, nz=5000): 167.26901426400147
LU (100x100): 427.5518297290492

java.vendor: Sun Microsystems Inc.
java.version: 1.6.0-rc
os.arch: x86
os.name: Windows XP
os.version: 5.1

Uninstalled b60 again. It’s giving me UnknownHostExceptions with Web Start for hosts that definitely exist, and it has some problems with the applets from this site: http://www.tokima.com

SSE2 is fully supported on AMD64. There’s no SIMD/auto-vectorization support yet.

Does that mean you guys are looking at autovectorization for a future release?

Should there be a performance improvement in 1.6.0-rc-b60 over 1.5.0_04 for int multiplication and division in particular? What I see seems quite insignificant.
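One rough way to check this on your own machine is a small loop like the one below, run under both JREs with the same flags. It is only a sketch: the class name, the kernel and the iteration counts are invented for the example, and micro-benchmarks like this are very sensitive to warm-up and to whatever else the JIT decides to do with the loop.

```java
// Hypothetical micro-benchmark; names and iteration counts are arbitrary.
final class IntMulDivBench {
    // Mixes an int multiply and an int divide per iteration. The result is
    // returned so the JIT cannot discard the loop as dead code.
    static int kernel(int n) {
        int acc = 1;
        for (int i = 1; i <= n; i++) {
            acc = acc * i + 7;     // int multiply (wraps on overflow)
            acc ^= acc / (i | 1);  // int divide; (i | 1) is never zero
        }
        return acc;
    }

    public static void main(String[] args) {
        int n = 10000000;
        int sink = 0;
        // Warm-up runs so kernel() is compiled before timing starts.
        for (int i = 0; i < 10; i++) {
            sink ^= kernel(n);
        }
        long start = System.nanoTime();
        sink ^= kernel(n);
        long elapsedMs = (System.nanoTime() - start) / 1000000L;
        System.out.println("kernel(" + n + ") took " + elapsedMs + " ms (sink=" + sink + ")");
    }
}
```

Run it a few times under each VM and compare the timings, not a single run.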

FWIW, I ran the C version of SciMark2 (equivalent to the Java version, I suppose; I haven’t checked) built with VC++ 6.0/SP6 :slight_smile: and the following makefile flags (AMD64 system as described earlier, CPU @ 1.8GHz):
CC = cl -Za -W3
CFLAGS = -nologo -O2x /G6
$(CC) $(CFLAGS) scimark2.obj $(OBJS)

The results are below ;D


> scimark2.exe 10

**                                                              **
** SciMark2 Numeric Benchmark, see http://math.nist.gov/scimark **
** for details. (Results can be submitted to pozo@nist.gov)     **
**                                                              **
Using      10.00 seconds min time per kenel.
Composite Score:          177.35
FFT             Mflops:   108.91    (N=1024)
SOR             Mflops:   387.21    (100 x 100)
MonteCarlo:     Mflops:    40.66
Sparse matmult  Mflops:   168.28    (N=1000, nz=5000)
LU              Mflops:   181.67    (M=100, N=100)

Any ideas on what compiler and flags would give the C version a fair trial?

Could you please post the stack trace for one of these UnknownHostExceptions? I recently tracked down a problem with a change in Java Web Start in a relatively early Mustang build which was causing problems like this and would like to know if the problem you’re seeing is related. FYI, please see bugs 6228306, 6346071 and related bugs. There is currently no bug filed against Java Web Start because of the regressions caused by 6228306 though I’m still pushing the deployment team to back out that fix.

This is the one I get in the latest Mustang:

java.net.UnknownHostException: javagamesfactory.org
	at java.net.PlainSocketImpl.connect(Unknown Source)
	at java.net.Socket.connect(Unknown Source)
	at java.net.Socket.connect(Unknown Source)
	at sun.net.NetworkClient.doConnect(Unknown Source)
	at sun.net.www.http.HttpClient.openServer(Unknown Source)
	at sun.net.www.http.HttpClient.openServer(Unknown Source)
	at sun.net.www.http.HttpClient.<init>(Unknown Source)
	at sun.net.www.http.HttpClient.New(Unknown Source)
	at sun.net.www.http.HttpClient.New(Unknown Source)
	at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(Unknown Source)
	at sun.net.www.protocol.http.HttpURLConnection.plainConnect(Unknown Source)
	at sun.net.www.protocol.http.HttpURLConnection.connect(Unknown Source)
	at com.sun.deploy.net.BasicHttpRequest.doRequest(Unknown Source)
	at com.sun.deploy.net.BasicHttpRequest.doRequest(Unknown Source)
	at com.sun.deploy.net.BasicHttpRequest.doGetRequest(Unknown Source)
	at com.sun.deploy.net.DownloadEngine.actionDownload(Unknown Source)
	at com.sun.deploy.net.DownloadEngine.getCacheEntry(Unknown Source)
	at com.sun.deploy.net.DownloadEngine.getResourceCacheEntry(Unknown Source)
	at com.sun.deploy.net.DownloadEngine.getResourceCacheEntry(Unknown Source)
	at com.sun.deploy.net.DownloadEngine.getResource(Unknown Source)
	at com.sun.deploy.net.DownloadEngine.getResource(Unknown Source)
	at com.sun.javaws.LaunchDownload.getUpdatedLaunchDesc(Unknown Source)
	at com.sun.javaws.Launcher.downloadJNLPFile(Unknown Source)
	at com.sun.javaws.Launcher.prepareLaunchFile(Unknown Source)
	at com.sun.javaws.Launcher.prepareToLaunch(Unknown Source)
	at com.sun.javaws.Launcher.launch(Unknown Source)
	at com.sun.javaws.Main.launchApp(Unknown Source)
	at com.sun.javaws.Main.continueInSecureThread(Unknown Source)
	at com.sun.javaws.Main$1.run(Unknown Source)
	at java.lang.Thread.run(Unknown Source)

That’s what I’m getting too.
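If it helps narrow things down, a tiny standalone check like the one below can be run with the same JRE outside Web Start. It only asks `InetAddress.getByName` to resolve a host name. The class name is made up, and the default host is just the one from the stack trace above. If the name resolves here but Web Start still fails, that would point toward Web Start’s own network handling and away from the machine’s DNS setup, though that inference is my assumption and not something confirmed in this thread.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical diagnostic; not part of Web Start or the JRE.
final class ResolveCheck {
    // Returns true if the JRE can resolve the given host name or IP literal.
    static boolean canResolve(String host) {
        try {
            InetAddress.getByName(host);
            return true;
        } catch (UnknownHostException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Host taken from the stack trace above unless one is passed in.
        String host = args.length > 0 ? args[0] : "javagamesfactory.org";
        System.out.println(host + " resolves: " + canResolve(host));
    }
}
```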