Minor optimisation for vertex arrays and a request

Ouch! I cannot assume that my users will switch VMs,
and this is what I see with the client VM 1.5.0_06 on Linux
when using absolute buffer access:


---> arr:       703ms   47M vertices/s
---> buf:       1596ms  21M vertices/s
---> pnt:       1240ms  27M vertices/s

Roughly 100% difference!
$%#! Marketing!! :frowning:
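The numbers above can be reproduced with a microbenchmark along these lines: one pass scaling a plain `float[]`, one pass doing the same work via absolute `get`/`put` on a direct `FloatBuffer`. This is a hypothetical sketch, not the actual benchmark used above; class and method names (`BufferBench`, `arrayPass`, `bufferPass`) are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

// Sketch of the arr-vs-buf comparison: scale N floats in a plain array,
// then do the same via absolute access on a direct FloatBuffer.
public class BufferBench {
    static final int N = 1 << 20;

    static long arrayPass(float[] a, float b) {
        long t0 = System.nanoTime();
        for (int i = 0; i < N; i++) {
            a[i] = a[i] * b;            // plain array access
        }
        return System.nanoTime() - t0;
    }

    static long bufferPass(FloatBuffer buf, float b) {
        long t0 = System.nanoTime();
        for (int i = 0; i < N; i++) {
            buf.put(i, buf.get(i) * b); // absolute buffer access, no position bookkeeping
        }
        return System.nanoTime() - t0;
    }

    public static void main(String[] args) {
        float[] arr = new float[N];
        FloatBuffer buf = ByteBuffer.allocateDirect(N * 4)
                                    .order(ByteOrder.nativeOrder())
                                    .asFloatBuffer();
        for (int i = 0; i < N; i++) { arr[i] = i; buf.put(i, i); }
        // warm up so the JIT compiles both loops before we time them
        for (int w = 0; w < 5; w++) { arrayPass(arr, 1.0f); bufferPass(buf, 1.0f); }
        System.out.println("arr: " + arrayPass(arr, 1.0001f) / 1000000 + " ms");
        System.out.println("buf: " + bufferPass(buf, 1.0001f) / 1000000 + " ms");
    }
}
```

As the next post points out, a loop this tiny is exactly the kind of microbenchmark that can mislead: warm-up, on-stack replacement, and the client/server compiler split all skew the result.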

While the peak performance of the HotSpot client compiler generally isn’t as good as with the server compiler, I have found that for non-trivial inner loops it is possible to get very good performance from it rivalling C/C++ speed. The VertexArrayRange demo in the jogl-demos workspace does exactly this and even back in 2002 achieved 90% of C++ speed. In 2003 we showed a skinning algorithm ported from C++ to Java running at something like 85% of the speed of C++. All of these presentations are archived on the JOGL home page. I wouldn’t make snap judgements based on microbenchmarks.

Sorry, didn’t want to make you repeat yourself :slight_smile:

I got your point, but it happens that in my application I am basically doing
just this: a = a*b, and I do this (or something comparably simple) a few times
over big sequential data before I shove it off to GL.
So I thought buffers must be ideal for this, and I was just very surprised
that these ops do not get optimized into the (IMHO) pretty obvious x86 asm.
Apparently it is better with the server VM, so I will consider sticking to buffers
and hope that the v6 client will eventually put buffers on steroids…
praying to the sun god… :slight_smile: :wink:
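One workaround for the slow per-element buffer access described above, assuming the data lives in a direct buffer only for GL's sake: do the a = a*b math in a plain `float[]`, where the client compiler handles the inner loop well, then bulk-copy into the direct buffer once per frame. The names here (`ScaleVertices`, `scaleAndUpload`) are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

// Sketch: keep working data in a plain float[], do the simple arithmetic
// there, and fill the direct buffer with one bulk put before the GL call.
public class ScaleVertices {
    static void scaleAndUpload(float[] verts, float b, FloatBuffer glBuf) {
        for (int i = 0; i < verts.length; i++) {
            verts[i] *= b;            // simple inner loop the JIT optimizes well
        }
        glBuf.clear();                // reset position/limit before refilling
        glBuf.put(verts);             // one bulk copy instead of N absolute puts
        glBuf.flip();                 // ready to hand to e.g. glVertexPointer
    }

    public static void main(String[] args) {
        float[] verts = { 1f, 2f, 3f, 4f, 5f, 6f };
        FloatBuffer glBuf = ByteBuffer.allocateDirect(verts.length * 4)
                                      .order(ByteOrder.nativeOrder())
                                      .asFloatBuffer();
        scaleAndUpload(verts, 2.0f, glBuf);
        System.out.println(glBuf.get(0) + " " + glBuf.get(5)); // 2.0 12.0
    }
}
```

The bulk `put(float[])` goes through an intrinsified copy rather than one bounds-checked call per element, which is why this pattern tends to beat an absolute-access loop on the client VM.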

SIMD will probably dwarf any results possible these days in Java.

4 MULs at the price of 1…

Somebody write me a wrapper :slight_smile: