I guess I have a reputation for microbenchmarks and struct-rambling over here… and here we go again.
I've made an effort to build a struct framework that works for the time being, until Sun ships structs in the JIT/VM in Java 7.0 (or later?).
It is basically a source-code generator plus a utility class that handles the ‘sliding window’ behaviour of the structs.
For all those who don’t have the patience to read the whole thing and want to see some performance-benchmarks…
Three datasets (a, b and c) of 5000 3D vectors each:
c = a + t * ( b - a )
buffer = 15us // FloatBuffer(3 * 5000), absolute put/get
struct = 24us // my struct implementation
arrays = 40us // float[3 * 5000]
object = 74us // Vec3[1 * 5000]
gasp
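For anyone who wants to reproduce the comparison, here is a rough sketch of the kernel for the two simplest cases (float[] and FloatBuffer with absolute get/put). It is not the actual benchmark code: the class name LerpSketch, the helper directFloats and the choice of direct, native-order buffers are assumptions made for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

// Hypothetical helper class, not part of the framework described in this post.
final class LerpSketch {
    // c = a + t * (b - a), component by component, over plain float arrays.
    static void lerpArray(float[] a, float[] b, float[] c, float t) {
        for (int i = 0; i < c.length; i++) {
            c[i] = a[i] + t * (b[i] - a[i]);
        }
    }

    // Same computation using absolute get(int) / put(int, float) on FloatBuffers,
    // so the buffers' own positions are never touched.
    static void lerpBuffer(FloatBuffer a, FloatBuffer b, FloatBuffer c, float t) {
        int n = c.capacity();
        for (int i = 0; i < n; i++) {
            float av = a.get(i);
            c.put(i, av + t * (b.get(i) - av));
        }
    }

    // Assumption: a direct buffer in native byte order; 4 bytes per float.
    static FloatBuffer directFloats(int count) {
        return ByteBuffer.allocateDirect(count * 4)
                         .order(ByteOrder.nativeOrder())
                         .asFloatBuffer();
    }
}
```

For the dataset above you would allocate 3 * 5000 floats per buffer and time many iterations after a warm-up, since the JIT needs a while to kick in.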
First there is the StructGenerator class:
public static String createSourcecode(String className, Class[] types, String[] names)
Usage:
class Particle
{
float x, y, z;
int state;
}
Class[] primitives = new Class[]{float.class, float.class, float.class, int.class};
String[] names = new String[]{"x", "y", "z", "state"};
String code = StructGenerator.createSourcecode("Particle", primitives, names);
Now we have source code that we can compile. This is a one-time effort.
The generated class has a static method that gets us a StructBuffer for that specific class:
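To give an idea of what such a generator has to do, here is a minimal sketch. It is not the real StructGenerator: the class StructSourceSketch, its method names and the shape of the emitted code are all hypothetical, fields are packed back to back without alignment padding, and the generated view keeps its own index instead of using a StructBuffer.

```java
// Hypothetical, simplified generator; only illustrates the offset bookkeeping.
final class StructSourceSketch {
    // Size in bytes of each supported primitive type.
    static int sizeOf(Class<?> type) {
        if (type == byte.class) return 1;
        if (type == short.class || type == char.class) return 2;
        if (type == int.class || type == float.class) return 4;
        if (type == long.class || type == double.class) return 8;
        throw new IllegalArgumentException("unsupported field type: " + type);
    }

    // ByteBuffer accessor suffix: getFloat/putFloat, getInt/putInt, plain get/put for byte.
    static String suffix(Class<?> type) {
        if (type == byte.class) return "";
        String n = type.getName();
        return Character.toUpperCase(n.charAt(0)) + n.substring(1);
    }

    static String generate(String className, Class<?>[] types, String[] names) {
        if (types.length != names.length) {
            throw new IllegalArgumentException("types and names must have the same length");
        }
        StringBuilder body = new StringBuilder();
        int offset = 0; // running byte offset of the current field
        for (int i = 0; i < types.length; i++) {
            int size = sizeOf(types[i]);
            String t = types[i].getName();
            String s = suffix(types[i]);
            body.append("    public ").append(t).append(' ').append(names[i])
                .append("() { return data.get").append(s)
                .append("(base + ").append(offset).append("); }\n");
            body.append("    public void ").append(names[i]).append('(').append(t)
                .append(" v) { data.put").append(s)
                .append("(base + ").append(offset).append(", v); }\n");
            offset += size;
        }
        return "public final class " + className + " {\n"
             + "    public static final int SIZEOF = " + offset + ";\n"
             + "    private final java.nio.ByteBuffer data;\n"
             + "    private int base;\n"
             + "    public " + className + "(java.nio.ByteBuffer data) { this.data = data; }\n"
             + "    public void position(int index) { base = index * SIZEOF; }\n"
             + body
             + "}\n";
    }
}
```

Feeding it the Particle description from above (three floats and an int) yields a 16-byte struct with x, y, z and state at byte offsets 0, 4, 8 and 12.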
int particleCount = 1024;
StructBuffer buf = Particle.createStructBuffer(particleCount);
The StructBuffer holds the position of the ‘sliding window’. We will use the Particle like this:
Particle p = new Particle(buf);
buf.position(13);
p.x(0.4f);
p.y(0.5f);
p.z(0.6f);
p.state(-5);
buf.position(14);
p.x(0.3f);
p.y(0.2f);
p.z(0.1f);
p.state(42);
So the Particle isn’t sliding over the data, but the data is sliding underneath the Particle.
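In case the ‘sliding window’ idea is hard to picture, here is a hand-written sketch of the mechanism. WindowBuffer and ParticleView are hypothetical stand-ins for StructBuffer and the generated Particle, not the real classes; the 16-byte layout (x, y, z, state) and the direct, native-order backing buffer are assumptions.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical stand-in for StructBuffer: owns the bytes and the window position.
final class WindowBuffer {
    final ByteBuffer backing;
    final int stride;   // size of one struct in bytes
    private int base;   // byte offset of the struct currently under the window

    WindowBuffer(int stride, int count) {
        this.stride = stride;
        this.backing = ByteBuffer.allocateDirect(stride * count)
                                 .order(ByteOrder.nativeOrder());
    }

    // Slide the data underneath every view attached to this buffer.
    void position(int index) {
        if (index < 0 || (index + 1) * stride > backing.capacity()) {
            throw new IndexOutOfBoundsException("struct index " + index);
        }
        base = index * stride;
    }

    int base() { return base; }
}

// Hypothetical stand-in for the generated Particle: holds no data itself,
// every accessor reads or writes at (window base + field offset).
final class ParticleView {
    static final int SIZEOF = 16; // 3 floats + 1 int, no padding
    private final WindowBuffer buf;

    ParticleView(WindowBuffer buf) { this.buf = buf; }

    float x()          { return buf.backing.getFloat(buf.base()); }
    void  x(float v)   { buf.backing.putFloat(buf.base(), v); }
    float y()          { return buf.backing.getFloat(buf.base() + 4); }
    void  y(float v)   { buf.backing.putFloat(buf.base() + 4, v); }
    float z()          { return buf.backing.getFloat(buf.base() + 8); }
    void  z(float v)   { buf.backing.putFloat(buf.base() + 8, v); }
    int   state()      { return buf.backing.getInt(buf.base() + 12); }
    void  state(int v) { buf.backing.putInt(buf.base() + 12, v); }
}
```

Only one ParticleView object is ever allocated, no matter how many particles live in the buffer, which is where the savings in object creation come from.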
What if we want two Particles accessing the same dataset?
StructBuffer pBuf0 = buf;
StructBuffer pBuf1 = buf.duplicate();
Particle p0 = new Particle(pBuf0);
Particle p1 = new Particle(pBuf1);
pBuf0.position(11);
p0...;
pBuf1.position(9);
p1...;
// p1.x = p0.y
p1.x( p0.y() );
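The same "shared data, independent position" behaviour can be seen with a plain ByteBuffer, which is presumably what a StructBuffer duplicate boils down to (that is a guess, not something confirmed here). The sketch below uses only standard java.nio calls; the class name DuplicateSketch and the 16-byte layout are assumptions. One thing to watch: ByteBuffer.duplicate() hands back a buffer in BIG_ENDIAN order, so the original byte order has to be re-applied.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class; shows two windows over one block of memory.
final class DuplicateSketch {
    static final int SIZEOF = 16; // assumed layout: x, y, z (float) + state (int)
    static final int X = 0;       // byte offset of x inside one struct
    static final int Y = 4;       // byte offset of y inside one struct

    static float demo() {
        ByteBuffer view0 = ByteBuffer.allocateDirect(SIZEOF * 16)
                                     .order(ByteOrder.nativeOrder());
        // Same bytes, separate position; restore the byte order lost by duplicate().
        ByteBuffer view1 = view0.duplicate().order(view0.order());

        view0.position(11 * SIZEOF); // window 0 sits on element 11
        view1.position(9 * SIZEOF);  // window 1 sits on element 9

        // p0.y = 0.5 (absolute put, so the position stays where it is)
        view0.putFloat(view0.position() + Y, 0.5f);

        // p1.x = p0.y
        view1.putFloat(view1.position() + X,
                       view0.getFloat(view0.position() + Y));

        // The write made through view1 is visible through view0 as well.
        return view0.getFloat(9 * SIZEOF + X);
    }
}
```

Moving one window never moves the other, so two struct views can walk the same dataset at different indices.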
Once we’re done manipulating the particle-data, we can extract the ByteBuffer from the StructBuffer like this:
ByteBuffer bb = buf.getBacking();
As the numbers above show, the struct version is about 3 times as fast (!!) as iterating over an array of “struct-objects” (Vec3), and runs at about 63% of the speed of directly manipulating the FloatBuffers (15us vs. 24us). This will only get better once Sun natively implements structs in the VM. It does, however, already take away the burden of massive GC and object creation.
I’m finalizing the source code at the moment, but performance is kinda stuck at this level (which is nothing to be ashamed about IMO :))
Is this a usable design / framework? Are there any suggestions on how to change things? I’d like to hear your comments.