I did this calculation for every element:
c.x = a.x * (1.0-weight) + b.x * weight
c.y = a.y * (1.0-weight) + b.y * weight
c.z = a.z * (1.0-weight) + b.z * weight
a, b and c were stored in different data structures:
FloatBuffer bbA, bbB, bbC;
Vector3f[] vcA, vcB, vcC;
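For reference, the two "safe" variants can be sketched roughly like this. This is not the benchmark code itself: `LerpSketch`, `Vec3` (a stand-in for `Vector3f`) and `directFloats` are hypothetical names, and the x,y,z interleaved buffer layout is an assumption.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

final class LerpSketch {
    /** Minimal stand-in for a Vector3f class (hypothetical, not the benchmarked type). */
    static final class Vec3 {
        float x, y, z;
    }

    /** c = a * (1 - weight) + b * weight, on arrays of vector objects. */
    static void lerp(Vec3[] a, Vec3[] b, Vec3[] c, float weight) {
        float inv = 1.0f - weight;
        for (int i = 0; i < c.length; i++) {
            c[i].x = a[i].x * inv + b[i].x * weight;
            c[i].y = a[i].y * inv + b[i].y * weight;
            c[i].z = a[i].z * inv + b[i].z * weight;
        }
    }

    /** Same math on FloatBuffers, assumed to be laid out as x,y,z,x,y,z,... */
    static void lerp(FloatBuffer a, FloatBuffer b, FloatBuffer c, float weight) {
        float inv = 1.0f - weight;
        int n = c.capacity();
        for (int i = 0; i < n; i++) {
            // Absolute get/put: every access goes through the buffer API.
            c.put(i, a.get(i) * inv + b.get(i) * weight);
        }
    }

    /** Allocates a direct (native) FloatBuffer holding 'count' floats. */
    static FloatBuffer directFloats(int count) {
        return ByteBuffer.allocateDirect(count * Float.BYTES)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
    }
}
```

The unsafe variants do the same arithmetic, but read and write the floats at raw native addresses instead of going through `FloatBuffer.get`/`put`.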
The results:
Running benchmark with 128 3d vecs...
math on Vec3[]: 3.9ms 32200 / sec <---
math on FloatBuffer: 20.4ms 6200 / sec
math on unsafe buffer: 3.3ms 38600 / sec <---
math on unsafe struct: 9.1ms 14000 / sec
Running benchmark with 256 3d vecs...
math on Vec3[]: 8.2ms 30900 / sec <---
math on FloatBuffer: 40.7ms 6200 / sec
math on unsafe buffer: 7.9ms 32200 / sec <---
math on unsafe struct: 16.2ms 15700 / sec
Running benchmark with 512 3d vecs...
math on Vec3[]: 16.5ms 31000 / sec <---
math on FloatBuffer: 74.9ms 6800 / sec
math on unsafe buffer: 14.4ms 35400 / sec <---
math on unsafe struct: 28.0ms 18200 / sec
Running benchmark with 1024 3d vecs...
math on Vec3[]: 31.5ms 32400 / sec <---
math on FloatBuffer: 150.2ms 6800 / sec
math on unsafe buffer: 29.3ms 34900 / sec <---
math on unsafe struct: 54.6ms 18700 / sec
Running benchmark with 2048 3d vecs...
math on Vec3[]: 66.4ms 30800 / sec <---
math on FloatBuffer: 299.4ms 6800 / sec
math on unsafe buffer: 58.9ms 34700 / sec <---
math on unsafe struct: 107.0ms 19100 / sec
Running benchmark with 4096 3d vecs...
math on Vec3[]: 144.1ms 28400 / sec
math on FloatBuffer: 593.7ms 6800 / sec
math on unsafe buffer: 127.3ms 32100 / sec <---
math on unsafe struct: 215.9ms 18900 / sec
Running benchmark with 8192 3d vecs...
math on Vec3[]: 1551.9ms 5200 / sec
math on FloatBuffer: 1212.3ms 6700 / sec
math on unsafe buffer: 276.3ms 29600 / sec <---
math on unsafe struct: 467.8ms 17500 / sec
Running benchmark with 16384 3d vecs...
math on Vec3[]: 3480.1ms 4700 / sec
math on FloatBuffer: 2666.8ms 6100 / sec
math on unsafe buffer: 960.2ms 17000 / sec
math on unsafe struct: 1193.1ms 13700 / sec
* Riven mumbles something about cache misses…
The unsafe struct was implemented as a single object used as a ‘sliding window’ struct, backed by an unsafe buffer.
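To illustrate the ‘sliding window’ idea: one reusable object is repositioned over each element instead of allocating an object per vector. The sketch below is only an approximation of that pattern. It backs the window with a direct FloatBuffer so that it runs without sun.misc.Unsafe, whereas the benchmarked version used an unsafe buffer; `Vec3Window` and its methods are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

/** One reusable view that slides over x,y,z triples in a buffer (hypothetical sketch). */
final class Vec3Window {
    private final FloatBuffer data;
    private int base; // index of the current element's x component

    Vec3Window(FloatBuffer data) {
        this.data = data;
    }

    /** Creates a window over a freshly allocated direct buffer of 'elements' vectors. */
    static Vec3Window allocate(int elements) {
        FloatBuffer fb = ByteBuffer.allocateDirect(elements * 3 * Float.BYTES)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
        return new Vec3Window(fb);
    }

    /** Repositions the window on the given element; no allocation happens here. */
    Vec3Window slide(int element) {
        base = element * 3;
        return this;
    }

    float x() { return data.get(base); }
    float y() { return data.get(base + 1); }
    float z() { return data.get(base + 2); }

    void set(float x, float y, float z) {
        data.put(base, x);
        data.put(base + 1, y);
        data.put(base + 2, z);
    }
}
```

In a loop you would call `c.slide(i).set(...)` with values read from `a.slide(i)` and `b.slide(i)`, which reads like field access on a struct while only three window objects ever exist.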
All results were averaged over 8 runs, after 8 warm-up runs.
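The timing scheme amounts to something like the following. This is a minimal sketch, not the actual harness: `BenchSketch` and `averageMillis` are hypothetical names, and the use of `System.nanoTime` is an assumption.

```java
final class BenchSketch {
    /**
     * Runs the task 'warmupRuns' times untimed (so the JIT can compile it),
     * then returns the average wall-clock time in milliseconds over 'timedRuns' runs.
     */
    static double averageMillis(Runnable task, int warmupRuns, int timedRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            task.run();
        }
        long totalNanos = 0;
        for (int i = 0; i < timedRuns; i++) {
            long start = System.nanoTime();
            task.run();
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / 1_000_000.0 / timedRuns;
    }
}
```

With the numbers used here, each benchmark would be called as `BenchSketch.averageMillis(task, 8, 8)`.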
Raw performance is only achievable on native buffers if you do your own pointer arithmetic. FloatBuffers are a no-go for performance, neither for direct calls nor for backing a struct. Unsafe ‘sliding window’ structs cut performance roughly in half, but that could be acceptable, as they are less error-prone and simply convenient.
If mapped objects were one day implemented in Java, with a ByteBuffer acting as a heap, there would be no need for bytecode weaving and we’d have the same performance as class fields.