Oh for the love of … the things I cannot unsee!
Sadly the image is broken now
Maybe I could afford you, too
Cas
Image back! Honestly, it’s the only reason I brought the server back up.
Changes:
- added support for .map(address, capacity)
- added support for user-defined default constructor (as opposed to crashing!)
public class MappedVec3 extends MappedObject
{
    public float x, y, z;

    public MappedVec3()
    {
        this.x = this.y = 13.37f;
    }
}

MappedVec3 vec3s = MappedVec3.map(address, capacity);
if (vec3s.x != 0.0f)
    throw new IllegalStateException();
vec3s.runViewConstructor();
if (vec3s.x != 13.37f)
    throw new IllegalStateException();
vec3s.view = 1;
if (vec3s.x != 0.0f)
    throw new IllegalStateException();
Will definitely try this out when it’s released with LWJGL! =D
PS: Exactly when will it be added to LWJGL? =S
Edit 2: Couldn’t hold my breath any longer so I tried it out. Managed to get it working, but what the heck is up with the fork() method?!
It relaunches the application, within the same JVM, with a new classloader that transforms all classes before they are loaded.
It is an alternative to the Java Instrumentation Agent, which I expected would be a bit too much to ask of most developers, and which would probably not work with Java Web Start (whoever is still using that piece of tech).
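To make the relaunch idea concrete, here is a minimal sketch of a class loader that reads application class files itself, passes the bytes through a transform hook, and then re-invokes main() through that loader. This is not the library's fork() implementation: RelaunchingLoader, transform(), relaunch() and the package-prefix parameter are all hypothetical names, and the hook returns the bytes unchanged where a real transformer (for example one built on ASM) would rewrite field accesses.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.lang.reflect.Method;

// Hypothetical sketch; not the library's actual fork() code.
class RelaunchingLoader extends ClassLoader {
    private final String appPackagePrefix; // e.g. "com.example." (assumed), must not be empty

    RelaunchingLoader(ClassLoader parent, String appPackagePrefix) {
        super(parent);
        this.appPackagePrefix = appPackagePrefix;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        // JDK and library classes go through normal parent delegation.
        if (!name.startsWith(appPackagePrefix)) {
            return super.loadClass(name, resolve);
        }
        synchronized (getClassLoadingLock(name)) {
            Class<?> loaded = findLoadedClass(name);
            if (loaded == null) {
                // Application classes are defined here, after the transform hook ran.
                byte[] original = readClassBytes(name);
                byte[] transformed = transform(name, original);
                loaded = defineClass(name, transformed, 0, transformed.length);
            }
            if (resolve) {
                resolveClass(loaded);
            }
            return loaded;
        }
    }

    // Placeholder hook: returns the bytes untouched.
    protected byte[] transform(String className, byte[] classBytes) {
        return classBytes;
    }

    private byte[] readClassBytes(String name) throws ClassNotFoundException {
        String resource = name.replace('.', '/') + ".class";
        try (InputStream in = getParent().getResourceAsStream(resource)) {
            if (in == null) {
                throw new ClassNotFoundException(name);
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            for (int n; (n = in.read(chunk)) != -1; ) {
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new ClassNotFoundException(name, e);
        }
    }

    // Runs mainClassName.main(args) again, this time through the transforming loader.
    // The main class must be public for the reflective call to succeed.
    static void relaunch(String mainClassName, String prefix, String[] args) throws Exception {
        ClassLoader parent = RelaunchingLoader.class.getClassLoader();
        Class<?> main = new RelaunchingLoader(parent, prefix).loadClass(mainClassName);
        Method entry = main.getMethod("main", String[].class);
        entry.invoke(null, (Object) args);
    }
}
```

In a setup like this the application would call relaunch(...) at the top of its own main() and return afterwards, with some guard so that the relaunched copy does not relaunch itself again.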
I’m really curious about the speed improvements. Is anyone working on a benchmark?
Mike
Field access is typically 10% slower due to the JVM unrolling codeblocks fewer times (Spasi discovered that). The real win is that you don’t have to pump all data from your object graph to your buffer every frame. That might seem like taking away minor overhead, but it allows for major speed improvements.
You can do everything my lib does by rewriting all your code to be like buffer.get(…) and buffer.put(…, …), it isn’t anything magic under the hood.
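As a rough illustration of that point, the following sketch writes by hand what a mapped three-float struct boils down to: a view index plus absolute getFloat/putFloat calls on a ByteBuffer. Vec3View and its accessor names are made up for this example and are not part of the library; the 12-byte stride simply assumes three packed floats.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical hand-written equivalent of a mapped Vec3 "struct" (3 floats, 12 bytes).
final class Vec3View {
    static final int SIZEOF = 12;

    private final ByteBuffer buffer;
    int view; // index of the element this view currently points at

    Vec3View(ByteBuffer buffer) {
        this.buffer = buffer;
    }

    // Byte offset of the current element.
    private int base() {
        return view * SIZEOF;
    }

    float x() { return buffer.getFloat(base()); }
    float y() { return buffer.getFloat(base() + 4); }
    float z() { return buffer.getFloat(base() + 8); }

    void x(float v) { buffer.putFloat(base(), v); }
    void y(float v) { buffer.putFloat(base() + 4, v); }
    void z(float v) { buffer.putFloat(base() + 8, v); }

    // Small demo: write element 0, then move the view to element 1.
    static float demo() {
        ByteBuffer data = ByteBuffer.allocateDirect(2 * SIZEOF).order(ByteOrder.nativeOrder());
        Vec3View v = new Vec3View(data);
        v.x(13.37f);
        v.view = 1;
        return v.x(); // element 1 was never written; new direct buffers are zero-filled
    }
}
```

Moving the view is just a change of the base offset, which is why reading x after setting view = 1 sees the untouched second element.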
I managed to figure that out by looking at the source code, but maybe you should mention that in the first post. I thought it was confusing but I might just be dumb…
http://indiespot.net/files/published/mappedobject-0.9.jar (view source)
Main-class: org.lwjgl.util.mapped.TestMappedObject
JARs asm-3.2.jar and asm-util-3.2.jar (asm.ow2.org) must be on the classpath.
Changes:
- added IllegalAccessError (when read-only fields are assigned) at transform time (loading the class and transforming it), as opposed to runtime (when the faulty field access actually occurs)
Bugfix:
- Solved verification error that occurred if the callsite of the transformed bytecode contained code that threw an Exception.
I could not reproduce the verification error on my system, so thanks to Spasi for investigating this issue and providing a workaround.
Nice! Another update!
- Did you know that it fails on private/internal MappedObject classes?
- Is there any way to disable the output? Kinda annoying when it floods the output window…
Also, I have a small “problem” with my particle engine test. MappedObjects sure eliminate the buffer operations that were actually bottlenecking the whole thing, but each particle also has data completely irrelevant to the rendering, and submitting 2-3x more data to the graphics card doesn’t seem like a very good optimization. Data I don’t want to send are things like the total lifetime, life left and current speed of the particle. It would be awesome if some of the data could be automatically stored outside of the buffer but still in memory for each MappedObject “struct”. Obviously I can do this myself by keeping a separate array with the other data, but it feels like I’m defeating the purpose of it all. I’ll try that out this evening (I’m in Japan, so it’s 20:00 here xD).
Were my questions so irrelevant that they do not need any answers? Sorry.
Your Particle class needs its own “non-rendering” data plus a single MappedObject instance of some sort that acts as a window into the “rendering” data. This may not be the most efficient way to do it, though…
Cas
The best way to improve the performance of a scenegraph is to stop using one and change to spatial partitioning. Other than that, converting tree/graph-like structures to be cache-oblivious is somewhat of a pain and for most people not worth the effort.
MY GOD. I DON’T BELIEVE IT. I rewrote my old particle engine test to eliminate some other boring bottlenecks, so it’s definitely not directly portable to a game anymore. It’s more of a benchmark for exactly what MappedObject is supposed to optimize. Guess what? It f*cking did. xD
With 250 000 particles:
Traditional puts: 106 FPS
MappedObject: 180 FPS
With 1 000 000 (THAT’S ONE MILLION DOTS):
Puts: 28 FPS
MappedObject: 51 FPS
NICE! 1.8x speedup! Ever heard of a laptop animating 1 million particles at 50 FPS using Java? xD In a real game you’d probably hit the fill rate bottleneck of your GPU way before your CPU starts slowing things down, at least.
Hi!
Can it be used to improve the performance of 3D scenegraphs, for example those using javax.vecmath or something similar?
Is the CCPL GPL-compatible?
Do you have a typical test case in which we would like to “transfer” a set of values from the CPU to the GPU and vice versa by using your API and any OpenCL binding?
Thank you very much for sharing your source code.
I don’t even know what half of those words mean. ;D
I’ve never used OpenCL before. It just improves performance by completely eliminating puts and gets from a buffer used to send or receive data from OpenGL or OpenCL. Extremely useful if you have a large amount of data being transferred. Most obvious applications include animation, CPU particle engines, terrain streaming and probably everything you do with OpenCL that requires communication with the CPU each run. And like Riven said before, it’s also more memory efficient.
@ Princec
I just used a MappedObject for the rendering data (position and color, totaling to 12 bytes per particle) and a separate Particle class to store the speed and state of the particle. Works wonders, as mentioned above.
EDIT: Ah, forgot. How the heck do I get rid of the debug output? Takes almost 20 seconds to start my test because of it. -.-
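The rendering/state split described in the post above can be sketched in plain Java, without the library. The 12-byte layout used here (two floats for position plus four colour bytes) is only an assumption that happens to add up to 12 bytes; the poster's actual layout is not given. ParticleSystem, Particle and the field names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical sketch: render data lives in a buffer, simulation state in plain objects.
final class ParticleSystem {
    // Assumed layout per particle: float x, float y, 4 bytes RGBA = 12 bytes.
    static final int STRIDE = 12;

    // CPU-only state that is never uploaded to the graphics card.
    static final class Particle {
        float vx, vy;
        float lifeLeft;
    }

    final ByteBuffer renderData; // the only memory handed to OpenGL
    final Particle[] state;

    ParticleSystem(int count) {
        renderData = ByteBuffer.allocateDirect(count * STRIDE).order(ByteOrder.nativeOrder());
        state = new Particle[count];
        for (int i = 0; i < count; i++) {
            state[i] = new Particle();
        }
    }

    // Advances every particle; positions are updated in place inside the buffer.
    void update(float dt) {
        for (int i = 0; i < state.length; i++) {
            Particle p = state[i];
            int base = i * STRIDE;
            renderData.putFloat(base, renderData.getFloat(base) + p.vx * dt);
            renderData.putFloat(base + 4, renderData.getFloat(base + 4) + p.vy * dt);
            p.lifeLeft -= dt;
        }
    }
}
```

With this split the upload stays at 12 bytes per particle no matter how much bookkeeping the Particle objects carry.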
There are 2 booleans in the MappedObjectTransformer class (first two lines). Set both to false.
- Did you know that it fails on private/internal MappedObject classes?
{
    MappedObjectTransformer.register(Test.Xyz.class);
}

public class Test
{
    @MappedType(sizeof = 12)
    public static class Xyz extends MappedObject
    {
        int x, y, z;
    }
}
Works fine.
You just have to register it, so it must be public.
This utility is very cool.
Just wondering, is it possible to extend it, or to use a similar idea (transforming bytecode / objects stored in native memory) to implement Structure of Arrays (vs classic OO Array of Structures) in an elegant way? One of the things that benefit from SoA are big particle systems where only part of the particle info needs to be sent to the GPU.
.rex
ps: I’m hiring graphics and tools people:
http://www.linkedin.com/jobs?viewJob=&jobId=1754526
http://www.linkedin.com/jobs?viewJob=&jobId=1754523
http://www.linkedin.com/jobs?viewJob=&jobId=1653942
(I run the Engine team and we use Java + OpenGL)
IMHO: I’d suggest explicitly breaking up the data rather than performing runtime weaving for SoA. I’ll let fans of DOP point you to those links.
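For reference, an explicit break-up along those lines could look like the sketch below: one array per attribute, with only the position stream held in a direct buffer that could be handed to the GPU. ParticlesSoA and all field names are hypothetical, and keeping x, y, z packed together inside the position stream is a choice made for this example, not something taken from the thread.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

// Hypothetical structure-of-arrays layout: one array per attribute.
final class ParticlesSoA {
    final int count;
    final FloatBuffer positions;    // x, y, z per particle; the only stream meant for the GPU
    final float[] velX, velY, velZ; // CPU-only attributes
    final float[] lifeLeft;

    ParticlesSoA(int count) {
        this.count = count;
        this.positions = ByteBuffer.allocateDirect(count * 3 * Float.BYTES)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
        this.velX = new float[count];
        this.velY = new float[count];
        this.velZ = new float[count];
        this.lifeLeft = new float[count];
    }

    // Integrates positions from velocities and ages every particle.
    void step(float dt) {
        for (int i = 0; i < count; i++) {
            int p = i * 3;
            positions.put(p, positions.get(p) + velX[i] * dt);
            positions.put(p + 1, positions.get(p + 1) + velY[i] * dt);
            positions.put(p + 2, positions.get(p + 2) + velZ[i] * dt);
            lifeLeft[i] -= dt;
        }
    }
}
```

Because each attribute is a separate array, a pass that only ages particles touches lifeLeft alone, and the upload covers nothing but positions.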