People here think that mapped objects will increase the performance of their applications. In this thread I want to show you an example of how to use standard Java classes for a decent speed-up that wouldn’t be possible with mapped objects (at least I guess so).
NOTE: This is NOT meant as an offense to anybody. I just think people are focusing too much on mapped objects. Please prove me wrong on anything I write; that’s what a discussion forum is for!
First, I’d ask you to read my personal definitions of structs and mapped objects, which I posted in this topic, just so we have a common base. Sorry for the cross-linking:
Ok, here we go.
I wanted to use a real-world example from games, so I thought the vertices of a triangle mesh might be a good one, since you have to put them into a Buffer in order to send them to the graphics card.
Let’s say a vertex consists of a position, a normal, a tangent and a 2D texture coordinate, so we can do normal/parallax mapping (I assume the binormal is generated in the shader). Further, there has to be a performance benefit from using mapped objects, so let the data be dynamic, because we do something like software skinning. However, the texture coordinates are static, as they should be for most types of applications.
First, I’ll try to imagine how a Vertex class may look as a MappedObject.
In order to make it simple I just use fields and assume accessing them manipulates or reads from a buffer.
(Since there are lots of different possible implementations described on this board, please comment on how to change the following.)
A purely simple approach would be:
class Vertex extends MappedObject {
float posX, posY, posZ;
float normX, normY, normZ;
float tanX, tanY, tanZ;
float texX, texY;
}
but I guess, it should be possible to make something like this:
class Vector3 extends MappedObject {
float x,y,z;
}
class Vector2 extends MappedObject {
float x,y;
}
class Vertex extends MappedObject {
Vector3 pos, norm, tan;
Vector2 tex;
}
This will end up in either an array of those mapped objects:
Vertex[] vertices;
or like Riven proposed in a StructBuffer:
Vertex vertices = new Vertex(buf);
This is where IMHO the first problem occurs: on the application side a vertex is a logical unit, but for the graphics card you have to split up static and dynamic data. Well, you don’t have to, but I’m sure everyone will agree that sending static data (here the texture coordinates) over the bus every single frame really decreases performance!
class DynamicVertex extends MappedObject {
Vector3 pos, norm, tan;
}
DynamicVertex[] dynamicVerts;
Vector2[] staticVerts;
OK, it should be possible to create both from the same underlying ByteBuffer the vertices are mapped to, but then your dynamic data isn’t packed tightly anymore. Although it is possible to adjust the stride parameter, e.g. for glVertexPointer, OpenGL performs much better with tight data. However, the main problem is that you have to send the whole buffer down to the graphics card. The only proper solution is to have two Buffers, one for the static and one for the dynamic data.
Now my question is: can an object be mapped to two different buffers?
I guess not, at least not while maintaining the promised performance.
Summing up, from my understanding there is no way to have two different Buffers (static and dynamic) and an object mapped to both; here, that would be the Vertex class with its positions, normals and tangents mapped to the first buffer and the texture coordinates mapped to the second. IMHO not having a single Vertex class makes for somewhat ugly code.
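To make the two-buffer split concrete, here is a minimal sketch with plain (non-mapped) classes and java.nio.FloatBuffer. The names SplitBufferSketch, packStatic and packDynamic are made up for illustration, and FloatBuffer.allocate is used only for brevity; a real renderer would most likely use a direct buffer and hand it to OpenGL, which is left out here.

```java
import java.nio.FloatBuffer;

// Hypothetical sketch: one logical Vertex class, two separate buffers.
final class SplitBufferSketch {
    static final class Vector3 {
        float x, y, z;
        Vector3(float x, float y, float z) { this.x = x; this.y = y; this.z = z; }
    }

    static final class Vector2 {
        float x, y;
        Vector2(float x, float y) { this.x = x; this.y = y; }
    }

    static final class Vertex {
        Vector3 pos, norm, tan;
        Vector2 tex;
        Vertex(Vector3 pos, Vector3 norm, Vector3 tan, Vector2 tex) {
            this.pos = pos; this.norm = norm; this.tan = tan; this.tex = tex;
        }
    }

    static final int DYNAMIC_FLOATS = 9; // pos + norm + tan
    static final int STATIC_FLOATS = 2;  // tex

    // Static data: packed once, e.g. at load time.
    static FloatBuffer packStatic(Vertex[] vertices) {
        FloatBuffer buf = FloatBuffer.allocate(vertices.length * STATIC_FLOATS);
        for (Vertex v : vertices) {
            buf.put(v.tex.x).put(v.tex.y);
        }
        buf.flip();
        return buf;
    }

    // Dynamic data: repacked tightly (no stride gaps) whenever it changes.
    // The caller provides a buffer with room for vertices.length * DYNAMIC_FLOATS floats.
    static void packDynamic(Vertex[] vertices, FloatBuffer buf) {
        buf.clear();
        for (Vertex v : vertices) {
            put(buf, v.pos);
            put(buf, v.norm);
            put(buf, v.tan);
        }
        buf.flip();
    }

    private static void put(FloatBuffer buf, Vector3 v) {
        buf.put(v.x).put(v.y).put(v.z);
    }

    public static void main(String[] args) {
        Vertex[] verts = {
            new Vertex(new Vector3(1, 2, 3), new Vector3(0, 1, 0),
                       new Vector3(1, 0, 0), new Vector2(0.5f, 0.25f))
        };
        FloatBuffer staticBuf = packStatic(verts);
        FloatBuffer dynamicBuf = FloatBuffer.allocate(verts.length * DYNAMIC_FLOATS);
        packDynamic(verts, dynamicBuf);
        System.out.println(staticBuf.remaining() + " static floats, "
                + dynamicBuf.remaining() + " dynamic floats");
    }
}
```

With this layout only the dynamic buffer has to be refilled and sent each frame, while the application still sees a single Vertex class.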
So far I have focused on a possible limitation of mapped objects; from now on I’ll try to explain how Java’s standard classes (reference types) can be used to increase performance.
What astonishes me most is that Java guys still think in a C/C++ manner. I know a lot of people who complain that classes in Java, in contrast to C#, can only be reference types.
Compared to C/C++, an array of a class is an array of pointers:
Vector3[] vecs = new Vector3[size]; // Java
Vector3** pVecs = new Vector3*[size]; // C++
Actually, this gave me an idea for how to optimise my mesh class:
As the graphics guys among you know, some of the vertex data relies on face information, like the normals (smoothing groups/hard edges) or materials (different textures/colors). Therefore you have to split a vertex whenever two neighbouring faces either have
- a hard edge, resulting in different normals for the same position
- different materials, resulting in different texture coordinates for the same position
Since the tangent depends on both the normals and the texture coordinates, it will be different for the same position if either of the above cases is true.
Most implementations, however, don’t use the information that the positions of the duplicated vertices are the same. The same is true for the normals of vertices split by different materials.
Therefore, I use a representation like this:
class Vertex {
Vector3 pos, norm, tan;
Vector2 tex;
}
Vector3[] positions;
Vector3[] normals;
Vector2[] texcoords;
Vector3[] tangents;
Vertex[] vertices;
Since all of these are reference types (arrays of pointers), they point to the same data (e.g., positions[i] == vertices[j].pos is true for all vertices j that were duplicated from position i because of face materials and hard edges).
All vector arrays are duplicate-free, which saves you from doing the same modification multiple times. This means you modify a position only once, even when the faces that refer to vertices referencing this position have different face materials or hard edges. The same goes for the normals…
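A minimal sketch of this sharing, assuming a made-up case where one position is split into two vertices by a hard edge. SharedPositionSketch and translateAll are hypothetical names, and a simple translation stands in for whatever modification (skinning, morphing) is really applied.

```java
// Hypothetical sketch: two split vertices share one Vector3 position object.
final class SharedPositionSketch {
    static final class Vector3 {
        float x, y, z;
        Vector3(float x, float y, float z) { this.x = x; this.y = y; this.z = z; }
    }

    static final class Vertex {
        final Vector3 pos, norm;
        Vertex(Vector3 pos, Vector3 norm) { this.pos = pos; this.norm = norm; }
    }

    // Modifies every unique position exactly once and returns
    // how many modifications were performed.
    static int translateAll(Vector3[] positions, float dx, float dy, float dz) {
        for (Vector3 p : positions) {
            p.x += dx;
            p.y += dy;
            p.z += dz;
        }
        return positions.length;
    }

    public static void main(String[] args) {
        Vector3 shared = new Vector3(0, 0, 0);
        Vector3[] positions = { shared };                           // duplicate-free
        Vector3[] normals = { new Vector3(0, 1, 0), new Vector3(1, 0, 0) };
        Vertex[] vertices = {                                       // split by a hard edge
            new Vertex(shared, normals[0]),
            new Vertex(shared, normals[1])
        };

        int modifications = translateAll(positions, 1, 2, 3);

        System.out.println(modifications + " modification(s) for "
                + vertices.length + " vertices");
        // Both vertices see the change because they reference the same object.
        System.out.println("same object: " + (vertices[0].pos == vertices[1].pos));
        System.out.println("vertex 1 x: " + vertices[1].pos.x);
    }
}
```

The modification loop runs over the duplicate-free positions array, not over the vertices, which is where the saving comes from.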
Of course the benefit depends on the data; actually there is none if a mesh has only a single material and no hard edges. For my models, the number of position transformations usually drops to ~2/3.
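To check what a particular mesh would gain, one could count unique positions against split vertices. The sketch below is only an assumption of how such a check might look: UniquePositionRatio and ratio are hypothetical names, positions are given as plain {x, y, z} float triples, and equality is exact (no epsilon), so nearly identical positions count as different.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: how many position transformations remain
// after sharing identical positions between split vertices?
final class UniquePositionRatio {
    // vertexPositions holds one {x, y, z} triple per split vertex (must be non-empty).
    // Returns uniquePositions / totalVertices, e.g. about 0.67 for "reduced to ~2/3".
    static double ratio(float[][] vertexPositions) {
        Set<List<Float>> unique = new HashSet<>();
        for (float[] p : vertexPositions) {
            unique.add(Arrays.asList(p[0], p[1], p[2])); // exact comparison
        }
        return (double) unique.size() / vertexPositions.length;
    }

    public static void main(String[] args) {
        float[][] splitVertices = {
            { 0f, 0f, 0f },
            { 0f, 0f, 0f }, // duplicate created by a hard edge or material change
            { 1f, 0f, 0f }
        };
        System.out.println("remaining fraction of transformations: " + ratio(splitVertices));
    }
}
```

A ratio of 1.0 means there is nothing to share, which matches the single-material, no-hard-edges case.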
You say saving 1/3 isn’t much? Keep in mind that, for example, software skinning usually transforms a position 1-4 times. Further, there might be other modifications as well (morph targets used to simulate muscle contraction, …). With all that you can speed up modification by, say, about a factor of 2 (double the FPS, if this is your bottleneck :)), which isn’t bad IMHO.
Please tell me whether this technique would be possible with mapped objects. If not, I fear I’d lose my 2x speed-up just to remove the IMHO not so significant copy operation to the buffer.
This brings me to my conclusion: what is the bottleneck? I’ll never complain that copying values to a buffer affects the performance of my applications as long as I’m not sure whether there are other possible optimizations with a greater impact.
Looking forward to your opinions.