Manipulating FloatBuffers

Since I want to use vertex arrays for animations (MD2 models, interpolating between keyframes), and JOGL's vertex arrays only accept FloatBuffers instead of float arrays, I need a way to manipulate the FloatBuffer's data.

With GL4Java I had no problem with this: I could calculate the vertex data and write it directly into the already existing float array, which was faster than issuing lots of gl* commands.

I don't want to use the put method, because the extra copy could kill the advantage I gained by using vertex arrays.
So how do I change the FloatBuffer's data?

Does really nobody use vertex arrays for animations?

What’s wrong with put()? That’s what you’re meant to use!

Cas :slight_smile:

Did you look at the source code?
put(float[])
copies every value into the FloatBuffer using a for loop.
It doesn't use the array I give it as its backing array.

This is a waste of time: I have to calculate the vertex data and then copy it using a for loop.
Why can't I write it directly into the float[] that backs the buffer?

Unless, of course, the get and put methods of a FloatBuffer are as fast as plain array access.
I wrote a simple test class with an int field, a setter, and a getter.
Direct field access was WAY faster (3-4 times, if I remember correctly) than access through getX() and setX().

JOGL uses direct buffers because these are not shifted around by the garbage collector. Direct buffers don't use a backing array (at least not one that is accessible from the Java side), so what you are asking for isn't possible.
It is possible to create Buffer wrappers around arrays using ByteBuffer.wrap, FloatBuffer.wrap, etc., but you won't be able to pass these buffers to JOGL, since it checks whether the buffers are direct or not.
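The distinction is easy to see from java.nio itself; a minimal sketch (the native byte order call matters for any buffer that gets handed to OpenGL):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

public class BufferKinds {
    public static void main(String[] args) {
        // A wrapped buffer keeps the float[] as its backing store...
        float[] data = {1f, 2f, 3f};
        FloatBuffer wrapped = FloatBuffer.wrap(data);
        System.out.println(wrapped.hasArray());  // true
        System.out.println(wrapped.isDirect());  // false -> JOGL rejects it

        // ...while a direct buffer lives outside the Java heap.
        FloatBuffer direct = ByteBuffer.allocateDirect(3 * 4)  // 3 floats * 4 bytes
                                       .order(ByteOrder.nativeOrder())
                                       .asFloatBuffer();
        System.out.println(direct.isDirect());   // true
        System.out.println(direct.hasArray());   // false -> no backing float[]
    }
}
```

The wrapped buffer fails the isDirect() check, which is exactly why the backing-array trick from GL4Java doesn't carry over.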

In other words - abstract out all your float array access and replace all array access with put() and get() on a direct float buffer. Problem solved!

Cas :slight_smile:

Is there an alternative (writing my own implementation of a FloatBuffer?)?

If not: what is the fastest way to replace the buffered data?

Put every value with a separate put() myself, or clear the buffer and then put(float[])?

:frowning: I hate this encapsulation.

Sun is slowing Java down a lot this way. I wrote my own implementation of a List, and it's twice as fast as ArrayList, no matter whether you add, set or get an object.
You don't even have to cast the elements when you iterate through the array ;D I used a little hack there, but I get a 300% speed boost without breaking the rules of the List interface.
And guess what? It even refuses objects that are not instances of class xyz for free... no extra checking...
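The general idea can be sketched like this (a hypothetical reconstruction, not the actual FastVector code): back the list with a typed array, so get() returns the concrete type without a cast, and anything that isn't that type can never get in.

```java
/** Hypothetical sketch of a typed, array-backed list (pre-generics style).
 *  The Point element type stands in for "class xyz". */
public class PointList {
    public static class Point {
        public float x, y;
        public Point(float x, float y) { this.x = x; this.y = y; }
    }

    private Point[] elements = new Point[16];
    private int size;

    public void add(Point p) {                 // only Points compile here
        if (size == elements.length) {
            Point[] grown = new Point[elements.length * 2];
            System.arraycopy(elements, 0, grown, 0, size);
            elements = grown;
        }
        elements[size++] = p;
    }

    public Point get(int index) {              // no cast for the caller
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException(String.valueOf(index));
        }
        return elements[index];
    }

    public int size() { return size; }

    public static void main(String[] args) {
        PointList list = new PointList();
        list.add(new Point(1f, 2f));
        System.out.println(list.get(0).x);     // prints 1.0, no cast needed
    }
}
```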

It would be interesting to see the implementation of this List thing. There are a few not-so-obvious traps in writing your own collection classes, and avoiding them does cost performance.

Hamster, stop dealing with float[]s in the first place. If you don’t want to copy data, don’t copy data. Write directly to AGP RAM where you want it to go in the first place.

Cas :slight_smile:

Hi,

First, the reason why I think JOGL uses FloatBuffers (and not float[]):
Since Java 1.4, JNI (the interface that allows calling C functions from Java, e.g. the GL calls) has a method that lets a C function obtain a pointer to a Buffer object's memory. This way there is no copying when you pass the FloatBuffer to the native glVertexPointer.
Of course this only works with FloatBuffers that are direct and thus have no backing float[].

Concerning your problem I see two ways:

  1. Put them in the FloatBuffer with put(float[]). I haven't looked at the code, but if, as you say, there's just a for loop in there, it won't be slower than writing the for loop yourself.
    And if Sun decides to somehow optimize this (I can't think of a way right now, but...), you will get that optimisation for free.

  2. Use a vertex shader to do the interpolation. I haven't done it yet, but you should be able to pass both positions to the vertex shader as per-vertex attributes, plus a weight between the positions that changes over time.
    This also allows you to use static Vertex Buffer Objects, so all the data stays on the graphics card and you don't have the AGP bus as a bottleneck.
    Since vertex shaders are supported starting from GeForce 1/Radeon 8500, the hardware requirements don't get too much out of hand.

Personally, I think solution 2 is both faster and cleaner, unless you also want to target older cards.

Jan

Now this sounds interesting.
Since I have never worked with vertex shaders (I thought they were used to render shadows?), do you have sample code or a tutorial?

[quote]It would be interesting to see the implementation of this List thing. There are a few not-so-obvious traps in writing your own collection classes, and avoiding them does cost performance.
[/quote]
I'll post it in ~6 hours.
Its only disadvantage so far is that it has no Iterator, but since you can iterate with a plain for loop, this doesn't really matter.

Sorry, I don't have any sample code, as I'm pretty new to shaders as well. I would suggest using Cg (or GLSL) if you want to avoid the assembler hassle.
On nehe.gamedev.net there is a simple vertex shader tutorial that, together with the Cg documentation from NVIDIA, should get you started.
The theory for the keyframe interpolation is as follows:

The vertex shader is executed per vertex and has to output a transformed vertex (and a lit one, if you want lighting, so you have to implement the lighting equation yourself; you should find info on that on both NVIDIA's and ATI's developer pages).

For each vertex you can supply several parameters, e.g. position, normal, color, ... You can also define custom parameters, and in these custom parameters you pass a second position (and normal) for the second keyframe.

You can also set parameters that are the same for all vertices; this would be your weighting of the two keyframes. In your vertex shader you just linearly interpolate the two positions (and normals) using that weight.
This gives you the interpolated position, which you can then work with as usual (multiplying by the model-view matrix, calculating lighting, etc.).
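The blend itself is just a linear interpolation per component. A CPU-side reference of what the shader computes, sketched in Java (hypothetical helper, not part of any API):

```java
import java.util.Arrays;

public class KeyframeLerp {
    /** Blends two keyframe position arrays: out = (1 - w) * a + w * b. */
    static void lerp(float[] a, float[] b, float w, float[] out) {
        for (int i = 0; i < out.length; i++) {
            out[i] = a[i] + w * (b[i] - a[i]);
        }
    }

    public static void main(String[] args) {
        float[] frame0 = {0f, 0f, 0f};   // vertex position in keyframe 0
        float[] frame1 = {2f, 4f, 6f};   // same vertex in keyframe 1
        float[] out = new float[3];
        lerp(frame0, frame1, 0.5f, out); // halfway between the keyframes
        System.out.println(Arrays.toString(out)); // [1.0, 2.0, 3.0]
    }
}
```

In the shader version the same formula runs on the GPU with w supplied as a uniform parameter, so the two keyframe arrays never leave the card.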

As I haven't done much with shaders yet, maybe someone who has already played around a bit more can comment or correct me.

Jan

FastVector posted in the “shared something” section.

Maybe I’m confused, but if you only want to change a single float value at a time why not just use: put(int index, float f) ?

For animations I have to change every float.
So I either have to do a lot of puts (lots of method calls + Java = bad), or do all the calculations, save them in a float array, and then copy every float via put(float[]), which means writing every float value twice. Bad.

You are entirely mistaken. Lots of method calls + Java == very good, better than C++ usually, although in this case you have the overhead of a bounds check on each invocation of put().

  1. The client VM inlines the put() method.
  2. The server VM actually hoists the bounds checks out if it can, and replaces put() with a single asm instruction.

It’s fast enough. Sounds like you’re complaining before you’ve even determined if there is actually a performance bottleneck?

Cas :slight_smile:

Client VM?
Server VM?

Which one should I use, and how do I choose?
And why are there two types of VM?

However, it looks like the JIT does some optimizations I didn't know about... but the interpolation using vertex shaders should still be faster ;D

I'll run a benchmark using put(float) and put(float[]) and see which method is best.
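Such a benchmark might look roughly like this (a sketch only: the run count, buffer size, and the i * 0.5f stand-in for real vertex math are arbitrary, and millisecond timings of un-warmed JIT code are crude at best):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

public class PutBenchmark {
    static final int N = 1 << 16;   // number of floats per "frame"

    public static void main(String[] args) {
        FloatBuffer buf = ByteBuffer.allocateDirect(N * 4)
                                    .order(ByteOrder.nativeOrder())
                                    .asFloatBuffer();
        float[] scratch = new float[N];

        // Variant 1: compute and write with per-element put(float).
        long t0 = System.currentTimeMillis();
        for (int run = 0; run < 100; run++) {
            buf.clear();
            for (int i = 0; i < N; i++) {
                buf.put(i * 0.5f);          // stand-in for real vertex math
            }
        }
        long perElement = System.currentTimeMillis() - t0;

        // Variant 2: compute into a float[], then one bulk put(float[]).
        t0 = System.currentTimeMillis();
        for (int run = 0; run < 100; run++) {
            for (int i = 0; i < N; i++) {
                scratch[i] = i * 0.5f;
            }
            buf.clear();
            buf.put(scratch);
        }
        long bulk = System.currentTimeMillis() - t0;

        System.out.println("per-element: " + perElement
                         + " ms, bulk: " + bulk + " ms");
    }
}
```

Note that clear() only resets the position and limit; it doesn't erase anything, so each run simply overwrites the previous frame's data.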


Yes, I'm complaining before there is a bottleneck. But there will be one, since I'm going to render a LOT of animated models using keyframe interpolation.

There are two types of VM because the Sun engineers haven't got off their collective lazy arses and merged them into one VM with two-stage compilation, that's why 8) You pick one with the -client or -server switch on the java command line. Right now, the client VM does very little useful optimisation and starts up pretty fast, but it's generally about 20-50% slower than the server VM for a lot of operations. In a game this will end up amounting to maybe a 5-10% reduction in frame rate.

The server VM does a whole bunch of cleverer optimisations - the ones that people mistakenly cite as the reason Java is so much faster than C++, before going on to use the crap client VM - but it takes (age_of_universe + 1) to start your game. (I released my game with the server VM at first and had a few people complaining that it had “hung”, when in fact it was just churning through compilation at startup.)

If you’re really doing serious stuff the server VM is the way to go, and you’ll just have to put up with the slow startup time or you won’t get the performance you want. In the end though this brings me to my final point…

put(float[]) is always going to be much faster, but that involves you doing all your manipulation on the float array first. This is actually much less of a deal than you think. If you're doing any serious poly pushing you need to use AGP RAM anyway - and you can't read from AGP RAM (unless you want to run at 10fps :stuck_out_tongue: ). The server VM, I think, optimises put(float[]) specially as well.


JeffK will hopefully come along soon to back me up and rant at you, but you're making a Classic Programmer's Mistake when you say this, and if you don't listen to us now, you'll only find out later!

Don’t do this kind of optimisation until you have finished the unit you are working on and profiled it for bottlenecks. You will almost certainly find that this is not a bottleneck. You will almost certainly find that your OpenGL drivers are going to be the bottleneck. If you try and do clever stuff to speed it up now you’ll end up with a mess that doesn’t work very well.

Cas :slight_smile:

DISCLAIMER: I am just beginning OpenGL and I have no experience with animation yet, so this post may be worthless :slight_smile:

I can’t see why you interpolate vertex data in software.

The way I understand keyframed animation, you only need to interpolate the transforms being applied to your models, then apply those new transforms using standard OpenGL calls. That way you can store your data in VBO/VAR and have the GPU do almost all the work.

Did I miss something?