JOGL too slow

On this thread:
http://groups.google.ca/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&selm=lNCmc.372223%24Pk3.90020%40pd7tw1no

Someone suggested that I post my message to this forum.

So, here it is:

Hi guys,

I’ve written an application that makes many OpenGL calls for rendering a scene. (The calls are glColor3f and glVertex3f, mostly.)

When my loops make about 1,400,000 additional calls to glColor3f, render time goes from 50 milliseconds to 300 milliseconds!

Straight C (gcc/Linux) can do 1,400,000 glColor3f calls 20 times as fast on the same hardware.

This suggests the JOGL->OpenGL bridge is very slow.

I could be wrong, but I think not.

My question is not JOGL specific:

Can you briefly explain how the Java->native bridge is implemented?

How does that explain the JOGL slowness?

And can it be made faster?

I chose Java for its seamless platform independence and fairly decent JIT, but Java->OpenGL speed is crucial.

I am presently looking into wxWindows and wxGLCanvas as a potential alternative.

I really appreciate any help that makes my decision easier.

PS:

I won’t shy away from reading the JOGL source code, if you tell me how it works, and what to do.

cough overhead cough
Try using vertex pointers

And JNI overhead in this case.
You are making mounds of JNI calls per primitive.
The trick is to use vertex arrays/buffers and/or display lists instead, even in C.
The one-call-per-vertex (immediate mode) way of OpenGL-ing is a dying technique.
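To put a number on that overhead, here is a rough estimate using only the figures from the original post (50 ms without the extra calls, 300 ms with them, about 1,400,000 extra calls). It assumes the whole difference is per-call cost, which is a simplification, since the driver also does some work for each glColor3f. The class and method names are made up for illustration.

```java
// Back-of-the-envelope estimate based on the timings quoted in the thread.
// Assumption: all of the added render time is per-call overhead.
final class JniCallCostEstimate {

    /** Average added cost per call, in nanoseconds. */
    static double nanosPerCall(double baselineMs, double withCallsMs, long extraCalls) {
        double addedNanos = (withCallsMs - baselineMs) * 1_000_000.0; // ms -> ns
        return addedNanos / extraCalls;
    }

    public static void main(String[] args) {
        double ns = nanosPerCall(50.0, 300.0, 1_400_000L);
        System.out.printf("approx. %.0f ns per glColor3f call%n", ns);
    }
}
```

That works out to roughly 180 ns per call. A cost of that size paid once per draw call is invisible; paid once per vertex attribute, it dominates the frame.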

Thanks for replying.

I know things about my data that display lists don’t, so I can store the data much more efficiently than a display list probably could.

The tip about vertex arrays is useful to my code, but very limited: vertices are regularly interrupted by glColor().

Triangle strips or fans are also on the menu, where appropriate.

However, I don’t really think any of these improvements are going to make JNI overhead fully negligible, because of the limitations I described.

Any other tips about OpenGL efficiency might be news to me, and useful.

Thanks again.

[quote]The tip about vertex arrays is useful to my code, but very limited: vertices are regularly interrupted by glColor().
[/quote]
Vertex arrays (or their bigger brothers, VBOs) are perfectly capable of including vertex colour, normal and other per-vertex data as well as positions.
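As a minimal sketch of what that looks like on the Java side, the snippet below packs a colour and a position for each vertex into one direct FloatBuffer, laid out as r, g, b, x, y, z. That ordering matches OpenGL's GL_C3F_V3F interleaved format. The class name and the pack() helper are hypothetical, and the actual GL calls are only mentioned in a comment because their exact Java signatures depend on the JOGL version in use.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

// Sketch: per-vertex colour stored next to position, so no glColor3f call
// is needed between vertices.
final class InterleavedColorVertices {

    static final int FLOATS_PER_VERTEX = 6; // r, g, b, x, y, z

    /** Packs colors[i] and positions[i] (3 floats each) into one direct buffer. */
    static FloatBuffer pack(float[][] colors, float[][] positions) {
        if (colors.length != positions.length) {
            throw new IllegalArgumentException("need exactly one colour per vertex");
        }
        FloatBuffer buf = ByteBuffer
                .allocateDirect(positions.length * FLOATS_PER_VERTEX * Float.BYTES)
                .order(ByteOrder.nativeOrder()) // native code expects native byte order
                .asFloatBuffer();
        for (int i = 0; i < positions.length; i++) {
            buf.put(colors[i], 0, 3);
            buf.put(positions[i], 0, 3);
        }
        buf.flip(); // make the buffer readable from index 0
        return buf;
    }

    public static void main(String[] args) {
        float[][] colors    = {{1f, 0f, 0f}, {0f, 1f, 0f}, {0f, 0f, 1f}};
        float[][] positions = {{0f, 0f, 0f}, {1f, 0f, 0f}, {0f, 1f, 0f}};
        FloatBuffer buf = pack(colors, positions);
        System.out.println(buf.remaining() + " floats packed for "
                + positions.length + " vertices");
        // With a GL binding, the whole buffer would then be submitted with a
        // handful of calls, e.g. glInterleavedArrays(GL_C3F_V3F, 0, buf)
        // followed by glDrawArrays(GL_TRIANGLES, 0, 3).
    }
}
```

Each vertex can carry its own colour, so colour changes in the middle of a mesh cost nothing extra in JNI calls.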

[quote]However, I don’t really think any of these improvements are going to make JNI overhead fully negligible, because of the limitations I described.
[/quote]
Practically anything you change is going to run faster. Immediate mode is the slowest way of drawing anything and should only really be used for testing or debugging.

I suppose I could arrange things to sort vertices by color.

But what if I have blending requirements down the line? Things would have to be sorted in back-to-front order within the frustum, as well as by color. That might get anywhere from complicated to inefficient (again), given the data set.

You misunderstand - a vertex colour is just another vertex attribute, like position or texture coords. You just include your vertex colour along with all the other attributes. No sorting is needed.

Hmm… That is not clear from the OpenGL reference page:
http://www.mevis.de/~uwe/opengl/glVertex.html

It only says the current color is applied, so I must still set it whenever it changes, that is, with a separate glColor3f() call through JNI.

What am I missing? Is this reference page too old? Or am I misreading it?

You need to find yourself a better OpenGL reference; that one doesn’t even include the vertex array functions :S

Check out the Wiki for a brief overview, and grab yourself a copy of the Red Book, which covers them somewhere in there.

/me needs to find something to do other than crappy MPI work :’(

Your reference page just doesn’t cover vertex arrays.

Take a gander at NeHe’s tutorials. There’s bound to be one there describing vertex arrays.

You guys have been a great help.
Thanks everyone.
Thanks Orangy Tang. Please don’t cry :)

Now I just have to ask the Java performance forum if there is a good way to do

malloc(sizeof(struct s) * <very_big_number>)

that is as efficient as C (in RAM and CPU cache).

For now this is the last burning question in my mind, I suppose :)

Use direct buffers, in particular BufferUtils.newByteBuffer(), newIntBuffer(), and newFloatBuffer(). See also the Grand Canyon Demo which demonstrates high dynamic vertex throughput using vertex arrays.
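For the malloc(sizeof(struct s) * N) question, a minimal sketch of the direct-buffer approach using plain java.nio is shown below. The record layout (three floats plus a packed RGBA int, 16 bytes per record) and the PackedRecords class are assumptions made up for illustration, since the thread never says what struct s contains. All records live in one contiguous off-heap block, with no per-record object header.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Sketch: an array of fixed-size records in one contiguous direct buffer.
// Hypothetical layout, loosely equivalent to: struct s { float x, y, z; int rgba; };
final class PackedRecords {

    static final int STRIDE = 16; // bytes per record
    static final int OFF_X = 0, OFF_Y = 4, OFF_Z = 8, OFF_RGBA = 12;

    private final ByteBuffer data;
    private final int count;

    PackedRecords(int count) {
        this.count = count;
        // multiplyExact guards against int overflow; one buffer tops out near 2 GiB.
        this.data = ByteBuffer
                .allocateDirect(Math.multiplyExact(count, STRIDE))
                .order(ByteOrder.nativeOrder());
    }

    void set(int i, float x, float y, float z, int rgba) {
        int base = i * STRIDE;
        data.putFloat(base + OFF_X, x);
        data.putFloat(base + OFF_Y, y);
        data.putFloat(base + OFF_Z, z);
        data.putInt(base + OFF_RGBA, rgba);
    }

    float x(int i)  { return data.getFloat(i * STRIDE + OFF_X); }
    float y(int i)  { return data.getFloat(i * STRIDE + OFF_Y); }
    float z(int i)  { return data.getFloat(i * STRIDE + OFF_Z); }
    int rgba(int i) { return data.getInt(i * STRIDE + OFF_RGBA); }

    int count() { return count; }

    /** The backing buffer, which can be handed to native code without copying. */
    ByteBuffer buffer() { return data; }

    public static void main(String[] args) {
        PackedRecords records = new PackedRecords(1_000_000);
        records.set(42, 1.0f, 2.0f, 3.0f, 0xFF8800FF);
        System.out.println(records.count() + " records in "
                + records.buffer().capacity() + " bytes");
    }
}
```

The absolute get/put calls are bounds-checked, so this is not quite free compared with raw C pointer arithmetic, but the memory footprint and cache behaviour should be close to the C array.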