Hi everybody.
I’m a bit stumped. It looks as if my program re-sends the entire contents of my VBO to the GPU every frame.
Specs:
AsRock M3N78D, nForce 720D, AM3
AMD Phenom II X6 1100T Black Edition, 3.3GHz
Radeon HD 6950 2GB, PCI-E x16 2.0, DP
Software:
Updated drivers (CPU/Chipset as well as GPU)
JOGL 2.0 RC5
Eclipse (shouldn’t matter, though)
Program setup:
FPSAnimator is supposed to call display() as often as possible (up to 1000 FPS).
The game uses an octree; occlusion culling and frustum culling are both implemented on that octree.
All to-be-rendered triangles are stored in a single VBO.
What I’d like to do:
Render as many triangles as possible (just for starters). Right now I render cubes.
Benchmarking:
Speed is directly influenced by the size of the VBO.
3 072 MB: I compute the size as a 32-bit int, so it overflows (= crash).
1 536 MB: 1 FPS (I get graphical errors, such as lines across the entire screen.)
768 MB: 2 FPS (From here on all looks nice)
384 MB: 4 FPS
192 MB: 8 FPS
96 MB: 16 FPS
48 MB: 32 FPS
24 MB: 60 FPS
12 MB: 118 FPS
6 MB: 200 FPS
3 MB: 350 FPS
1.5 MB: 510 FPS
750 KB: 810 FPS
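To make the pattern in those numbers explicit, here is a small sketch that multiplies buffer size by frame rate for the larger runs. The class and method names are made up for illustration; the (size, FPS) pairs are copied from the list above. If the product stays roughly constant, that would fit my suspicion that the whole buffer is transferred every frame, although it doesn't prove it.

```java
// Sketch: implied data rate if the whole VBO were re-sent every frame.
// VboThroughput and mbPerSecond are hypothetical names, not part of my program.
final class VboThroughput {

    /** Implied transfer rate in MB/s: buffer size (MB) times frames per second. */
    static double mbPerSecond(double sizeMb, double fps) {
        return sizeMb * fps;
    }

    public static void main(String[] args) {
        // {size in MB, measured FPS}, taken from the benchmark list
        double[][] runs = {
            {768, 2}, {384, 4}, {192, 8}, {96, 16}, {48, 32}, {24, 60}, {12, 118}
        };
        for (double[] run : runs) {
            System.out.printf("%6.1f MB at %5.0f FPS -> %7.1f MB/s%n",
                    run[0], run[1], mbPerSecond(run[0], run[1]));
        }
    }
}
```

For the runs from 768 MB down to 12 MB the product stays between roughly 1.4 and 1.5 GB/s, which is why I think the cost scales with the buffer size rather than with anything else in the frame.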
The VBO is constructed ONCE and then no longer updated (I disabled updating for now).
Clearly, this should run faster. Could you have a look and maybe spot a bug I’ve overlooked? Maybe something simple?
I mean, 24 MB = 360k triangles surely isn’t anywhere near the limit of my hardware.
My rendering Code looks like this:
public void display(GLAutoDrawable drawable) {
    // Frame timing
    Now = System.nanoTime();
    MSSinceLastFrame = (double) (Now - LastCall) / 1000000;
    LastCall = Now;
    System.out.println("MS since last call: " + MSSinceLastFrame + " FPS: " + 1000/MSSinceLastFrame);

    do_look(); // Adjusts lookatx, lookaty, lookatz
    do_move(MSSinceLastFrame); // Adjusts posx, posy, posz

    GL2 currGL = drawable.getGL().getGL2();
    currGL.glClear(GL.GL_COLOR_BUFFER_BIT | GL.GL_DEPTH_BUFFER_BIT); // Reset
    currGL.glLoadIdentity();
    glu.gluLookAt(posx, posy, posz, lookatx, lookaty, lookatz, 0, 1, 0); // Player position and direction

    currGL.glBindBuffer(GL.GL_ARRAY_BUFFER, vbo_handle);
    currGL.glEnableClientState(GL2.GL_VERTEX_ARRAY);
    currGL.glEnableClientState(GL2.GL_TEXTURE_COORD_ARRAY);
    currGL.glEnable(GL.GL_TEXTURE_2D);
    currGL.glBindTexture(GL.GL_TEXTURE_2D, texture.getTextureObject(currGL));
    currGL.glVertexPointer(3, GL.GL_FLOAT, 5 * 4, 0); // Each vertex has 3 coords...
    currGL.glTexCoordPointer(2, GL.GL_FLOAT, 5 * 4, 3 * 4); // ... and 2 texture coordinates, packed interleaved
    currGL.glDrawArrays(GL.GL_TRIANGLES, 0, 4 * Buffer.capacity()); // Render the whole buffer

    currGL.glDisableClientState(GL2.GL_VERTEX_ARRAY);
    currGL.glDisableClientState(GL2.GL_TEXTURE_COORD_ARRAY);
    currGL.glBindBuffer(GL.GL_ARRAY_BUFFER, 0);
    currGL.glBindTexture(GL.GL_TEXTURE_2D, 0);

    // CPU-side time, measured before glFinish() waits for the GPU
    double tmp = System.nanoTime();
    double mspassedwhilerendering = (tmp - LastCall) / 1000000;
    System.out.println("MS Passed on CPU: " + mspassedwhilerendering + " FPS (CPU): " + 1000/mspassedwhilerendering);

    currGL.glFinish();
    drawable.swapBuffers();
}
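One thing I’m unsure about is the count I pass to glDrawArrays. Its third argument is a number of vertices, while Buffer.capacity() is a number of floats, so I wrote down the arithmetic as a small sketch. It assumes the layout from the comments above (5 floats per vertex: 3 position + 2 texture coordinates) and the buffer size of 90*65536 floats from my init code; the class and method names are made up for illustration.

```java
// Sketch: how many vertices/triangles the buffer holds versus the count
// that the draw call in display() passes. DrawCountCheck is a hypothetical
// helper, not part of my program.
final class DrawCountCheck {
    static final int FLOATS_PER_VERTEX = 5; // 3 position + 2 texture coordinates

    /** Number of complete vertices stored in a buffer of floatCount floats. */
    static int vertexCount(int floatCount) {
        return floatCount / FLOATS_PER_VERTEX;
    }

    /** Number of complete triangles (3 vertices each) in the buffer. */
    static int triangleCount(int floatCount) {
        return vertexCount(floatCount) / 3;
    }

    /** The count expression used in display(): 4 * Buffer.capacity(). */
    static long countPassedToDraw(int floatCount) {
        return 4L * floatCount;
    }

    public static void main(String[] args) {
        int floats = 90 * 65536; // same value as Buffer.capacity()
        System.out.println("floats in buffer:     " + floats);
        System.out.println("vertices in buffer:   " + vertexCount(floats));
        System.out.println("triangles in buffer:  " + triangleCount(floats));
        System.out.println("count passed to draw: " + countPassedToDraw(floats));
    }
}
```

If this arithmetic is right, 4 * Buffer.capacity() is the buffer size in bytes and is 20 times larger than the number of vertices actually stored, so the draw call may be asking for far more vertices than the VBO contains. I don’t know yet whether that explains the timings.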
This is how I initialize things:
@Override
public void init(GLAutoDrawable drawable) {
    drawable.setAutoSwapBufferMode(false);
    GL2 gl = drawable.getGL().getGL2();
    glu = new GLU();
    gl.glClearColor(0.0f, 0.0f, 0.0f, 0.0f);
    gl.glClearDepth(1.0f);
    gl.glShadeModel(GL2.GL_SMOOTH);
    gl.glEnable(GL.GL_DEPTH_TEST);
    gl.glDepthFunc(GL.GL_LEQUAL);
    gl.glEnable(GL.GL_TEXTURE_2D);

    // reshape
    glu.gluLookAt(posx, posy, posz, lookatx, lookaty, lookatz, 0, 1, 0);
    glu.gluPerspective(45.0, SCREEN_WIDTH / SCREEN_HEIGHT, 1, 100);
    // fov, aspect ratio, near & far clipping planes

    if (vbo_handle <= 0) {
        int[] tmp = new int[1];
        gl.glGenBuffers(1, tmp, 0);
        vbo_handle = tmp[0];
    }

    Buffer = Buffers.newDirectFloatBuffer(90*65536); // == 65k cubes == 6M floats == 24 MBytes
    int numBytes = Buffer.capacity() * 4; // Allocate the buffer (data is set on a per-octree-leaf basis!)
    gl.glBindBuffer(GL.GL_ARRAY_BUFFER, vbo_handle);
    gl.glBufferData(GL.GL_ARRAY_BUFFER, numBytes, null, GL.GL_DYNAMIC_DRAW);
    gl.glBindBuffer(GL.GL_ARRAY_BUFFER, 0);
}
And for each leaf in my octree, I do this (just to be clear: this is done exactly once per leaf and never repeated):
int numBytes = 30 * 3 * cubes * 4; //cubes < 512 in every case (this is performed in the Leaf of the Octree)
WWV.gl.glBindBuffer(GL.GL_ARRAY_BUFFER, WWV.vbo_handle);
WWV.gl.glBufferSubData(GL.GL_ARRAY_BUFFER, myPos * 4, numBytes, WWV.Buffer);
WWV.gl.glBindBuffer(GL.GL_ARRAY_BUFFER, 0);
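In case the offsets matter, here is the byte arithmetic behind that upload as a standalone sketch. It assumes that myPos is an index in floats (which is why it gets multiplied by 4) and that every cube takes 30 * 3 = 90 floats; the class and method names are made up for illustration and are not in my program.

```java
// Sketch: byte offset and size of one leaf's upload, plus a bounds check
// against the VBO size allocated in init(). LeafUpload is a hypothetical helper.
final class LeafUpload {
    static final int FLOATS_PER_CUBE = 30 * 3; // 90 floats per cube, as in numBytes
    static final int BYTES_PER_FLOAT = 4;

    /** Byte offset into the VBO for a leaf starting at float index firstFloat. */
    static long byteOffset(long firstFloat) {
        return firstFloat * BYTES_PER_FLOAT;
    }

    /** Number of bytes uploaded for a leaf holding the given number of cubes. */
    static long byteSize(int cubes) {
        return (long) FLOATS_PER_CUBE * cubes * BYTES_PER_FLOAT;
    }

    /** True if the leaf's byte range lies completely inside a VBO of vboBytes bytes. */
    static boolean fits(long firstFloat, int cubes, long vboBytes) {
        return firstFloat >= 0 && cubes >= 0
                && byteOffset(firstFloat) + byteSize(cubes) <= vboBytes;
    }

    public static void main(String[] args) {
        long vboBytes = 90L * 65536 * BYTES_PER_FLOAT; // size passed to glBufferData
        System.out.println("bytes for 511 cubes: " + byteSize(511));
        System.out.println("first leaf fits:     " + fits(0, 511, vboBytes));
        System.out.println("past the end fits:   " + fits(90L * 65536, 1, vboBytes));
    }
}
```

A check like fits(...) before each glBufferSubData call would at least rule out writes past the end of the allocation.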
Do you see anything wrong? Or is there an example which I could look at?
All tips/suggestions welcome ;), and thanks for the help.