VBO causes stuttering

I decided to start playing around with OpenGL (LWJGL) again after a few years’ break. This time I’m actually going to learn how to use VBOs properly, and not use immediate mode like last time. First of all, I’d just like to ask for some general advice. Say I have a program with a bunch of objects (or meshes/models, whatever you would call them). These objects are kind of static, meaning no animations that would change vertex data. However, they may be translated, rotated or scaled. My question is, what is generally the best way to deal with this using VBOs? I can picture two ways:

  1. Setting up the VBO with all the object vertex data and not modifying it further. I would have to loop through every object and send a uniform to the shader depending on its position, rotation, and scale, and then render. This way I don’t have to touch the VBO, which is nice, but instead I have to perform a glUniform and a glDrawArrays call for every object. This is what I use currently.

  2. Modify the data in the VBO directly. If I do it this way I don’t have to do any glUniform calls and I could probably just use one glDrawArrays call, but instead I have to modify the data in the VBO, perhaps many times per frame.

  3. Something else.

I haven’t figured out how to use VBOs properly, so any advice is more than welcome.


Now to the actual problem. Rendering using VBOs causes stuttering. By stuttering I don’t mean lag or an FPS drop. I can have 5000+ FPS and still stutter. What happens is that occasionally the program freezes for a short time. I’d say around 0.25s on average, but it can vary from maybe 0.1s to 0.5s. These freezes occur maybe once every second or so. It doesn’t seem to matter how many objects I am rendering; the stuttering is still noticeable, even when I’m only rendering 300 vertices. The duration of the freezes seems to increase as I increase the number of objects I draw, but only a little.

Here’s some code. I removed everything that is not relevant to this problem.
[spoiler]

import java.nio.FloatBuffer;
import java.util.ArrayList;

import org.lwjgl.BufferUtils;
import org.lwjgl.LWJGLException;
import org.lwjgl.Sys;
import org.lwjgl.input.Keyboard;
import org.lwjgl.opengl.Display;
import org.lwjgl.opengl.DisplayMode;
import org.lwjgl.opengl.GL11;
import org.lwjgl.opengl.GL15;
import org.lwjgl.opengl.GL20;

public class CopyOfMain {

	int shaderProgram;

	int bufferObject;

	PerspectiveCamera camera;
	
	int fps;
	
	float lastFps;
	
	FloatBuffer vertexBuffer;
	
	ArrayList<Cube> cubes = new ArrayList<Cube>();
	

	public void start() {
		try {
			Display.setDisplayMode(new DisplayMode(800,800));
			Display.create();
		} catch (LWJGLException e) {
			e.printStackTrace();
			System.exit(0);
		}

		GL11.glEnable(GL11.GL_DEPTH_TEST);
		GL11.glDepthMask(true);
		GL11.glDepthFunc(GL11.GL_LEQUAL);
		GL11.glDepthRange(0.0f, 1.0f);

		shaderProgram = ShaderLoader.loadShader("res/shader/default.vert", "res/shader/default.frag");
	
		bufferObject = GL15.glGenBuffers();
		
		camera = new PerspectiveCamera(70,Display.getWidth()/(float)Display.getHeight(), 1000, 0.05f);


		vertexBuffer = BufferUtils.createFloatBuffer(100 * 6 * 6 * 12 * 4);
		for (int i = 0; i<100; i++) {
			Cube cube = new Cube();
			
			cube.setScale((float) (Math.random()*2));
			cube.setX((float) (Math.random()*40 -20));
			cube.setY((float) (Math.random()*40 -20));
			cube.setZ((float) (Math.random()*1000 - 998));
			cubes.add(cube);
			
			addToVBO(cube);
		}
			
		vertexBuffer.flip();
		
		GL15.glBindBuffer(GL15.GL_ARRAY_BUFFER, bufferObject);
		GL15.glBufferData(GL15.GL_ARRAY_BUFFER, vertexBuffer, GL15.GL_STATIC_DRAW);
		GL15.glBindBuffer(GL15.GL_ARRAY_BUFFER, 0);
		
		GL20.glUseProgram(shaderProgram);
		GL20.glUniformMatrix4(GL20.glGetUniformLocation(shaderProgram, "projectionMatrix"), true, (FloatBuffer)BufferUtils.createFloatBuffer(4*16).put(camera.getProjectionMatrix().getData()).flip());
		GL20.glUniformMatrix4(GL20.glGetUniformLocation(shaderProgram, "modelMatrix"), true, (FloatBuffer)BufferUtils.createFloatBuffer(4*16).put(new Matrix4x4().toIdentity().getData()).flip());
		GL20.glUseProgram(0);

		lastFps = (Sys.getTime() * 1000) / Sys.getTimerResolution(); 
		
		while (!Display.isCloseRequested()) {

			if (Keyboard.isKeyDown(Keyboard.KEY_W)) {
				camera.setY(camera.getY() + 0.01014f);	
			}

			if (Keyboard.isKeyDown(Keyboard.KEY_S)) {
				camera.setY(camera.getY() - 0.01014f);	
			}

			if (Keyboard.isKeyDown(Keyboard.KEY_A)) {
				camera.setX(camera.getX() - 0.01014f);
			}

			if (Keyboard.isKeyDown(Keyboard.KEY_D)) {
				camera.setX(camera.getX() + 0.01014f);
			}

			render();

			updateFPS();

			Display.update();
			//Display.sync(60);
			
		}

		Display.destroy();
	}

	public void addToVBO(Cube cube) {
		
		for (int i = 0; i<cube.getVertices().length; i++) {
		
		vertexBuffer.put(cube.vertices[i].getX());
		vertexBuffer.put(cube.vertices[i].getY());
		vertexBuffer.put(cube.vertices[i].getZ());
		vertexBuffer.put(cube.vertices[i].getU());
		vertexBuffer.put(cube.vertices[i].getV());
		vertexBuffer.put(cube.vertices[i].getNX());
		vertexBuffer.put(cube.vertices[i].getNY());
		vertexBuffer.put(cube.vertices[i].getNZ());
		vertexBuffer.put(cube.vertices[i].getR());
		vertexBuffer.put(cube.vertices[i].getG());
		vertexBuffer.put(cube.vertices[i].getB());
		vertexBuffer.put(cube.vertices[i].getA());
		
		}
		
	}

	public void updateFPS() {
		if ((Sys.getTime() * 1000) / Sys.getTimerResolution() - lastFps > 1000) {
			Display.setTitle("FPS: " + fps);
			fps = 0; 
			lastFps += 1000;
		}
		fps++;
	}

	public void render() {
		GL11.glClear(GL11.GL_COLOR_BUFFER_BIT | GL11.GL_DEPTH_BUFFER_BIT);
		GL11.glClearColor(0.0f, 0.0f, 0.0f, 0.0f);

		GL20.glUseProgram(shaderProgram);
		GL20.glUniformMatrix4(GL20.glGetUniformLocation(shaderProgram, "viewMatrix"), true, (FloatBuffer)BufferUtils.createFloatBuffer(4*16).put(camera.getViewMatrix().getData()).flip());
		
		
		GL15.glBindBuffer(GL15.GL_ARRAY_BUFFER, bufferObject);

		GL20.glEnableVertexAttribArray(0);
		GL20.glVertexAttribPointer(0, 3, GL11.GL_FLOAT, false, 48, 0);
		GL20.glEnableVertexAttribArray(1);
		GL20.glVertexAttribPointer(1, 2, GL11.GL_FLOAT, false, 48, 3*4);
		GL20.glEnableVertexAttribArray(2);
		GL20.glVertexAttribPointer(2, 3, GL11.GL_FLOAT, false, 48, 5*4);
		GL20.glEnableVertexAttribArray(3);
		GL20.glVertexAttribPointer(3, 4, GL11.GL_FLOAT, false, 48, 8*4);
		
		
		for (int i = 0; i<cubes.size(); i++) {
			
			Cube m = cubes.get(i);
			
			GL20.glUniformMatrix4(GL20.glGetUniformLocation(shaderProgram, "modelMatrix"), true, (FloatBuffer)BufferUtils.createFloatBuffer(4*16).put(new Matrix4x4().toIdentity().translate(m.x, m.y, m.z).rotateX(m.pitch).rotateY(m.yaw).rotateZ(m.roll).scale(m.scale).getData()).flip());
			GL11.glDrawArrays(GL11.GL_TRIANGLES, i*36, 36);
			
		}
		
		GL20.glDisableVertexAttribArray(0);
		GL20.glDisableVertexAttribArray(1);
		GL20.glDisableVertexAttribArray(2);
		GL20.glDisableVertexAttribArray(3);
		GL15.glBindBuffer(GL15.GL_ARRAY_BUFFER, 0);
		GL20.glUseProgram(0);
	}

	public static void main(String[] argv) {
		CopyOfMain display = new CopyOfMain();
		display.start();
	}
}

[/spoiler]

Don’t create new buffers every frame.

Standard practice would be to multiply every vertex in the vertex shader by a transformation matrix, instead of directly modifying the VBO.

Here is a skeleton guide to follow

  1. Load your data into VBOs and then bind the VBOs to a VAO.

Do all of this before you draw anything… think of it like the work going on behind the scenes of a load screen.

  2. Load all your shaders (compile them and get the shader program ID you’ll need later).

Your shaders should take in variables for the camera matrix, model matrix, projection matrix… and anything else you need. Assume you will dynamically translate and rotate your base vertex points by these matrices.

  3. OK, now draw.
    Set the shader
    Set the active VAO
    Pass in shader variables using uniforms (camera matrix, model matrix, etc.)
    Draw the triangles (to draw EVERYTHING in 1 line; use glDrawArrays or one of the other equivalent draw methods)
    Clean up your pointers to the shader, VAO…

Repeat for the next big thing you want to draw.

Hope this helps
j.

Pretty sure I’m not doing that.

If I understood what you said correctly, then that’s exactly what I am doing (or intending to do, in case I made a mistake somewhere).

My real problem remains though.

You’re generating direct byte buffers and matrices, i.e. lots of garbage. I think you’re seeing garbage collection pauses.
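
As a minimal sketch of the fix being suggested here (assuming the pauses really do come from per-frame allocation), the matrix data can go through one direct buffer that is allocated once and refilled each time. `MatrixUploadBuffer` and `fill` are hypothetical names made up for this illustration; only `java.nio` is used, so it runs without LWJGL:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

// Hypothetical helper, not part of LWJGL: owns one direct buffer for the
// lifetime of the program and refills it for every matrix upload.
final class MatrixUploadBuffer {

    // 16 floats = one 4x4 matrix. Allocated once, never per frame.
    private final FloatBuffer buffer = ByteBuffer
            .allocateDirect(16 * Float.BYTES)
            .order(ByteOrder.nativeOrder())
            .asFloatBuffer();

    // Copies the 16 matrix elements into the shared buffer and returns it
    // ready for reading (position 0, limit 16).
    FloatBuffer fill(float[] matrix16) {
        if (matrix16.length != 16) {
            throw new IllegalArgumentException("expected 16 floats");
        }
        buffer.clear();       // resets position and limit, allocates nothing
        buffer.put(matrix16); // bulk copy into the existing native memory
        buffer.flip();        // limit = 16, position = 0
        return buffer;
    }

    public static void main(String[] args) {
        MatrixUploadBuffer upload = new MatrixUploadBuffer();
        float[] identity = {
            1, 0, 0, 0,
            0, 1, 0, 0,
            0, 0, 1, 0,
            0, 0, 0, 1
        };
        FloatBuffer first = upload.fill(identity);
        FloatBuffer second = upload.fill(identity);
        // Same object both times, so nothing new for the garbage collector.
        System.out.println("same buffer reused: " + (first == second));
        System.out.println("floats ready to upload: " + second.remaining());
    }
}
```

In the render loop from the original post, the returned buffer would take the place of each `BufferUtils.createFloatBuffer(...)` expression passed to `GL20.glUniformMatrix4`. The `new Matrix4x4()` built per cube could presumably be reused the same way, since `toIdentity()` already resets it.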

You’re absolutely right. I fixed it now and it runs smoothly! I had no idea garbage collection could cause something like this. Thanks a lot! :point:

GC calls can lock up the entire program for milliseconds at a time. I see it every now and then in my Android logcat, and it usually goes along the lines of “GC Concurrent something something - 15ms”.

So imagine that, x amount of times per second.
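
One way to check whether pauses like these line up with garbage collection is to read the JVM’s own collector counters. A rough sketch using the standard `java.lang.management` API follows; `GcStats` is a hypothetical helper name, and the reported time is the collectors’ accumulated elapsed time, which is not necessarily the same as pause time for concurrent collectors:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: sums the counters of every collector in this JVM.
final class GcStats {

    // Total number of collections so far (collectors reporting -1 are skipped).
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, gc.getCollectionCount());
        }
        return total;
    }

    // Approximate accumulated collection time in milliseconds.
    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, gc.getCollectionTime());
        }
        return total;
    }

    public static void main(String[] args) {
        long countBefore = totalCollections();
        long millisBefore = totalCollectionMillis();

        // Produce short-lived garbage, loosely like allocating every frame.
        long checksum = 0;
        for (int i = 0; i < 200_000; i++) {
            float[] scratch = new float[64];
            scratch[0] = i;
            checksum += (long) scratch[0];
        }

        System.out.println("collections: " + (totalCollections() - countBefore));
        System.out.println("GC time ms:  " + (totalCollectionMillis() - millisBefore));
        System.out.println("checksum:    " + checksum);
    }
}
```

Sampling these two numbers once per second, next to the FPS counter, should show whether the collection count jumps at the moments the program freezes. Starting the JVM with `-verbose:gc` prints similar information without any code changes.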

Hi

agentd is right. What is the point of creating a buffer if you use it only once? The same is true for the matrix. Both the Java heap and the native heap could run out of memory. Good luck.

I don’t know the details, but direct byte buffers are extra hard for the garbage collector to manage for some reason. You’re also allocating buffers 4x as big as you need. createFloatBuffer already takes the primitive size into account, so what you need is createFloatBuffer(16) = createByteBuffer(16*4).asFloatBuffer(). No need to multiply by 4 manually in your case.

Hence:
(100 cubes + 2 extra matrices) X 4*16*4 bytes X say 1000 FPS = ~25MB/sec of direct byte buffers. Owwie.
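
The estimate above can be reproduced with a few lines of arithmetic. This is only a back-of-the-envelope sketch: `AllocationRate` is a hypothetical name, the 102 buffers per frame and 1000 FPS are the figures assumed in the post, and any per-buffer overhead inside the JVM is ignored:

```java
// Hypothetical helper for estimating how much buffer memory is churned
// through when every uniform upload allocates a fresh buffer.
final class AllocationRate {

    // floatsRequested is the argument passed to createFloatBuffer(...).
    static long bytesPerSecond(int buffersPerFrame, int floatsRequested, int framesPerSecond) {
        long bytesPerBuffer = (long) floatsRequested * Float.BYTES;
        return buffersPerFrame * bytesPerBuffer * framesPerSecond;
    }

    public static void main(String[] args) {
        // As posted: createFloatBuffer(4*16), i.e. 64 floats = 256 bytes each.
        long oversized = bytesPerSecond(102, 4 * 16, 1000);
        // As needed: createFloatBuffer(16), i.e. 16 floats = 64 bytes each.
        long rightSized = bytesPerSecond(102, 16, 1000);

        System.out.println("4*16 floats per buffer: " + oversized + " bytes/s");
        System.out.println("16 floats per buffer:   " + rightSized + " bytes/s");
    }
}
```

With the oversized buffers this comes to 26,112,000 bytes per second, which matches the ~25MB/sec quoted above. Sizing the buffers correctly cuts that to 6,528,000 bytes per second, but allocating nothing per frame avoids the churn entirely.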

Direct buffers are a bitch to clean up (a mapped buffer keeps its locks until GC’ed! wtf)… anyway, you can get away with [icode].cleaner().clean();[/icode] and help the GC a bit.

Actually, the garbage collector isn’t called when the JVM runs out of memory on the native heap. It’s unmanaged memory, so it’s up to you to “manage” it by manually releasing the native resources when you no longer need them, using basil_'s suggestion (Java < 1.9) or a dedicated API (Java >= 1.9). However, when the JVM runs out of memory on the Java heap, it tries to free some memory on the native heap too.

[quote]I don’t know the details, but direct byte buffers are extra hard for the garbage collector to manage for some reason. You’re also allocating buffers 4x as big as you need. createFloatBuffer already takes the primitive size into account, so what you need is createFloatBuffer(16) = createByteBuffer(16*4).asFloatBuffer(). No need to multiply by 4 manually in your case.
[/quote]
Fixed, thank you. It feels like everything else is measured in bytes, so I guess I just assumed this was in bytes too.
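
The unit difference is easy to check without LWJGL. Assuming, as stated earlier in the thread, that `createFloatBuffer(n)` behaves like a direct byte buffer of `n*4` bytes viewed as floats, the sketch below (with the hypothetical helper `floatBuffer`) shows that the capacity is counted in floats while the backing allocation is counted in bytes:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

final class BufferUnits {

    // Hypothetical stand-in for createFloatBuffer(n): the byte size is
    // derived from the float count here, so callers never multiply by 4.
    static FloatBuffer floatBuffer(int floatCount) {
        return ByteBuffer
                .allocateDirect(floatCount * Float.BYTES)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
    }

    public static void main(String[] args) {
        FloatBuffer matrix = floatBuffer(16);        // room for one 4x4 matrix
        FloatBuffer oversized = floatBuffer(4 * 16); // room for four of them

        System.out.println("floatBuffer(16) holds " + matrix.capacity() + " floats");
        System.out.println("floatBuffer(4*16) holds " + oversized.capacity() + " floats");
    }
}
```

By contrast, the stride and offsets passed to `glVertexAttribPointer` in the original post (48, 3*4, and so on) are byte counts, which is probably where the mix-up came from.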