VBO Instancing tutorial?

viveleroi · December 8, 2013, 4:24pm

I’m looking for Java-specific tutorials or examples on using instancing with VBOs. I’ve found a few poorly documented tutorials for other languages, but they’re not much help. I can’t find anything specific to Java because the word “instance” is so often used to mean other things. At most, I find a description without any real examples.

I’ve been building a test app where I render 20x20 chunks of a Minecraft-style game using VBOs. I’m trying to reach peak performance before I move forward with converting my actual 3D block world game.

theagentd · December 8, 2013, 5:26pm

Instancing isn’t a good fit for cube worlds. What you want is to cull specific faces, not cubes. Are you sure instancing is the best solution?
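To make the face-culling idea concrete, here is a minimal sketch that counts only the block faces bordering air. It is not taken from the project linked in this thread: the class name `FaceCulling`, the convention that block id 0 means air, and treating everything outside the chunk as air are all assumptions made for illustration.

```java
import java.util.Arrays;

/** Hypothetical helper: counts the block faces that actually need geometry. */
class FaceCulling {

    // Neighbour offsets for the six faces: -x, +x, -y, +y, -z, +z.
    static final int[][] NEIGHBOURS = {
        {-1, 0, 0}, {1, 0, 0}, {0, -1, 0}, {0, 1, 0}, {0, 0, -1}, {0, 0, 1}
    };

    /** Assumption: a block id of 0 means "air"; anything else is solid. */
    static boolean isSolid(byte[][][] blocks, int x, int y, int z) {
        if (x < 0 || y < 0 || z < 0
                || x >= blocks.length
                || y >= blocks[0].length
                || z >= blocks[0][0].length) {
            return false; // outside the chunk counts as air in this sketch
        }
        return blocks[x][y][z] != 0;
    }

    /** A face is emitted only when a solid block touches a non-solid neighbour. */
    static int countExposedFaces(byte[][][] blocks) {
        int faces = 0;
        for (int x = 0; x < blocks.length; x++) {
            for (int y = 0; y < blocks[0].length; y++) {
                for (int z = 0; z < blocks[0][0].length; z++) {
                    if (!isSolid(blocks, x, y, z)) {
                        continue; // air blocks contribute no geometry
                    }
                    for (int[] n : NEIGHBOURS) {
                        if (!isSolid(blocks, x + n[0], y + n[1], z + n[2])) {
                            faces++;
                        }
                    }
                }
            }
        }
        return faces;
    }

    public static void main(String[] args) {
        byte[][][] full = new byte[16][16][16];
        for (byte[][] plane : full) {
            for (byte[] row : plane) {
                Arrays.fill(row, (byte) 1);
            }
        }
        System.out.println("exposed faces: " + countExposedFaces(full));
        System.out.println("faces without culling: " + (16 * 16 * 16 * 6));
    }
}
```

For a completely solid 16x16x16 chunk this leaves 1,536 faces (the outer shell) instead of 24,576, which is why culling faces matters far more here than drawing whole cubes cheaply.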

Instancing is the process of rendering multiple “instances” of a single 3D model, allowing you to render, for example, 100 identical trees in one draw call.
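As a rough illustration of the data side of instancing, the sketch below builds one position offset per instance for a 10x10 grid of identical models. The class and method names are made up for this example, and it deliberately stops short of any OpenGL calls so it runs without a GL context. In a real renderer, a buffer like this would typically be uploaded as a per-instance vertex attribute (`glVertexAttribDivisor`, OpenGL 3.3) and the shared model drawn once with `glDrawArraysInstanced` (OpenGL 3.1).

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

/** Hypothetical helper: per-instance data for drawing one model many times. */
class InstanceOffsets {

    /** Builds an (x, y, z) offset for every instance in a flat grid. */
    static FloatBuffer buildGridOffsets(int countX, int countZ, float spacing) {
        // Direct, native-order buffer: the kind OpenGL bindings expect for uploads.
        FloatBuffer offsets = ByteBuffer
                .allocateDirect(countX * countZ * 3 * Float.BYTES)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
        for (int z = 0; z < countZ; z++) {
            for (int x = 0; x < countX; x++) {
                offsets.put(x * spacing).put(0f).put(z * spacing);
            }
        }
        offsets.flip(); // switch from writing to reading
        return offsets;
    }

    public static void main(String[] args) {
        FloatBuffer offsets = buildGridOffsets(10, 10, 4f);
        // One model's vertices are stored once; only these offsets differ per tree.
        System.out.println((offsets.remaining() / 3) + " instances");
    }
}
```

The model's vertices are stored once and only the small offset buffer grows with the number of instances, which is the whole appeal for things like trees, and also why it does little for a cube world where the expensive part is deciding which faces exist.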

viveleroi · December 8, 2013, 5:44pm

I see, ya that makes sense.

I started a few weeks ago on a 3D block world game, and in it I’m using display lists to cache the faces of exposed blocks in each chunk. This method works extremely well; in fact, the only performance problem is some noise calculations.

However, I posted on Stack Overflow because I’ve had an impossible time getting my block-picking ray casting to work. It’s always half a block off. Anyway, many people have recommended that I use VBOs instead of display lists, but so far I’m still not convinced there’s a big benefit. I’ve written an experimental app where I can play with VBO rendering, and I’ve learned a lot trying to get VBOs to perform well. I finally had things running adequately with 10x10 chunks of 16x16x16 blocks, but when I bump it up to 20x20 chunks, the FPS slows to a crawl. I need to update this test app with the same face culling logic the real app uses, because until I’m really culling all hidden faces, it’s not a good performance measure.

ags1 · December 8, 2013, 6:51pm

I think VBOs are architecturally better than display lists (after all, they superseded display lists) but that does not mean they are faster. Display lists can be the fastest option sometimes.

viveleroi · December 8, 2013, 7:35pm

After improving my block face culling in my test app, the rendering is MUCH better - enough so that the VBOs perform just as well as the display lists I currently use.

However, performance still dies completely when I render 16x16 chunks (each chunk has 16x16x16 blocks) instead of 15x16.

I don’t understand why one extra row of chunks kills performance. I see nothing that explains it when I profile with VisualVM, etc.

I have posted about this on Stack Overflow (with links to code).

theagentd · December 8, 2013, 8:19pm

We’ll need to see code for how you render those chunks then. I assume you only update the chunks when they change?
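For reference, “only update the chunks when they change” usually comes down to a dirty flag. The sketch below is a minimal illustration with hypothetical names (`CachedChunk`, `rebuildMesh`) and no real OpenGL calls; in an actual renderer the rebuild step would regenerate the vertex data and re-upload the VBO.

```java
/** Hypothetical chunk that rebuilds its mesh only after a block changes. */
class CachedChunk {

    private final byte[][][] blocks = new byte[16][16][16];
    private boolean dirty = true; // a new chunk has no mesh yet
    int rebuildCount = 0;         // exposed only so the behaviour is observable

    void setBlock(int x, int y, int z, byte id) {
        if (blocks[x][y][z] != id) {
            blocks[x][y][z] = id;
            dirty = true; // mesh no longer matches the block data
        }
    }

    void render() {
        if (dirty) {
            rebuildMesh();
            dirty = false;
        }
        // A real renderer would bind the chunk's VBO and issue the draw call here.
    }

    private void rebuildMesh() {
        // Placeholder for regenerating vertices and uploading them to the GPU.
        rebuildCount++;
    }

    public static void main(String[] args) {
        CachedChunk chunk = new CachedChunk();
        for (int frame = 0; frame < 3; frame++) {
            chunk.render();
        }
        chunk.setBlock(0, 0, 0, (byte) 1);
        chunk.render();
        System.out.println("mesh rebuilds: " + chunk.rebuildCount);
    }
}
```

If the vertex data is regenerated or re-uploaded every frame instead, the VBO loses most of its advantage over immediate-mode rendering.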

Although they may be faster on some hardware, display lists aren’t equally fast on all hardware. VBO performance is a bit less hardware specific.

viveleroi · December 8, 2013, 10:24pm

The code for the test app is open source. Although it’s extremely simple compared to my game, it’s meant to give me a benchmark for performance and code organization, something I can settle on and be happy with before updating my game.

The specific code that builds each chunk:

github.com/helion3/opengl/blob/master/src/main/java/com/helion3/opengl/shapes/Chunk.java

package com.helion3.opengl.shapes;

import com.helion3.opengl.rendering.TextureQuadRenderer;


public class Chunk {
	
	public static final int ROWS = 16;
	public static final int COLUMNS = 16;
	public static final int HEIGHT = 16;
	
	private int bufferSize = 192 * ROWS * COLUMNS * HEIGHT; // 192 per block * chunk dimensions
	private TextureQuadRenderer quadTesselator = new TextureQuadRenderer(bufferSize);
	
	private byte[][][] blocks = new byte[ROWS][COLUMNS][HEIGHT];
	private int chunkX;
	private int chunkZ;
	
	
	/**

This file has been truncated.
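As a back-of-the-envelope check on the `bufferSize` line above, the sketch below works out what that worst-case size means in memory. It rests on two assumptions that the truncated code does not confirm: that `bufferSize` is a count of floats (the renderer holds a `FloatBuffer`), and that every chunk allocates a full-size buffer. The class and method names are hypothetical, and this is arithmetic, not a diagnosis.

```java
/** Hypothetical calculator for the worst-case buffer size in Chunk.java. */
class ChunkBufferMath {

    static final int FLOATS_PER_BLOCK = 192; // the constant from Chunk.java

    /** Mirrors bufferSize = 192 * ROWS * COLUMNS * HEIGHT. */
    static long floatsPerChunk(int rows, int columns, int height) {
        return (long) FLOATS_PER_BLOCK * rows * columns * height;
    }

    /** Assumption: each element is a 4-byte float and every chunk allocates in full. */
    static long bytesForChunks(int chunkCount) {
        return floatsPerChunk(16, 16, 16) * Float.BYTES * chunkCount;
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        System.out.println("floats per chunk: " + floatsPerChunk(16, 16, 16));
        System.out.println("MiB per chunk:    " + bytesForChunks(1) / mib);
        System.out.println("MiB for 15x16:    " + bytesForChunks(15 * 16) / mib);
        System.out.println("MiB for 16x16:    " + bytesForChunks(16 * 16) / mib);
    }
}
```

Under those assumptions each chunk reserves 786,432 floats (3 MiB), so 15x16 chunks come to 720 MiB and 16x16 chunks to 768 MiB. Whether those buffers are really allocated at full size is worth checking in the complete source, since sizing them to the culled face count would shrink them dramatically.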

The VBO portion the chunk actually renders with:

github.com/helion3/opengl/blob/master/src/main/java/com/helion3/opengl/rendering/TextureQuadRenderer.java

package com.helion3.opengl.rendering;

import java.nio.FloatBuffer;
import java.nio.IntBuffer;

import org.lwjgl.BufferUtils;

import static org.lwjgl.opengl.ARBBufferObject.*;
import static org.lwjgl.opengl.ARBVertexBufferObject.*;
import static org.lwjgl.opengl.GL11.*;

public class TextureQuadRenderer {
	
	private FloatBuffer interleavedBuffer;
	private IntBuffer ib = BufferUtils.createIntBuffer(1);
	protected int vertexCount = 0;
	protected int bufferId = -1;
	
	
	/**

This file has been truncated.

There are some additional performance changes I want to make, mainly switching to indexed vertices since so many are duplicates, but performance is still so poor that I’m not going to switch my game to this. However, since people tend to say VBO performance is excellent, I’m wondering what I’ve done wrong.
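For anyone unfamiliar with indexed vertices, here is a minimal sketch of the idea: each distinct vertex is stored once, and an index list refers back to it. `VertexIndexer` is a hypothetical class written for this example, not part of the linked project, and it only handles the CPU-side deduplication. With OpenGL, the resulting index list would go into an element array buffer and be drawn with `glDrawElements`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: stores each distinct vertex once and records indices. */
class VertexIndexer {

    final List<float[]> uniqueVertices = new ArrayList<>();
    final List<Integer> indices = new ArrayList<>();
    private final Map<List<Float>, Integer> lookup = new HashMap<>();

    /** Adds a vertex; all components (position, texture coords, ...) must match to be reused. */
    void addVertex(float... components) {
        List<Float> key = new ArrayList<>(components.length);
        for (float c : components) {
            key.add(c);
        }
        Integer index = lookup.get(key);
        if (index == null) {
            index = uniqueVertices.size();
            uniqueVertices.add(components.clone());
            lookup.put(key, index);
        }
        indices.add(index);
    }

    public static void main(String[] args) {
        VertexIndexer indexer = new VertexIndexer();
        // One quad written as two triangles: six vertices, two of them repeated.
        indexer.addVertex(0, 0, 0);
        indexer.addVertex(1, 0, 0);
        indexer.addVertex(1, 1, 0);
        indexer.addVertex(1, 1, 0);
        indexer.addVertex(0, 1, 0);
        indexer.addVertex(0, 0, 0);
        System.out.println("unique vertices: " + indexer.uniqueVertices.size());
        System.out.println("indices: " + indexer.indices);
    }
}
```

One caveat: vertices can only be shared when every attribute matches, so in a textured cube world two faces that meet at a corner often cannot share that vertex because their texture coordinates differ. The saving may be smaller than the raw duplicate count suggests.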