[LWJGL] Asynchronous batch renderer?

Hi!

I’m trying to create a renderer using LWJGL. It uses glMapBufferRange( … GL_MAP_WRITE_BIT | GL_MAP_UNSYNCHRONIZED_BIT)
and 3 VBOs picked round-robin. I managed to get decent FPS, but I’m having trouble with textures.
My shader has a sampler2D[32] array, so I need to flush when I have more than 32 textures.

If I have fewer than 32 it’s fine, because then I can just put everything into one buffer and draw that.
Otherwise I have to unmap the buffer, draw all the current elements in the buffer, then begin a new draw call.
The problem is that when I begin again, I still have, let’s say, half of the data I wanted to push into the buffer I just unmapped.
If I call glMapBufferRange() on the same buffer again, the program crashes.

I hope I managed to explain it well enough for someone to help me.
The technique I tried to implement (the triple-buffering one) can be found here, in case my post wasn’t clear.


(I can’t post code atm because I left it on my other PC)

The whole point of persistent mapping is to map stuff once and then leave it mapped forever, so you’re doing things a bit wrong so far. Regardless, the driver shouldn’t crash from you mapping a buffer twice in a row. I’m not entirely sure what’s happening there. Care to show any source?
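
For reference, the map-once idea could look roughly like the sketch below. It is not taken from the posted code: `RingRegions` and its methods are made-up names, and a direct `ByteBuffer` stands in for the pointer you would get from a single `glMapBufferRange` call on a buffer created with `glBufferStorage` and `GL_MAP_PERSISTENT_BIT` (OpenGL 4.4 or ARB_buffer_storage). A real renderer would also need a fence (`glFenceSync` / `glClientWaitSync`) before writing into a region the GPU may still be reading, which is left out here.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

/** Hypothetical helper: one long-lived mapping, split into per-frame regions used round-robin. */
final class RingRegions {
    private final ByteBuffer mapped;   // stand-in for the persistently mapped pointer (assumption)
    private final int regionBytes;     // size of one region; should be a multiple of 4 for floats
    private final int regionCount;     // 3 for triple buffering
    private int frame = 0;             // index of the region used by the current frame

    RingRegions(int regionBytes, int regionCount) {
        this.regionBytes = regionBytes;
        this.regionCount = regionCount;
        // Allocated once and never "unmapped"; real code would map the GL buffer here instead.
        this.mapped = ByteBuffer.allocateDirect(regionBytes * regionCount)
                                .order(ByteOrder.nativeOrder());
    }

    /** Moves on to the next region; call once per frame. */
    void advance() {
        frame = (frame + 1) % regionCount;
    }

    /** Byte offset of the current region inside the big buffer. */
    int regionOffsetBytes() {
        return frame * regionBytes;
    }

    /** A float view that covers only the current region. */
    FloatBuffer regionView() {
        ByteBuffer window = mapped.duplicate();
        window.position(regionOffsetBytes());
        window.limit(regionOffsetBytes() + regionBytes);
        // duplicate() and slice() reset the byte order, so it is set again here.
        return window.slice().order(ByteOrder.nativeOrder()).asFloatBuffer();
    }
}
```

With three regions, the data written in frame N is not touched again until frame N + 3, which is what gives the GPU time to finish reading it.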

I’m sure I’m messing up something basic. This is sort of a ‘learn OpenGL’ project for me.
The submit() method takes a Drawable2D argument, which is basically a TexturedSprite in this case.

public class BatchRenderer implements Renderer2D {
	/** Maximum number of sprites. */
	private static final int RENDERER_MAX_SPRITES = 60000;
	/** Combined vertex attribute size in bytes. */
	private static final int RENDERER_VERTEX_SIZE = Constants.VERTEX_SIZE_IN_BYTES;
	/** Total size of a sprite in bytes. */
	private static final int RENDERER_SPRITE_SIZE = RENDERER_VERTEX_SIZE * 4;
	/** Total size of the buffer in bytes. */
	private static final int RENDERER_BUFFER_SIZE = RENDERER_MAX_SPRITES * RENDERER_SPRITE_SIZE;
	/** Number of indices required. */
	private static final int RENDERER_INDEX_SIZE = RENDERER_MAX_SPRITES * 6; // 6 indices (two triangles) per sprite
	/** Maximum number of texture slots allowed. */
	private static final int MAX_TEXTURE_SLOTS = 31;

	private int vao;
	private int vbo;
	private IndexBuffer ibo;
	private int indexCount;
	private FloatBuffer buffer;
	private List<Integer> textureSlots;
	private float[] tmp = new float[7 * 4];

	public BatchRenderer() {
		init();
		textureSlots = new ArrayList<Integer>(0);
	}

	private static final int VBO_NUM = 3;
	private int[] vbos = new int[VBO_NUM];
	private boolean[] bound = new boolean[VBO_NUM];
	private int testVao;
	private int currentBuffer = -1;
//	private int offset = 0;
	private boolean sync = true;

	private void init() {
		for(int i = 0; i < VBO_NUM; i++)
			bound[i] = false;
		vao = glGenVertexArrays();
		vbo = glGenBuffers();

		glBindVertexArray(vao);
		glBindBuffer(GL_ARRAY_BUFFER, vbo);
		glBufferData(GL_ARRAY_BUFFER, RENDERER_BUFFER_SIZE, null, GL_DYNAMIC_DRAW);

		// Enabling the attribute locations
		glEnableVertexAttribArray(Shader.A_POSITION);
		glEnableVertexAttribArray(Shader.A_TEX_COORD);
		glEnableVertexAttribArray(Shader.A_TEXTURE);
		glEnableVertexAttribArray(Shader.A_COLOR);

		// Creating the attributes
		// Vertex position
		glVertexAttribPointer(Shader.A_POSITION, 3, GL_FLOAT, false, RENDERER_VERTEX_SIZE, 0);
		// Texture coordinate
		glVertexAttribPointer(Shader.A_TEX_COORD, 2, GL_FLOAT, false, RENDERER_VERTEX_SIZE, POSITION_IN_BYTES);
		// Texture ID
		glVertexAttribPointer(Shader.A_TEXTURE, 1, GL_FLOAT, false, RENDERER_VERTEX_SIZE, (POSITION_IN_BYTES + TEX_COORD_IN_BYTES));
		// Vertex color
		glVertexAttribPointer(Shader.A_COLOR, 4, GL_UNSIGNED_BYTE, true, RENDERER_VERTEX_SIZE, (POSITION_IN_BYTES + TEX_COORD_IN_BYTES + TEXTURE_IN_BYTES));

		glBindBuffer(GL_ARRAY_BUFFER, 0);

		glBindVertexArray(0);

		// Generating the indices
		int offset = 0;
		final int[] indices = new int[RENDERER_INDEX_SIZE];
		for (int i = 0; i < RENDERER_INDEX_SIZE; i += 6) {
			indices[i + 0] = offset + 0;
			indices[i + 1] = offset + 1;
			indices[i + 2] = offset + 2;

			indices[i + 3] = offset + 2;
			indices[i + 4] = offset + 3;
			indices[i + 5] = offset + 0;

			offset += 4;
		}

		ibo = new IndexBuffer(indices);

		// Test
		for(int i = 0; i < VBO_NUM; i++)
			vbos[i] = glGenBuffers();
		testVao = glGenVertexArrays();
		glBindVertexArray(testVao);
		for (int i = 0; i < VBO_NUM; i++) {
			glBindBuffer(GL_ARRAY_BUFFER, vbos[i]);
			glBufferData(GL_ARRAY_BUFFER, RENDERER_BUFFER_SIZE, null, GL_DYNAMIC_DRAW);

			// Enabling the attribute locations
			glEnableVertexAttribArray(Shader.A_POSITION);
			glEnableVertexAttribArray(Shader.A_TEX_COORD);
			glEnableVertexAttribArray(Shader.A_TEXTURE);
			glEnableVertexAttribArray(Shader.A_COLOR);

			// Creating the attributes
			// Vertex position
			glVertexAttribPointer(Shader.A_POSITION, 3, GL_FLOAT, false, RENDERER_VERTEX_SIZE, 0);
			// Texture coordinate
			glVertexAttribPointer(Shader.A_TEX_COORD, 2, GL_FLOAT, false, RENDERER_VERTEX_SIZE, POSITION_IN_BYTES);
			// Texture ID
			glVertexAttribPointer(Shader.A_TEXTURE, 1, GL_FLOAT, false, RENDERER_VERTEX_SIZE, (POSITION_IN_BYTES + TEX_COORD_IN_BYTES));
			// Vertex color
			glVertexAttribPointer(Shader.A_COLOR, 4, GL_UNSIGNED_BYTE, true, RENDERER_VERTEX_SIZE, (POSITION_IN_BYTES + TEX_COORD_IN_BYTES + TEXTURE_IN_BYTES));

			glBindBuffer(GL_ARRAY_BUFFER, 0);
		}
		glBindVertexArray(0);

	}

	public void nextFrame() {
//		offset = 0;
		currentBuffer++;
		currentBuffer = currentBuffer == VBO_NUM ? 0 : currentBuffer;
	}

	private void updateBoundVBOs() {
		for (int i = 0; i < VBO_NUM; i++) {
			// Only the buffer selected for this frame counts as bound
			bound[i] = (i == currentBuffer);
		}
	}

	public void begin() {
		if(sync){
			glBindBuffer(GL_ARRAY_BUFFER, vbos[currentBuffer]);
			if(!bound[currentBuffer])
				buffer = glMapBufferRange(GL_ARRAY_BUFFER, 0, RENDERER_BUFFER_SIZE, GL_MAP_WRITE_BIT | GL_MAP_UNSYNCHRONIZED_BIT).order(ByteOrder.nativeOrder()).asFloatBuffer();
			else
				;// ??
		}
		else{
			glBindBuffer(GL_ARRAY_BUFFER, vbo);
			buffer = glMapBufferRange(GL_ARRAY_BUFFER, 0, RENDERER_BUFFER_SIZE, GL_MAP_WRITE_BIT).order(ByteOrder.nativeOrder()).asFloatBuffer();
		}
		updateBoundVBOs();
	}

	@Override
	public void submit(Drawable2D drawable) {
		int texID = drawable.getTexID();

		float ts = 0.0f;

		// The sampler unit's index
		if (texID > 0) {
			boolean found = false;
			for (int i = 0; i < textureSlots.size(); i++) {
				if (textureSlots.get(i) == texID) {
					ts = (float) (i + 1);
					found = true;
					break;
				}
			}
			// If the textureID is not in the array, draw the current buffer and start a new drawcall
			if (!found) {
				if (textureSlots.size() >= MAX_TEXTURE_SLOTS) {
					end();
					flush();
					begin();
				}
				textureSlots.add(texID);
				ts = (float) (textureSlots.size());
			}
		}

		int idx = 0;
		// 0,0
		tmp[idx++] = drawable.getPosition().x;
		tmp[idx++] = drawable.getPosition().y;
		tmp[idx++] = drawable.getPosition().z;
		tmp[idx++] = drawable.getU1();
		tmp[idx++] = drawable.getV1();
		tmp[idx++] = ts;
		tmp[idx++] = drawable.getFloatBitColor();

		// 0,1
		tmp[idx++] = drawable.getPosition().x;
		tmp[idx++] = drawable.getPosition().y + drawable.getSize().y;
		tmp[idx++] = drawable.getPosition().z;
		tmp[idx++] = drawable.getU2();
		tmp[idx++] = drawable.getV2();
		tmp[idx++] = ts;
		tmp[idx++] = drawable.getFloatBitColor();

		// 1,1
		tmp[idx++] = drawable.getPosition().x + drawable.getSize().x;
		tmp[idx++] = drawable.getPosition().y + drawable.getSize().y;
		tmp[idx++] = drawable.getPosition().z;
		tmp[idx++] = drawable.getU3();
		tmp[idx++] = drawable.getV3();
		tmp[idx++] = ts;
		tmp[idx++] = drawable.getFloatBitColor();

		// 1,0
		tmp[idx++] = drawable.getPosition().x + drawable.getSize().x;
		tmp[idx++] = drawable.getPosition().y;
		tmp[idx++] = drawable.getPosition().z;
		tmp[idx++] = drawable.getU4();
		tmp[idx++] = drawable.getV4();
		tmp[idx++] = ts;
		tmp[idx++] = drawable.getFloatBitColor();

		buffer.put(tmp);

		indexCount += 6;
	}

	public void end() {
//		offset = buffer.position() + 1;
		glUnmapBuffer(GL_ARRAY_BUFFER);
		glBindBuffer(GL_ARRAY_BUFFER, 0);
	}

	@Override
	public void flush() {
		for (int i = 0; i < textureSlots.size(); i++) {
			glActiveTexture(GL_TEXTURE0 + i);
			glBindTexture(GL_TEXTURE_2D, textureSlots.get(i));
		}
		if (sync)
			glBindVertexArray(testVao);
		else
			glBindVertexArray(vao);
		ibo.bind();

		glDrawElements(GL_TRIANGLES, indexCount, GL_UNSIGNED_INT, 0);

		ibo.unbind();
		glBindVertexArray(0);

		indexCount = 0;
		textureSlots.clear();
	}
}

So could anyone help me with this?

You should consider decoupling filling your VBO from performing draw calls and state changes. Filling a VBO is about putting vertex data into a buffer. Changing state and drawing are about how to interpret this vertex data. There is no need to stop filling your VBO the moment there is a state change.

You are currently doing something along these lines:

  1. change state
  2. fill VBO
  3. draw entire VBO
  4. while more to render: goto #1

What you could be doing instead:

  1. while more geometry to render:
    A. append to VBO
    B. store future state change for sub-range of VBO
  2. while more scheduled state changes
    A. apply state change
    B. draw sub-range of VBO

Using this approach you sidestep the whole problem of when to map/unmap or switch VBOs. Keep in mind VBOs are just an abstraction layer over memory. If you create one massive VBO and use it for everything you render, you’re making life a tad easier when doing more complex things that these abstraction layers weren’t really meant to address.
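
A minimal sketch of the first half of that scheme (append geometry, remember the state for each sub-range) might look like this. `DrawRange` and `BatchRecorder` are hypothetical names, and a plain `FloatBuffer` stands in for the mapped VBO so the example runs without a GL context. The 7-floats-per-vertex layout, the slot field at position 5 and the limit of 31 slots are taken from the posted code. When the slots run out, the recorder starts a new range instead of unmapping and drawing.

```java
import java.nio.FloatBuffer;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical record of one future draw call over a sub-range of the shared buffer. */
final class DrawRange {
    final int firstIndex;                              // where the range starts in the index buffer
    int indexCount = 0;                                // how many indices it covers
    final List<Integer> textures = new ArrayList<>();  // texture IDs in slot order

    DrawRange(int firstIndex) { this.firstIndex = firstIndex; }
}

/** Hypothetical recorder: keeps filling one buffer and only notes where state must change. */
final class BatchRecorder {
    static final int MAX_TEXTURE_SLOTS = 31;              // same limit as the posted code
    static final int FLOATS_PER_VERTEX = 7;               // x, y, z, u, v, slot, packed color
    static final int FLOATS_PER_SPRITE = FLOATS_PER_VERTEX * 4;
    static final int TEXTURE_SLOT_FIELD = 5;              // position of the slot in a vertex

    private final FloatBuffer vertices;                   // stand-in for the mapped VBO (assumption)
    private final List<DrawRange> ranges = new ArrayList<>();
    private int totalIndices = 0;

    BatchRecorder(int maxSprites) {
        vertices = FloatBuffer.allocate(maxSprites * FLOATS_PER_SPRITE);
    }

    /** Appends one sprite (4 vertices) and assigns it a texture slot within the current range. */
    void submit(float[] sprite, int texID) {
        if (sprite.length != FLOATS_PER_SPRITE)
            throw new IllegalArgumentException("expected " + FLOATS_PER_SPRITE + " floats");
        if (ranges.isEmpty()) ranges.add(new DrawRange(0));
        DrawRange current = ranges.get(ranges.size() - 1);

        float slot = 0.0f; // 0 means "no texture", as in the posted submit()
        if (texID > 0) {
            int i = current.textures.indexOf(texID);
            if (i < 0) {
                if (current.textures.size() >= MAX_TEXTURE_SLOTS) {
                    // Out of slots: open a new range, but keep writing to the same buffer.
                    current = new DrawRange(totalIndices);
                    ranges.add(current);
                }
                current.textures.add(texID);
                i = current.textures.size() - 1;
            }
            slot = i + 1;
        }

        float[] copy = sprite.clone();
        for (int v = 0; v < 4; v++) copy[v * FLOATS_PER_VERTEX + TEXTURE_SLOT_FIELD] = slot;
        vertices.put(copy);
        current.indexCount += 6;   // two triangles per sprite
        totalIndices += 6;
    }

    List<DrawRange> ranges() { return ranges; }

    int floatsWritten() { return vertices.position(); }
}
```

Submitting 40 sprites with 40 different textures would leave two ranges behind: one covering the first 31 sprites and one covering the remaining 9, all in the same buffer, with a single map/unmap around the whole frame.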

Thanks for the answer, Rive. Unfortunately, I’m not too experienced with OpenGL, so could you please tell me how your idea actually works, or point me to where I can learn about it? (How to store state changes and render them one after another.)

You don’t store state changes. You simply first upload all the data to different buffers, then do all the state change and draw calls afterwards.

You ‘store’ them as regular objects, or whatever you prefer, anything that allows you to make the appropriate gl-calls in between draw-calls.
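
As a rough illustration of "regular objects", the sketch below replays a list of recorded draws after all vertex data has been uploaded. Everything here is hypothetical naming (`GlCalls`, `RecordedDraw`, `DrawReplayer`, `LoggingGl`); the GL calls are hidden behind a small interface so the example runs without a context. In an LWJGL implementation, `bindTexture` would presumably do `glActiveTexture(GL_TEXTURE0 + unit)` followed by `glBindTexture(GL_TEXTURE_2D, texID)`, and `drawIndexed` would call `glDrawElements(GL_TRIANGLES, indexCount, GL_UNSIGNED_INT, firstIndex * 4L)`, where the last argument is a byte offset into the index buffer.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical seam around the GL calls needed between draws. */
interface GlCalls {
    void bindTexture(int unit, int texID);
    void drawIndexed(int firstIndex, int indexCount);
}

/** Hypothetical plain object describing one draw call and the state it needs. */
final class RecordedDraw {
    final int[] textures;   // texture IDs, position in the array = texture unit
    final int firstIndex;   // first index of the sub-range
    final int indexCount;   // number of indices in the sub-range

    RecordedDraw(int[] textures, int firstIndex, int indexCount) {
        this.textures = textures;
        this.firstIndex = firstIndex;
        this.indexCount = indexCount;
    }
}

final class DrawReplayer {
    /** Applies each draw's state, then issues its draw call, in recorded order. */
    static void replay(List<RecordedDraw> draws, GlCalls gl) {
        for (RecordedDraw d : draws) {
            for (int unit = 0; unit < d.textures.length; unit++) {
                gl.bindTexture(unit, d.textures[unit]);
            }
            gl.drawIndexed(d.firstIndex, d.indexCount);
        }
    }
}

/** Fake backend that only logs, for trying the sketch without OpenGL. */
final class LoggingGl implements GlCalls {
    final List<String> log = new ArrayList<>();

    @Override
    public void bindTexture(int unit, int texID) {
        log.add("bind unit " + unit + " -> tex " + texID);
    }

    @Override
    public void drawIndexed(int firstIndex, int indexCount) {
        log.add("draw first=" + firstIndex + " count=" + indexCount);
    }
}
```

The point of the split is that the list of `RecordedDraw` objects can be built while the buffer is still being filled, and nothing touches GL state until the buffer has been unmapped once.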