LWJGL 3 VAO Rendering (Performance Optimizations?)

Feuerrohr · February 11, 2020, 1:05pm

I've been working on my game engine for about a year. I started with immediate mode, moved to VBOs, and now use VAOs.
My current renderer works fine: for example, I can draw 10,000 quads (each split into two triangles) at 1600-1800 fps on a GTX 1080 and an i7-6700K.

I also use JNI: I wrote a native method that copies my vertex data into the buffer, which also gives a good performance boost.

My current Mesh.java: http://paste.myplayplanet.net/sozucacobe.java
So my question is: can I optimize this further, or is it good as it is?

For BufferUtils.copy I use JNI (memcpy).

princec · February 11, 2020, 1:24pm

One thing @theagentd asserted while we’ve been working on our Voxoid engine is that VAOs don’t really bring much value - extra complexity for no particular gain in performance. It seems to have been a bit of an ill-conceived API/concept. We simply bind one at the very startup of our engine and leave it.

Cas

VaTTeRGeR · February 11, 2020, 1:40pm

You could pack the color attribute into one float to save 12 bytes per vertex if you don’t need the precision, but apart from that idk.
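A rough sketch of that packing idea, not taken from the posted Mesh.java: the four color channels go into the bits of a single float, so a vertex shrinks from 9 floats to 6. The class name PackedColor and its methods are made up for illustration, and the sketch assumes 8 bits per channel is enough precision. Clearing one alpha bit with the 0xfeffffff mask is the same trick libGDX uses to keep the bit pattern from ever being a NaN.

```java
/** Hypothetical helper: packs four 8-bit color channels into the bits of one float. */
final class PackedColor {
    private PackedColor() {}

    /** Packs r, g, b, a (each 0..255) as ABGR bits and reinterprets them as a float. */
    static float pack(int r, int g, int b, int a) {
        int bits = ((a & 0xFF) << 24) | ((b & 0xFF) << 16) | ((g & 0xFF) << 8) | (r & 0xFF);
        // Clear the lowest alpha bit so the float exponent can never be all ones
        // (which would be a NaN or Infinity bit pattern). Alpha 255 becomes 254.
        return Float.intBitsToFloat(bits & 0xfeffffff);
    }

    /** Returns the raw ABGR bits stored in a packed color. */
    static int unpack(float packed) {
        return Float.floatToRawIntBits(packed);
    }
}
```

On the GL side the color attribute would then be declared as 4 normalized unsigned bytes (glVertexAttribPointer with GL_UNSIGNED_BYTE and normalized set to true) instead of 4 floats. On a little-endian machine the bytes land in memory in r, g, b, a order.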

h.pernpeintner · February 11, 2020, 2:03pm

I recently implemented programmable vertex pulling and can confirm the results in https://github.com/nlguillemot/ProgrammablePulling . While it doesn't exactly bring a performance gain, I find the improvements in API usage and API overhead very nice. I'm not sure, but if your use case also involves multiple VAOs and VBOs (for example for multiple entities, vertex formats, whatever), then I think you would see a performance gain as well.

yboya · February 11, 2020, 2:15pm

If you call the method “setVertices()” just once, it's fine, but if you call it often it can slow down your program, because it makes a copy in memory every time.
Instead of working with a float array for the vertices, you could use a class I wrote, FloatMemoryBuffer:

github.com/YvesBoyadjian/Koin3D/blob/master/Koin3D/jscenegraph/src/jscenegraph/port/memorybuffer/FloatMemoryBuffer.java

/**
 * A class representing a zone of memory containing floats
 */
package jscenegraph.port.memorybuffer;

import java.nio.ByteBuffer;
import java.nio.FloatBuffer;

import org.lwjgl.BufferUtils;
import org.lwjgl.system.MemoryUtil;

import jscenegraph.mevis.inventor.misc.SoVBO;

/**
 * @author Yves Boyadjian
 *
 */
public class FloatMemoryBuffer extends MemoryBuffer {
	
	public final static int MINIMUM_FLOATS_FOR_BUFFER = SoVBO.getVertexCountMinLimit() * 3;

(Excerpt truncated; see the full file at the path above.)

That way there is no extra memory copy.

Guerra24 · February 11, 2020, 3:46pm

A few small changes:

Line 17: newly allocated buffers don't need to be flipped.

Line 52: LWJGL's OpenGL methods accept arrays directly, so you can remove the FloatBuffer entirely and pass the array to glBufferData.

Line 76: you're defeating the entire purpose of VAOs. The glEnableVertexAttribArray state is stored in the VAO, so ideally you should enable the attributes once when you create the VAO (line 26) and just bind the VAO when rendering.
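To illustrate the flip point with plain java.nio (a small sketch; the class name FlipDemo is made up, and it says nothing about how the posted Mesh.java fills its buffer): a fresh buffer already has position 0 and limit equal to its capacity, so flipping it right after allocation just sets the limit to 0. flip() is only needed after relative put() calls have advanced the position.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

/** Hypothetical demo of FloatBuffer position/limit behavior around flip(). */
final class FlipDemo {
    /** Allocates a direct, native-order float buffer holding the given number of floats. */
    static FloatBuffer allocate(int floats) {
        return ByteBuffer.allocateDirect(floats * Float.BYTES)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
    }

    public static void main(String[] args) {
        FloatBuffer fresh = allocate(9);
        System.out.println(fresh.remaining()); // 9: ready to be written or read as-is

        fresh.flip(); // limit = position = 0, so the buffer now looks empty
        System.out.println(fresh.remaining()); // 0

        FloatBuffer filled = allocate(9);
        filled.put(new float[] {1f, 2f, 3f}); // relative put moves position to 3
        filled.flip();                         // limit = 3, position = 0
        System.out.println(filled.remaining()); // 3
    }
}
```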

Feuerrohr · February 16, 2020, 12:34pm

Ok thanks, but why do all the examples use a buffer when it's possible to pass a float array directly?

Feuerrohr · February 16, 2020, 12:38pm

Another question: how big should I make the buffer?
I tested it, and currently I use 1000 * 9 floats (X, Y, Z, R, G, B, A, U, V) = 9000 floats, which is 9000 * 4 bytes = 36000 bytes.

I tested higher values, but then I only get 500 fps, while with 1000 I get 1600 fps.
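For reference, the size arithmetic above written out as constants. This is only a sketch under the 9-float layout described in this post (X, Y, Z, R, G, B, A, U, V); the class VertexLayout and its member names are invented here. The byte offsets are the kind of values one would pass as the stride and pointer arguments of glVertexAttribPointer for an interleaved buffer.

```java
/** Hypothetical description of an interleaved vertex: position, color, texture coordinates. */
final class VertexLayout {
    private VertexLayout() {}

    static final int POSITION_FLOATS = 3; // x, y, z
    static final int COLOR_FLOATS = 4;    // r, g, b, a
    static final int TEXCOORD_FLOATS = 2; // u, v

    static final int FLOATS_PER_VERTEX = POSITION_FLOATS + COLOR_FLOATS + TEXCOORD_FLOATS;
    static final int STRIDE_BYTES = FLOATS_PER_VERTEX * Float.BYTES;

    // Byte offset of each attribute from the start of a vertex.
    static final int POSITION_OFFSET = 0;
    static final int COLOR_OFFSET = POSITION_FLOATS * Float.BYTES;
    static final int TEXCOORD_OFFSET = (POSITION_FLOATS + COLOR_FLOATS) * Float.BYTES;

    /** Number of floats needed for the given number of vertices. */
    static int floatsFor(int vertexCount) {
        return vertexCount * FLOATS_PER_VERTEX;
    }

    /** Number of bytes needed for the given number of vertices. */
    static int bytesFor(int vertexCount) {
        return vertexCount * STRIDE_BYTES;
    }
}
```

With these definitions, floatsFor(1000) is 9000 and bytesFor(1000) is 36000, matching the numbers above.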

Guerra24 · February 16, 2020, 4:32pm

That depends on what you're using to load meshes; usually the data is already in *Buffer form. Also, the performance of passing arrays depends on whether the JVM supports critical natives. But since this is a load method that isn't expected to run often, you can justify the performance hit. That's up to you.

You don't need to keep the buffer allocated for the whole run: you can allocate it in setVertices with size vertices.length, upload the data to OpenGL, and then free it (or pass the array directly).

This is also a bit of a memory optimization: in your example, a mesh with a single triangle still gets the full-size buffer, so you're wasting both RAM and VRAM.

Performance-wise, it depends on the driver and GPU and how they treat buffers that are larger than the requested vertex count, so it can vary quite a bit. You should keep the buffer the same size as the input data.
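A minimal sketch of the "allocate with vertices.length, upload, then let go" idea using only java.nio. The class ExactSizeUpload and the uploader callback are hypothetical; the callback stands in for the actual glBufferData call, which is not shown here.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;
import java.util.function.Consumer;

/** Hypothetical helper: stages vertex data in a buffer sized exactly to the input. */
final class ExactSizeUpload {
    private ExactSizeUpload() {}

    /**
     * Copies the vertices into a temporary direct buffer of exactly vertices.length floats,
     * hands it to the uploader, and returns the number of bytes staged.
     */
    static int upload(float[] vertices, Consumer<FloatBuffer> uploader) {
        FloatBuffer staging = ByteBuffer.allocateDirect(vertices.length * Float.BYTES)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
        staging.put(vertices); // position advances to vertices.length
        staging.flip();        // make the written range readable for the upload
        uploader.accept(staging);
        // The staging buffer is unreachable after this method returns and is left to the GC.
        // With LWJGL, MemoryUtil.memAllocFloat and MemoryUtil.memFree would release it immediately.
        return vertices.length * Float.BYTES;
    }
}
```

For a single triangle with 9 floats per vertex this stages 27 floats (108 bytes) instead of a fixed 36000-byte buffer.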

Feuerrohr · February 16, 2020, 5:36pm

Ok thx

Do you have Discord, so I can ask you if I have more questions?

// Rendering
I'm writing a ShapeRenderer like the one in libGDX, and I store all vertices in an array.
I no longer use a buffer and pass the complete array directly to glBufferData.

That increased the fps a bit, from 1600 to 1700-1800 (10K quads).
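Roughly what I mean by storing all vertices in an array, as a simplified sketch rather than my actual renderer: the class name QuadBatch and its methods are invented for this example, it assumes the 9-float layout (X, Y, Z, R, G, B, A, U, V) and two triangles per quad, and the GL upload itself is left out.

```java
/** Hypothetical batch that collects quads as two triangles each in one float array. */
final class QuadBatch {
    static final int FLOATS_PER_VERTEX = 9; // x, y, z, r, g, b, a, u, v
    static final int VERTICES_PER_QUAD = 6; // two triangles

    private final float[] vertices;
    private int used; // number of floats written so far

    QuadBatch(int maxQuads) {
        vertices = new float[maxQuads * VERTICES_PER_QUAD * FLOATS_PER_VERTEX];
    }

    /** Appends an axis-aligned quad at z = 0 with one color for all corners. */
    void quad(float x, float y, float w, float h, float r, float g, float b, float a) {
        if (used + VERTICES_PER_QUAD * FLOATS_PER_VERTEX > vertices.length) {
            throw new IllegalStateException("batch is full, flush it first");
        }
        float x2 = x + w;
        float y2 = y + h;
        // First triangle
        vertex(x, y, 0f, 0f, r, g, b, a);
        vertex(x2, y, 1f, 0f, r, g, b, a);
        vertex(x2, y2, 1f, 1f, r, g, b, a);
        // Second triangle
        vertex(x2, y2, 1f, 1f, r, g, b, a);
        vertex(x, y2, 0f, 1f, r, g, b, a);
        vertex(x, y, 0f, 0f, r, g, b, a);
    }

    private void vertex(float x, float y, float u, float v, float r, float g, float b, float a) {
        vertices[used++] = x;
        vertices[used++] = y;
        vertices[used++] = 0f; // z
        vertices[used++] = r;
        vertices[used++] = g;
        vertices[used++] = b;
        vertices[used++] = a;
        vertices[used++] = u;
        vertices[used++] = v;
    }

    /** Backing array; only the first floatCount() entries are valid. */
    float[] data() { return vertices; }

    int floatCount() { return used; }

    int vertexCount() { return used / FLOATS_PER_VERTEX; }

    /** Starts a new batch, reusing the same array. */
    void reset() { used = 0; }
}
```

With this layout, 10,000 quads are 60,000 vertices, i.e. 540,000 floats or about 2.16 MB per upload.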

Guerra24 · February 16, 2020, 6:30pm

Yeah, Guerra24#9300. As long as I'm online, you can ask me anything.