Greetings JGO Community,
I’m stuck on a problem with glReadPixels on OpenGL ES.
I need frame data in single-channel grayscale format for some computer vision (CV) work. I’m using a custom shader, an FBO and two PBOs to get it. The data I’m rendering is the camera view on Android.
The flow of the program is as follows.
- bind the generated FBO
- draw() to the FBO
- bind PBO and glReadPixels()
- bind PBO from previous frame and glMapBufferRange()
- process the provided pixel data from glMapBufferRange()
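The double buffering in the steps above can be sketched as plain index bookkeeping, with no GL calls involved. The class name DualBuffer and the method getPreviousID() are hypothetical and only illustrate the idea; the ids are ordinary ints standing in for the values returned by glGenBuffers.

```java
// Hypothetical helper: tracks which of two GL object ids is "current".
// The ids are plain ints here; in the real flow they would come from glGenBuffers.
final class DualBuffer {
    private final int[] ids;
    private int current = 0;

    DualBuffer(int firstId, int secondId) {
        this.ids = new int[] { firstId, secondId };
    }

    /** Id to bind for this frame's glReadPixels. */
    int getID() {
        return ids[current];
    }

    /** Id that was filled during the previous frame and can now be mapped. */
    int getPreviousID() {
        return ids[1 - current];
    }

    /** Exchange the roles of the two ids; call once per frame. */
    void swap() {
        current = 1 - current;
    }
}
```

With two ids, the buffer that receives this frame's glReadPixels is never the one being mapped, so the map only ever touches data that was requested a frame earlier.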
I can confirm that the process itself works. What I’d like to know is whether anything can be done to improve the performance. I’ll post some of the code I’m using so we can all follow.
The PBO generator code
public void setupPBO() {
    final int[] pbuffers = new int[2];
    GLES30.glGenBuffers(2, pbuffers, 0);
    for (int i = 0; i < pbuffers.length; i++) {
        GLES30.glBindBuffer(GLES30.GL_PIXEL_PACK_BUFFER, pbuffers[i]);
        GLES30.glBufferData(GLES30.GL_PIXEL_PACK_BUFFER, width * height, null, GLES30.GL_DYNAMIC_READ);
        GLES30.glBindBuffer(GLES30.GL_PIXEL_PACK_BUFFER, 0);
    }
    pbo_id[PBO_PRIMARY_ID] = pbuffers[0];
    pbo_id[PBO_SECONDARY_ID] = pbuffers[1];
}
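A note on the width * height size passed to glBufferData: it is only the exact readback size when every row is already a multiple of GL_PACK_ALIGNMENT, which defaults to 4. The sketch below works through that arithmetic for one byte per pixel. The class and method names are hypothetical, and it assumes the pack alignment has not been changed with glPixelStorei.

```java
// Hypothetical helper for sizing a pixel pack buffer.
final class PackSize {
    /** Bytes per row after padding up to the pack alignment (the GL default is 4). */
    static int rowStride(int width, int bytesPerPixel, int packAlignment) {
        int unpadded = width * bytesPerPixel;
        return ((unpadded + packAlignment - 1) / packAlignment) * packAlignment;
    }

    /** Conservative buffer size: every row, including the last, is counted at full stride. */
    static int bufferSize(int width, int height, int bytesPerPixel, int packAlignment) {
        return rowStride(width, bytesPerPixel, packAlignment) * height;
    }
}
```

For a 480 x 360 single-channel frame the row is 480 bytes, already a multiple of 4, so the size is 480 * 360 = 172800 bytes and matches width * height. A width of 481 would pad each row to 484 bytes, and width * height would then be too small.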
The PBO bind/read code. Before this call is made, I bind the FBO that was rendered into during the previous frame.
public void bindReadSwapPBO() {
    GLES30.glBindBuffer(GLES30.GL_PIXEL_PACK_BUFFER, pbo_id[currentBuffer]);
    GLES30.glReadBuffer(GLES30.GL_COLOR_ATTACHMENT0);
    // glReadPixels is done from the JNI layer. Only read single channel GL_RED
    // This blocks for up to 50ms. Should be an Async call?
    JNI.glReadPixels(0, 0, width, height, GL_RED, GL_UNSIGNED_BYTE, 0);
    GLES30.glBindBuffer(GLES30.GL_PIXEL_PACK_BUFFER, 0);
    final int prevBuffer = previousBuffer;
    previousBuffer = currentBuffer;
    currentBuffer = prevBuffer;
}
This code grabs the data from the PBO. I can confirm that it works properly and that the call takes virtually 0 ms.
public void bindMapPBO() {
    GLES30.glBindBuffer(GLES30.GL_PIXEL_PACK_BUFFER, pbo_id[currentBuffer]);
    // read our data from the PBO.
    JNI.glMapBufferRange(GL_PIXEL_PACK_BUFFER, 0, width * height, GL_MAP_READ_BIT);
    GLES30.glUnmapBuffer(GLES30.GL_PIXEL_PACK_BUFFER);
    GLES30.glBindBuffer(GLES30.GL_PIXEL_PACK_BUFFER, 0);
}
This is where the performance problem comes from. I’m currently reading back 480 x 360 pixels of single-channel grayscale (calculated in a shader). I’ve run some benchmarks and the results are below.
- 40-50ms -> JNI.glReadPixels(0, 0, width, height, GL_RED, GL_UNSIGNED_BYTE, 0);
- 0-1ms -> JNI.glMapBufferRange(GL_PIXEL_PACK_BUFFER, 0, width * height, GL_MAP_READ_BIT);
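Numbers like these can be collected by wrapping each call in a wall-clock timer. The helper below is a minimal sketch of that idea and is not the benchmark code used for the figures above: CallTimer is a hypothetical name, and the Runnable stands in for whichever GL call is being measured. It only captures how long the calling thread is held up; GPU work that continues after the call returns is not included.

```java
// Hypothetical helper: measures how long a call blocks the calling thread.
final class CallTimer {
    /** Runs the call once and returns the elapsed wall-clock time in milliseconds. */
    static double timeMillis(Runnable call) {
        long start = System.nanoTime();
        call.run();
        return (System.nanoTime() - start) / 1_000_000.0;
    }
}
```

A call such as CallTimer.timeMillis(() -> JNI.glReadPixels(0, 0, width, height, GL_RED, GL_UNSIGNED_BYTE, 0)) would then give the per-frame blocking time on the render thread.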
My understanding is that glReadPixels into a bound PBO is not meant to be a blocking call, but for whatever reason it blocks here (and performs far worse than reading straight from the FBO without a PBO). glMapBufferRange seems to behave as expected and returns the required data properly.
The only cause I can think of is that I’m using GL_RED and reading back a single channel, but that still doesn’t explain why glReadPixels is blocking.
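One way to test the GL_RED theory: for a normalized fixed-point surface, OpenGL ES 3.0 only guarantees two format/type pairs for glReadPixels, namely GL_RGBA with GL_UNSIGNED_BYTE and one implementation-chosen pair reported by GL_IMPLEMENTATION_COLOR_READ_FORMAT and GL_IMPLEMENTATION_COLOR_READ_TYPE. The sketch below only does the comparison; the class and method names are hypothetical. On a device the last two arguments would come from GLES30.glGetIntegerv with the FBO bound. Whether a mismatch is what causes the stall here is a guess, not something this check proves.

```java
// Hypothetical check: is the requested glReadPixels format/type pair one of the
// two pairs that OpenGL ES 3.0 guarantees for a normalized fixed-point surface?
final class ReadFormatCheck {
    static final int GL_RED = 0x1903;
    static final int GL_RGBA = 0x1908;
    static final int GL_UNSIGNED_BYTE = 0x1401;

    /**
     * implFormat and implType are the values queried from
     * GL_IMPLEMENTATION_COLOR_READ_FORMAT and GL_IMPLEMENTATION_COLOR_READ_TYPE.
     */
    static boolean isGuaranteedCombo(int format, int type, int implFormat, int implType) {
        boolean isRgba8 = format == GL_RGBA && type == GL_UNSIGNED_BYTE;
        boolean isImplementationChoice = format == implFormat && type == implType;
        return isRgba8 || isImplementationChoice;
    }
}
```

If the driver does not report GL_RED / GL_UNSIGNED_BYTE for the R8 attachment, the pair is outside what the spec guarantees, and it would be worth timing a GL_RGBA / GL_UNSIGNED_BYTE readback for comparison (the PBO would then need width * height * 4 bytes).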
Devices I’ve used for benchmarking (the behaviour is consistent across all of them), with the glReadPixels time on each:
- HTC One M8s (40-50ms)
- Nexus 5x (20-30ms)
- Google Pixel (15-30ms)
The FBO setup code
public void setupFBO() {
    final int[] values = new int[1];
    GLES30.glGenTextures(1, values, 0);
    GLES30.glBindTexture(GLES30.GL_TEXTURE_2D, values[0]);
    // we only want GRAYSCALE / Single channel texture
    GLES30.glTexImage2D(GLES30.GL_TEXTURE_2D, 0, GLES30.GL_R8, texWidth, texHeight, 0, GLES30.GL_RED, GLES30.GL_UNSIGNED_BYTE, null);
    GLES30.glTexParameteri(GLES30.GL_TEXTURE_2D, GLES30.GL_TEXTURE_WRAP_S, GLES30.GL_CLAMP_TO_EDGE);
    GLES30.glTexParameteri(GLES30.GL_TEXTURE_2D, GLES30.GL_TEXTURE_WRAP_T, GLES30.GL_CLAMP_TO_EDGE);
    GLES30.glTexParameteri(GLES30.GL_TEXTURE_2D, GLES30.GL_TEXTURE_MIN_FILTER, GLES30.GL_NEAREST);
    GLES30.glTexParameteri(GLES30.GL_TEXTURE_2D, GLES30.GL_TEXTURE_MAG_FILTER, GLES30.GL_NEAREST);
    this.tex_id[0] = values[0];
    GLES30.glGenFramebuffers(1, values, 0);
    GLES30.glBindFramebuffer(GLES30.GL_FRAMEBUFFER, values[0]);
    this.fbo_id[0] = values[0];
    GLES30.glFramebufferTexture2D(GLES30.GL_FRAMEBUFFER, GLES30.GL_COLOR_ATTACHMENT0, GLES30.GL_TEXTURE_2D, this.tex_id[0], 0);
    final int status = GLES30.glCheckFramebufferStatus(GLES30.GL_FRAMEBUFFER);
    if (status != GLES30.GL_FRAMEBUFFER_COMPLETE) {
        Debug.LogError("Framebuffer incomplete. Status: " + status);
    }
    GLES30.glBindFramebuffer(GLES30.GL_FRAMEBUFFER, 0);
}
The full render code. I’ve deconstructed as much of the logic and flow as possible for clarity.
// bind the offscreen FBO and render the current camera frame
GLES30.glBindFramebuffer(GLES30.GL_FRAMEBUFFER, dualFBO.getID());
camera.draw(ShaderType.GRAYSCALE);
// ping-pong the FBO IDs
dualFBO.swap();
// dualFBO will now return the ID for last frame
GLES30.glBindFramebuffer(GLES30.GL_FRAMEBUFFER, dualFBO.getID());
// bind the current PBO and submit (meant to be async) glReadPixels
GLES30.glBindBuffer(GLES30.GL_PIXEL_PACK_BUFFER, dualPBO.getID());
GLES30.glReadBuffer(GLES30.GL_COLOR_ATTACHMENT0);
// this call locks for 30-50ms... why? (meant to be async???)
JNI.glReadPixels(0, 0, width, height, GL_RED, GL_UNSIGNED_BYTE, 0);
GLES30.glBindBuffer(GLES30.GL_PIXEL_PACK_BUFFER, 0);
// ping-pong the PBO IDs.
dualPBO.swap();
GLES30.glBindBuffer(GLES30.GL_PIXEL_PACK_BUFFER, dualPBO.getID());
// this call is instant
JNI.glMapBufferRange(GL_PIXEL_PACK_BUFFER, 0, width * height, GL_MAP_READ_BIT);
GLES30.glUnmapBuffer(GLES30.GL_PIXEL_PACK_BUFFER);
GLES30.glBindBuffer(GLES30.GL_PIXEL_PACK_BUFFER, 0);
// the CV stuff, which now has data from the glMapBufferRange
JNI.processCV();
Any help in this matter would be highly appreciated! I’ve never had to read data back from OpenGL every frame in real time, so I’m at my wits’ end here. Below is some more code so you guys can get an idea of how the logic flows.