I was wondering about this and I’m hoping you guys can help me out.
I hope I can explain this, so please bear with me.
I come from the world of DirectX 11, where you can set up dynamic vertex and index buffers.
These have ‘lock flags’ (map flags), which basically tell the GPU how the data is going to be treated. When you set up dynamic vertex and index buffers you use the DISCARD and NOOVERWRITE flags. The DISCARD flag says: trash everything in the buffer, we don’t care about it. The NOOVERWRITE flag is a promise that we will not touch anything the GPU is already using.
Using both, you can create a circular buffer system, since DISCARD does this special thing where it hands you a new place in memory for the GPU to use. Something like the below is a way to get a circular buffer:
//Code done when needing to draw, places the draw data directly into the vertex buffer
if(vertexBufferPosition + verticesPerQuad > maxVertexAllowed)
{
//Says that we are done manipulating draw data
batchContext->Unmap(vertexBuffer, 0);
//Go ahead and send everything off to the GPU and draw all that needs to be drawn
batchContext->IASetPrimitiveTopology(D3D11_PRIMITIVE_TOPOLOGY_TRIANGLELIST);
batchContext->DrawIndexed(indexCount, 0, 0);
indexCount = 0;
//Reset our state and, since we were out of space, ask for another spot in memory (setting the map flag to DISCARD)
vertexBufferPosition = 0;
mappingFlag = D3D11_MAP_WRITE_DISCARD;
//Re-map the buffer so we can add more data; since we now have a fresh buffer, later maps use NO_OVERWRITE
batchContext->Map(vertexBuffer, 0, mappingFlag, 0, &mapVertexResouce);
mappingFlag = D3D11_MAP_WRITE_NO_OVERWRITE;
}
//Draw data placed directly into the vertex buffer
addDrawData(x,y,width,height);
So my question: can this be done in OpenGL? Are there such things as lock flags and mapping?
I understand how to place data into a vertex buffer based on some tutorials, but that was for static buffers.
How should I be placing data into a buffer when it needs to change every frame or so, for things such as a sprite batcher or particle emitter? I should also mention I’m trying to target OpenGL 3.3 and up.
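The wrap-around logic in the DirectX snippet can be separated from any graphics API. The following is only a sketch of that bookkeeping in plain Java: `RingCursor`, `MapMode`, `reserve` and `lastStart` are hypothetical names invented for illustration, and nothing here talks to a GPU. It assumes the very first map should discard as well.

```java
/** Toy model of the DISCARD / NO_OVERWRITE bookkeeping. No graphics API is called. */
final class RingCursor {
    enum MapMode { DISCARD, NO_OVERWRITE }

    private final int capacity;         // total vertices the buffer can hold
    private int position = 0;           // next free vertex slot
    private int lastStart = 0;          // where the most recent reservation began
    private boolean neverMapped = true; // assumption: the first map discards too

    RingCursor(int capacity) {
        this.capacity = capacity;
    }

    /** Reserves room for count vertices and reports which map mode that needs. */
    MapMode reserve(int count) {
        if (count > capacity) {
            throw new IllegalArgumentException("batch is larger than the whole buffer");
        }
        MapMode mode = MapMode.NO_OVERWRITE;
        if (neverMapped || position + count > capacity) {
            // Out of room: a real renderer would issue its draw call here,
            // then ask for fresh storage and start again at the front.
            position = 0;
            mode = MapMode.DISCARD;
            neverMapped = false;
        }
        lastStart = position;
        position += count;
        return mode;
    }

    int lastStart() {
        return lastStart;
    }

    public static void main(String[] args) {
        RingCursor ring = new RingCursor(10);
        for (int i = 0; i < 4; i++) {
            MapMode mode = ring.reserve(4);
            System.out.println("quad " + i + " starts at vertex " + ring.lastStart() + " using " + mode);
        }
    }
}
```

The point of keeping this separate is that the same cursor can drive whichever mapping call the target API ends up offering.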
DirectX is made for game development if I’m correct.
OpenGL has nothing to do with games. It only renders primitives that you specify.
You can use immediate mode, vertex arrays, display lists, vertex buffer objects, frame buffer objects and some more.
If you want to make something like “flags” in OpenGL, you would need to code it yourself. After you send data to the GPU, you can edit it with glMapBuffer, which takes the target plus an access flag such as GL_WRITE_ONLY and gives you a ByteBuffer. You can then edit it to your liking.
My answer is not what you wanted, because I don’t really understand what is going on in DirectX.
OpenGL is very different to DirectX. I’d recommend reading this book.
Buffer data is added to a buffer object using the command [icode]glBufferData(int target, Buffer buffer, int usage)[/icode]. The usage is what you’re wanting to change. If you want to update the buffer data every frame, you’d want to change it to GL_STREAM_DRAW. Then you can simply overwrite the data using either [icode]glBufferData[/icode] or [icode]glBufferSubData[/icode]. This page on this wiki is quite good.
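As a rough illustration of the CPU side of that approach, the sketch below packs quads into a direct `FloatBuffer` that could then be handed to an upload call. `QuadStaging`, `newStagingBuffer` and `putQuad` are hypothetical names, and the actual upload (for example through [icode]glBufferSubData[/icode]) is only indicated in a comment so the block runs without a GL context.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

/** CPU-side staging for streamed quads. The GL upload itself is not performed here. */
final class QuadStaging {
    static final int FLOATS_PER_QUAD = 4 * 3; // 4 vertices with x, y, z each

    /** Allocates a direct, native-order buffer, the kind of buffer native bindings expect. */
    static FloatBuffer newStagingBuffer(int maxQuads) {
        return ByteBuffer.allocateDirect(maxQuads * FLOATS_PER_QUAD * Float.BYTES)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
    }

    /** Appends one quad in perimeter order; z is fixed at 1.0f as in the thread's code. */
    static void putQuad(FloatBuffer dst, float x, float y, float w, float h) {
        dst.put(x).put(y).put(1.0f);
        dst.put(x + w).put(y).put(1.0f);
        dst.put(x + w).put(y + h).put(1.0f);
        dst.put(x).put(y + h).put(1.0f);
    }

    public static void main(String[] args) {
        FloatBuffer staging = newStagingBuffer(2);
        staging.clear();                 // start of frame: position = 0
        putQuad(staging, 10f, 20f, 32f, 32f);
        putQuad(staging, 0f, 0f, 1f, 1f);
        staging.flip();                  // limit = floats written, position = 0
        // Assumed upload step with a bound GL_STREAM_DRAW buffer, not executed here:
        // glBufferSubData(GL_ARRAY_BUFFER, 0, staging);
        System.out.println("floats ready for upload: " + staging.remaining());
    }
}
```

Calling `flip()` before the upload matters because the upload would read from the buffer's position up to its limit.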
Before things go in the wrong direction: it kinda seems like I’m implying that I have no knowledge of how OpenGL works.
I know about VBOs, VAOs, IBOs, etc. I know DirectX and OpenGL are very different.
My question, unfortunately, is all over the place. What I really want to know, from a performance standpoint for 2D work, is: what is the best way to send data to the GPU?
As Troubleshoots pointed out, I would want to use the usage flag GL_STREAM_DRAW, because I plan for the data to change at least every frame. But I also want to know everyone’s thoughts on the type of buffer system to use.
When I mentioned DirectX, the example I gave was a circular buffer. Essentially we were always able to stream data to the GPU without blocking calls, since we could get a new location in memory to use. I’m not sure if this is possible in OpenGL, that is, whether there are ‘special’ glMapBuffer() params or something that says: if we are out of space in the buffer, just give us a new one. If it is, awesome; how can (and should) it be done?
If not, then let’s talk about a double buffer system, where we load data into one buffer, send it off to the GPU, fill the second buffer with data, and rinse and repeat.
I guess what I’m truly asking at the end of the day is: if I cannot make a circular buffer (with ‘special’ params, etc.), how should I be doing a double buffer system? Not in the sense of what needs to be done at a high level, but what are the things I need to worry about?
How should my OpenGL calls look? When am I going to be blocked by the GPU? Things of that nature.
Since DirectX and OpenGL are so different, and because you have so many questions, I highly suggest reading an up-to-date book about OpenGL.
Sadly, they’re pretty hard to get, but from what I hear the OpenGL SuperBible 6th edition is quite good. The OpenGL 4.0 Shading Language Cookbook is also a great piece; however, they’re quite expensive. :’(
Could you explain what you mean by “blocking calls”?
I don’t know how exactly [icode]glMapBuffer()[/icode] works so I can’t help you with that one.
OpenGL contexts are automatically double buffered, and by default you’re always drawing to the back buffer. After you’re done with your frame you can swap the buffers (LWJGL does this for you with [icode]Display.update()[/icode]), so the user sees the rendered frame in the front buffer while you work on the back buffer, which still holds the previous frame (that’s why you have to clear the color/depth/stencil buffers before drawing).
glMapBufferRange with the GL_MAP_UNSYNCHRONIZED_BIT flag has the least overhead of all VBO operations that move data around.
As it’s unsynchronized (the driver doesn’t lock anything at all), you need a queue of at least 4-5 VBOs and take/release one every frame, or you’ll get corrupted VRAM.
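A minimal sketch of such a queue, under the assumption that a simple round-robin over pre-created buffers is enough: `BufferRotation` and `acquire` are hypothetical names, and the integer handles stand in for buffer names that would come from glGenBuffers. Rotation alone only makes reuse of a buffer the GPU is still reading unlikely; it does not prove the GPU is finished with it.

```java
/** Round-robin rotation over a fixed set of buffer handles. Handles are placeholders. */
final class BufferRotation {
    private final int[] handles; // hypothetical buffer object names
    private int next = 0;        // index of the buffer to hand out next

    BufferRotation(int... handles) {
        if (handles.length < 2) {
            throw new IllegalArgumentException("need at least two buffers to rotate");
        }
        this.handles = handles.clone();
    }

    /** Returns the buffer to fill this frame and advances to the following one. */
    int acquire() {
        int handle = handles[next];
        next = (next + 1) % handles.length;
        return handle;
    }

    public static void main(String[] args) {
        BufferRotation rotation = new BufferRotation(11, 12, 13, 14);
        for (int frame = 0; frame < 6; frame++) {
            System.out.println("frame " + frame + " writes into buffer " + rotation.acquire());
        }
    }
}
```

With four handles, a given buffer comes back only after three other frames have been submitted, which is the slack the unsynchronized mapping relies on.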
I have the OpenGL SuperBible 5th edition, but like you mention it’s not up to date.
Also, for me books are hit and miss. They are great for learning how to do something, but I feel like they don’t get down to the nitty-gritty stuff I’m looking for, IMO.
By blocking calls I mean the mutex lock (at the hardware level) that is taken by the GPU when it is using a buffer’s data. I understand that the GPU can stall and shut out the CPU until it has finished the commands it is running with the buffer.
The idea behind it is pretty simple in all honesty
You have 2 Vertex Buffers and you alternate between them each frame. So beginning with frame 1 you have 2 blank buffers. You fill one and send it to the GPU for drawing. Then on the next frame you fill the next Vertex Buffer, while the GPU is still working on the first one. By doing this you are supposed to be able to prevent the GPU from locking down on the Vertex Buffers. I have read that this works great on both ATI and Nvidia cards
At both Riven and Panda, since you guys kinda go together on this one
I have read that you can do a technique called ‘orphaning’ using either the glMapBuffer call or the glMapBufferRange call
The idea is that, before mapping, you call glBufferData with a null pointer on the vertex buffer. What this is supposed to do is tell the driver: hey, give me a fresh spot in memory that I can use, because I need it now. Then you follow up with glMapBuffer() and place all your data in the vertex buffer, send it off to the GPU, etc. This allows you to have multiple vertex buffers in flight, which can then be processed by the GPU. I heard you can also do this with the glMapBufferRange call using a couple of flags, but I have also heard that glMapBufferRange can be a real performance killer if you do it wrong.
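To make the orphaning idea concrete, here is a toy model in plain Java. It is not OpenGL and does not describe what any particular driver does internally: `OrphanModel`, `orphan`, `map` and `snapshotForDraw` are hypothetical names, and heap `ByteBuffer`s stand in for GPU memory. It only shows the intended effect: re-specifying the store detaches the old memory, which a pending draw can keep reading, while new writes land in fresh memory of the same size.

```java
import java.nio.ByteBuffer;

/** Toy model of buffer orphaning. Heap buffers stand in for driver-managed memory. */
final class OrphanModel {
    private final int size;     // orphaning keeps the same size as the original store
    private ByteBuffer storage; // memory currently attached to the buffer object

    OrphanModel(int size) {
        this.size = size;
        this.storage = ByteBuffer.allocate(size);
    }

    /** What an already-submitted draw call would keep reading from. */
    ByteBuffer snapshotForDraw() {
        return storage;
    }

    /** Stand-in for re-specifying the store with no data: old memory is detached, not erased. */
    void orphan() {
        storage = ByteBuffer.allocate(size);
    }

    /** Stand-in for mapping the buffer for writing. */
    ByteBuffer map() {
        return storage;
    }

    public static void main(String[] args) {
        OrphanModel vbo = new OrphanModel(4);
        vbo.map().put(0, (byte) 7);                 // frame 1 data
        ByteBuffer inFlight = vbo.snapshotForDraw();
        vbo.orphan();                               // ask for fresh memory
        vbo.map().put(0, (byte) 9);                 // frame 2 data
        System.out.println("draw in flight still sees " + inFlight.get(0));
        System.out.println("new store holds " + vbo.map().get(0));
    }
}
```

In this model the CPU never waits, because it never writes into memory a pending draw still references.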
That is why I’m kinda leaning towards the double VBO approach, but I’m not sure how I should be setting it up, I guess. My ideas on the implementation are shaky. Does anyone have some experience with this?
I was thinking something like
public void endRender()
{
/* Other required items above, core VBO stuff below*/
//Enable the Vertex Buffer, bind it for the Shader
glEnableVertexAttribArray(0);
glBindBuffer(GL_ARRAY_BUFFER, currentVertexBuffer);
glVertexAttribPointer(0, 3, GL_FLOAT, false, 0, 0);
//Enable the UV Buffer, bind it for the Shader
glEnableVertexAttribArray(1);
glBindBuffer(GL_ARRAY_BUFFER, currentUvBuffer);
glVertexAttribPointer(1, 2, GL_FLOAT, false, 0, 0);
//Bind the STATIC Index Buffer (Since it is always the same slot wise)
//Draw the number of elements needed
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, indexBuffer);
GL11.glDrawElements(GL_TRIANGLES, numberOfIndex, GL11.GL_UNSIGNED_INT, 0 );
//Disable all the attributes for safety
glDisableVertexAttribArray(0);
glDisableVertexAttribArray(1);
glDisableVertexAttribArray(2);
//Swap the buffers
bufferToUse = 1 - bufferToUse; //Toggle between index 0 and 1
currentVertexBuffer = vertexBuffers.get(bufferToUse);
currentUvBuffer = uvBuffers.get(bufferToUse);
}
public void addDrawData(/*Draw data params */)
{
if(/* The vertex buffer limit is going to be over or a texture swap is needed*/)
{
//Send stuff to the GPU
endRender();
//Reset / clear the buffer. Prepare it for writing
currentVertexBuffer.clear();
currentUvBuffer.clear();
}
// Add the data for drawing into the buffer
currentVertexBuffer.put(/*Some Vertex data for vert 1*/);
currentVertexBuffer.put(/*Some Vertex data for vert 2*/);
currentVertexBuffer.put(/*Some Vertex data for vert 3*/);
currentVertexBuffer.put(/*Some Vertex data for vert 4*/);
currentUvBuffer.put(/*Some UV data for vert 1*/);
currentUvBuffer.put(/*Some UV data for vert 2*/);
currentUvBuffer.put(/*Some UV data for vert 3*/);
currentUvBuffer.put(/*Some UV data for vert 4*/);
}
That, and who cares that glMapBufferRange is slow if you use it wrongly? A supercar chokes on diesel too. I’ve never heard that as an argument against it
Okay, so while I’m still a little uneasy about glMapBufferRange, I figured I’d give it a try. Now I’m not sure if I did this right, but…
This is what I got; feel free to tell me I’m going about this all wrong.
//My main draw method
public void Render()
{
batcher.beginRender(); //Sanity checks that we are calling the right methods (don't call endRender before beginRender, etc.)
//Queue the VBO with draw data!
for(int i = 0; i < 6000; i++)
batcher.draw(randX(), randY(), 32.0f, 32.0f);
//End the render; draw whatever we got in the VBO
batcher.endRender();
}
/* All below is a part of a Batcher class; it has a Vertex Buffer and a UV Buffer set to GL_STATIC_DRAW. It has an index buffer that is static and filled
from the start based on the max size of the vertex buffer. Let's assume it's 10000 verts */
//My Draw method, get everything to the screen
public void endRender()
{
//Bind the texture
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, currentTexture);
glUniform1i(texture, 0);
//Enable the vertex buffer
glEnableVertexAttribArray(0);
glBindBuffer(GL_ARRAY_BUFFER, vertexBufferHandle);
glVertexAttribPointer(0, 3, GL_FLOAT, false, 0, 0);
//Enable the uv buffer
glEnableVertexAttribArray(1);
glBindBuffer(GL_ARRAY_BUFFER, uvBufferHandle);
glVertexAttribPointer(1, 2, GL_FLOAT, false, 0, 0);
//Draw everything needed based on the index buffer
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, indexBufferHandle);
GL11.glDrawElements(GL_TRIANGLES, indexCount, GL11.GL_UNSIGNED_INT, 0 );
//Safety disable
glDisableVertexAttribArray(0);
glDisableVertexAttribArray(1);
}
//My put some data into the VBOs method; places data into the Vertex and UV vbo; Checks if orphans need making
public void addDrawData(float x, float y, float width, float height)
{
//Check if we will be over the limit
if(vertexBufferPosition + 4 > vertexBufferMax)
{
//End the render; draw stuff
endRender();
//Orphan the buffer
glBufferData(GL_ARRAY_BUFFER, 0, GL_STREAM_DRAW);
vertexBufferPosition = 0;
indexCount = 0;
}
//Map the vertex buffer
vertexBuffer = glMapBufferRange(GL_ARRAY_BUFFER, vertexBufferPosition, 4, GL_MAP_WRITE_BIT | GL_MAP_UNSYNCHRONIZED_BIT, null).asFloatBuffer();
vertexBuffer.put(x);
vertexBuffer.put(y);
vertexBuffer.put(x + width);
vertexBuffer.put(y + height);
glUnmapBuffer(GL_ARRAY_BUFFER);
//Map the UV buffer
uvBuffer = glMapBufferRange(GL_ARRAY_BUFFER, uvBufferPosition, 8, GL_MAP_WRITE_BIT | GL_MAP_UNSYNCHRONIZED_BIT, null).asFloatBuffer();
uvBuffer.put(0);
uvBuffer.put(0);
uvBuffer.put(0);
uvBuffer.put(1);
uvBuffer.put(0);
uvBuffer.put(1);
uvBuffer.put(1);
uvBuffer.put(1);
glUnmapBuffer(GL_ARRAY_BUFFER);
//Add 4; because 4 vertex per quad
vertexBufferPosition += 4;
//Add 8; because 2 per vertex and there are 4 vertex per quad
uvBufferPosition += 8;
//Add 6; because of 6 indices per quad
indexCount += 6;
}
Also, is my double buffer idea correct? I have never set one up before, so I’m not sure if I got the implementation down.
Yeah, I woke up this morning, thought about it, and was like ‘ooo, you messed up’ haha.
So like you said, I think I should really be doing this. The one thing I’m not too sure about is the unmap part, since there is no specific link to the buffer I want to unmap.
Maybe I’m missing something, or maybe I have to interleave them for this to work?
//My main draw method
public void Render()
{
batcher.beginRender(); //Sanity checks that we are calling the right methods (don't call endRender before beginRender, etc.)
//Queue the VBO with draw data!
for(int i = 0; i < 6000; i++)
batcher.draw(randX(), randY(), 32.0f, 32.0f);
//End the render; draw whatever we got in the VBO
batcher.endRender();
}
/* All below is a part of a Batcher class; it has a Vertex Buffer and a UV Buffer set to GL_STATIC_DRAW. It has an index buffer that is static and filled
from the start based on the max size of the vertex buffer. Let's assume it's 10000 verts */
//My begin render method
public void beginRender()
{
/*Sanity Checks */
//Map the vertex buffer
vertexBuffer = glMapBufferRange(GL_ARRAY_BUFFER, vertexBufferPosition, 12, GL_MAP_WRITE_BIT | GL_MAP_UNSYNCHRONIZED_BIT, null).asFloatBuffer();
//Map the UV buffer
uvBuffer = glMapBufferRange(GL_ARRAY_BUFFER, uvBufferPosition, 8, GL_MAP_WRITE_BIT | GL_MAP_UNSYNCHRONIZED_BIT, null).asFloatBuffer();
}
//My Draw method, get everything to the screen
public void endRender()
{
//Bind the texture
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, currentTexture);
glUniform1i(texture, 0);
//Enable the vertex buffer
glEnableVertexAttribArray(0);
glBindBuffer(GL_ARRAY_BUFFER, vertexBufferHandle);
glVertexAttribPointer(0, 3, GL_FLOAT, false, 0, 0);
//Enable the uv buffer
glEnableVertexAttribArray(1);
glBindBuffer(GL_ARRAY_BUFFER, uvBufferHandle);
glVertexAttribPointer(1, 2, GL_FLOAT, false, 0, 0);
//Draw everything needed based on the index buffer
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, indexBufferHandle);
GL11.glDrawElements(GL_TRIANGLES, indexCount, GL11.GL_UNSIGNED_INT, 0 );
//Safety disable
glDisableVertexAttribArray(0);
glDisableVertexAttribArray(1);
}
//My put some data into the VBOs method; places data into the Vertex and UV vbo; Checks if orphans need making
public void addDrawData(float x, float y, float width, float height)
{
//Check if we will be over the limit
if(vertexBufferPosition + 12 > vertexBufferMax)
{
//Unmap the buffers; called twice: once for the vertex buffer, once for the UV buffer; may need to interleave?
glUnmapBuffer(GL_ARRAY_BUFFER);
glUnmapBuffer(GL_ARRAY_BUFFER);
//End the render; draw stuff
endRender();
//Orphan the buffer
glBufferData(GL_ARRAY_BUFFER, 0, GL_STREAM_DRAW);
vertexBufferPosition = 0;
indexCount = 0;
//Re map the buffers
vertexBuffer = glMapBufferRange(GL_ARRAY_BUFFER, vertexBufferPosition, 12, GL_MAP_WRITE_BIT | GL_MAP_UNSYNCHRONIZED_BIT, null).asFloatBuffer();
uvBuffer = glMapBufferRange(GL_ARRAY_BUFFER, uvBufferPosition, 8, GL_MAP_WRITE_BIT | GL_MAP_UNSYNCHRONIZED_BIT, null).asFloatBuffer();
}
//Put the 4 verts (x, y, and z coords [ Z coords defaults to 1.0f ]) for a quad
//Vert1
vertexBuffer.put(x); vertexBuffer.put(y); vertexBuffer.put(1.0f);
//Vert 2
vertexBuffer.put(x+width); vertexBuffer.put(y);vertexBuffer.put(1.0f);
//Vert 3
vertexBuffer.put(x + width); vertexBuffer.put(y + height); vertexBuffer.put(1.0f);
//Vert 4
vertexBuffer.put(x); vertexBuffer.put(y+height);vertexBuffer.put(1.0f);
//Tex Coord for Vert 1
uvBuffer.put(0); uvBuffer.put(0);
//Tex Coord for Vert 2
uvBuffer.put(0); uvBuffer.put(1);
//Tex Coord for Vert 3
uvBuffer.put(0); uvBuffer.put(1);
//Tex Coord for Vert 4
uvBuffer.put(1); uvBuffer.put(1);
//Add 12; because 4 vertex (With X Y Z coords) per quad
vertexBufferPosition += 12;
//Add 8; because 2 per vertex and there are 4 vertex per quad
uvBufferPosition += 8;
//Add 6; because of 6 indices per quad
indexCount += 6;
}
I was under the impression that you had lots of experience under your belt, given that you came from C/DirectX.
There are quite a few conceptual errors in your latest code dump (like pushing more texcoords than vertices, mapping without binding specific VBOs, combining unsynced mapping and orphaning, never writing indices into your index buffer, interleaving vertex attributes in overlapping mapped buffers while writing into each sequentially, using offsets when using distinct VBOs for your attributes, and mapping teeny tiny ranges that are not even enough to hold one quad). I’d advise you to go all the way back to vertex arrays (not VBOs, not mapping them, and not even close to unsynced mapped VBOs) and get it working from there (starting with 1 triangle or quad), then work your way up again if the need arises.
Coming from C++ / DirectX gives no real footing when dealing with OpenGL. It’s a different API, and things are handled differently. Yes, the concepts are the same, but the implementation is another thing.
Pushing more texcoords than vertices
This is my fault, when I think of Vertex I think of the Vertex Struct I can do in C++/DirectX. Since I can’t do this in java/openGL, then you are right I am missing 8 more puts (The remaining X/Y coords and the Z coord)
Should be fixed in the above code
Mapping without binding specific VBOs
Maybe I am not thinking of this right, but I would do this in the constructor / init of the Batcher. Please correct me if I’m wrong, but I can/only need to do this (Bind the specific buffers) once right? Or does this need to be done multiple times? Also am I not allowed to bind multiple buffers at once? Again maybe interleaving is the solution to this
Combining unsynced map and orphaning
I’m not too sure about this one, as this topic is about implementing orphaning or a double buffer.
I appreciate the time you’re taking to explain this, unfortunately I guess things are not clicking
Not filling the index buffer
The code is just meant to show the three methods (endRender, beginRender, addDrawData) and how they work together. My idea was to have the index buffer filled once the Batcher class was created. There is no need in my mind to have this refilled with the same data value each time, since there is a max size for each buffer
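For reference, filling such a static index buffer once could look like the sketch below. `QuadIndices` and `build` are hypothetical names. It assumes four vertices per quad written in perimeter order (as in the posted addDrawData) and two triangles per quad, so 10000 vertices correspond to 2500 quads and 15000 indices. The upload into the element array buffer is left out.

```java
/** Builds the repeating index pattern for a batch of quads. The GL upload is not shown. */
final class QuadIndices {
    static final int VERTS_PER_QUAD = 4;
    static final int INDICES_PER_QUAD = 6;

    /** Two triangles per quad: (0, 1, 2) and (2, 3, 0), offset by the quad's first vertex. */
    static int[] build(int maxQuads) {
        int[] indices = new int[maxQuads * INDICES_PER_QUAD];
        for (int quad = 0; quad < maxQuads; quad++) {
            int v = quad * VERTS_PER_QUAD;   // first vertex of this quad
            int i = quad * INDICES_PER_QUAD; // first index slot of this quad
            indices[i] = v;
            indices[i + 1] = v + 1;
            indices[i + 2] = v + 2;
            indices[i + 3] = v + 2;
            indices[i + 4] = v + 3;
            indices[i + 5] = v;
        }
        return indices;
    }

    public static void main(String[] args) {
        int maxVerts = 10000;
        int[] indices = build(maxVerts / VERTS_PER_QUAD);
        System.out.println("index count for " + maxVerts + " vertices: " + indices.length);
    }
}
```

Because the pattern never changes, only the index count passed to the draw call has to vary per batch.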
interleaving vertex attributes in overlapping mapped buffers, while writing into each sequentially
I don’t think I do this? I mentioned that maybe I need to do this because of the mapping call and use of UV coords
using offsets when using distinct VBOs for your attributes
I don’t think I’m interleaving? No need for offsets until I do so, right?
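For comparison, this is roughly what interleaving would involve: a single buffer, one stride, and a per-attribute byte offset. The sketch only computes the numbers and packs floats; `InterleavedLayout` and its members are hypothetical names, and the attribute-pointer calls are shown as comments because they need a live GL context.

```java
import java.nio.FloatBuffer;

/** Layout constants for a position (x, y, z) + texcoord (u, v) interleaved vertex. */
final class InterleavedLayout {
    static final int POSITION_FLOATS = 3;
    static final int UV_FLOATS = 2;
    static final int FLOATS_PER_VERTEX = POSITION_FLOATS + UV_FLOATS;
    static final int STRIDE_BYTES = FLOATS_PER_VERTEX * Float.BYTES;  // bytes from one vertex to the next
    static final int POSITION_OFFSET_BYTES = 0;                       // position comes first
    static final int UV_OFFSET_BYTES = POSITION_FLOATS * Float.BYTES; // texcoord follows the position

    /** Writes one vertex with its texcoord directly behind its position. */
    static void putVertex(FloatBuffer dst, float x, float y, float z, float u, float v) {
        dst.put(x).put(y).put(z).put(u).put(v);
    }

    public static void main(String[] args) {
        FloatBuffer data = FloatBuffer.allocate(2 * FLOATS_PER_VERTEX);
        putVertex(data, 0f, 0f, 1f, 0f, 0f);
        putVertex(data, 32f, 0f, 1f, 1f, 0f);
        data.flip();
        // Assumed attribute setup with the single VBO bound, not executed here:
        // glVertexAttribPointer(0, 3, GL_FLOAT, false, STRIDE_BYTES, POSITION_OFFSET_BYTES);
        // glVertexAttribPointer(1, 2, GL_FLOAT, false, STRIDE_BYTES, UV_OFFSET_BYTES);
        System.out.println("stride = " + STRIDE_BYTES + " bytes, uv offset = " + UV_OFFSET_BYTES + " bytes");
    }
}
```

With two separate VBOs, as in the posted code, stride and offset can both stay 0 because each buffer is tightly packed with a single attribute.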
mapping teeny tiny ranges / not even enough to hold one quad
I’m confused about this one, could you explain? My comments say that the vertex buffer would hold 10000 verts and the UV buffer would hold the required amount. Maybe this was in relation to the first issue
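One likely reading of that remark: the offset and length arguments of glMapBufferRange are measured in bytes, not in floats or vertices, so a length of 12 covers three floats rather than a quad. The sketch below just does that arithmetic; `MapRangeMath`, `quadBytes` and `byteOffset` are hypothetical helper names.

```java
/** Byte arithmetic for mapping ranges; offsets and lengths are counted in bytes. */
final class MapRangeMath {
    /** Bytes needed for one quad of a given attribute. */
    static long quadBytes(int vertsPerQuad, int floatsPerVert) {
        return (long) vertsPerQuad * floatsPerVert * Float.BYTES;
    }

    /** Byte offset of the n-th quad inside a tightly packed buffer. */
    static long byteOffset(int quadIndex, int vertsPerQuad, int floatsPerVert) {
        return quadIndex * quadBytes(vertsPerQuad, floatsPerVert);
    }

    public static void main(String[] args) {
        System.out.println("position bytes per quad: " + quadBytes(4, 3));
        System.out.println("texcoord bytes per quad: " + quadBytes(4, 2));
        System.out.println("byte offset of quad 3 (positions): " + byteOffset(3, 4, 3));
    }
}
```

By this count the posted lengths of 12 and 8 cover three and two floats respectively, and the positions advanced by 12 and 8 per quad are float counts that would need multiplying by 4 before being used as byte offsets.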
Now with that being said, I do see one thing that is wrong, and that is the glMapBufferRange length param usage (pertaining to the latest code). What if I don’t know how much I’m going to map at a given time? What if I want to keep the buffer mapped until I say so (e.g. because I have run out of room or I have to swap a texture)? How is an undefined length done?
That kinda kills the idea of mapping a range though, so would I then just use glMapBuffer? What’s the difference in terms of performance?
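One possible way to handle an unknown length, sketched here as plain bookkeeping: map everything from the current write position to the end of the buffer, write as much as the batch needs, and advance by what was actually written when unmapping. `OpenEndedMap`, `mapRemaining` and `unmap` are hypothetical names, a heap `ByteBuffer` stands in for the buffer object, and this says nothing about how a driver treats large mapped ranges performance-wise.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Toy model: map from the cursor to the end, then advance by the bytes actually written. */
final class OpenEndedMap {
    private final ByteBuffer store; // stand-in for the buffer object's memory
    private int cursor = 0;         // first unused byte
    private ByteBuffer mapped;      // view handed out by the last mapRemaining()

    OpenEndedMap(int capacityBytes) {
        store = ByteBuffer.allocate(capacityBytes);
    }

    /** Maps [cursor, capacity); with GL these two values would be the offset and length. */
    ByteBuffer mapRemaining() {
        ByteBuffer view = store.duplicate();
        view.position(cursor);
        // slice() resets the byte order, so it is applied again here
        mapped = view.slice().order(ByteOrder.nativeOrder());
        return mapped;
    }

    /** Ends the mapping and returns how many bytes the caller wrote. */
    int unmap() {
        int written = mapped.position();
        cursor += written;
        mapped = null;
        return written;
    }

    int cursor() {
        return cursor;
    }

    public static void main(String[] args) {
        OpenEndedMap vbo = new OpenEndedMap(64);
        ByteBuffer range = vbo.mapRemaining();
        range.putFloat(1f).putFloat(2f).putFloat(3f);
        System.out.println("bytes written this batch: " + vbo.unmap());
        System.out.println("bytes still free: " + vbo.mapRemaining().remaining());
    }
}
```

The draw call for the batch then only needs the cursor value from before the mapping and the number of bytes written.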