[LWJGL] glTranslatef is really needed?

marcuiulian13 · June 2, 2012, 8:44pm

Hello everyone!
I am trying to render a tile map using LWJGL. I am using the following code to do this:

protected void rederGL() {
		glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
		
		for(int i=0; i < mapTiles.size(); i++) {
			((Tile) mapTiles.get(i)).renderTile();
		}
		
		for(int i=0; i < grid.size(); i++) {
			((Tile) grid.get(i)).renderTile();
		}
}

public void renderTile() {
		glBindTexture(GL_TEXTURE_2D, tileTextureID);
		glBegin(GL_QUADS);
			glTexCoord2f(0, 0);
			glVertex2f(x, y);
			glTexCoord2f(1, 0);
			glVertex2f(x+width, y);
			glTexCoord2f(1, 1);
			glVertex2f(x+width, y+height);
			glTexCoord2f(0, 1);
			glVertex2f(x, y+height);
		glEnd();
	}

I saw many people using glTranslatef when rendering stuff… I don’t really understand why and how, so i am asking you to say if is better or not to use glTranslatef and if it is, how to use it ?

Thanks! ;D

davedes · June 2, 2012, 9:50pm

glTranslate simply multiplies the current matrix (e.g. the model view matrix) by a translation matrix that looks like:

If that means nothing to you, maybe try starting at the beginning:
http://www.arcsynthesis.org/gltut/
http://ogldev.atspace.co.uk/
http://duriansoftware.com/joe/
http://en.wikibooks.org/wiki/OpenGL_Programming

Immediate mode and fixed function (glBegin/glVertex2f/glTranslate/etc) are old and deprecated… Learn the programmable pipeline instead.

EDIT:
And some LWJGL ports of OpenGL 3+ tutorials:
https://github.com/rosickteam/OpenGL

matheus23 · June 3, 2012, 10:10am

Actually, you don’t need to do any glTranslate stuff. That stuff is - just as davedes said - Matrix transformation stuff. The code you have now, should work, and it is acctually okey for a beginner.

Matrix transformations become intresting, when you want to calculate 3D vertices into screen-space coordinates/pixels, or if you want to rotate your view of the tiles.

theagentd · June 3, 2012, 12:56pm

Of course it’s easy to just do

glVertex2f(x + localX, y + localY); //localX = the relative coordinate (0 or width in your example code)

but what if you have a 3D model with 20 000 vertices? Each vertex has to be positioned relative to where the actual object is supposed to be rendered. Doing

glVertex3f(x + localX, y + localY, z + localZ); //Assuming 3D

for 20 000 vertices is 60 0000 floating point additions done on the CPU, plus that you can’t store the data on the GPU since it has to be updated on the CPU each frame, meaning that we’ll also transfer 20 000 x 3 floats x 4 bytes = 234,375kb per instance. Assuming something like 16 instances (for example 16 player models in a first person shooter game), and we have 320 000 vertices (easily handled by our GPU), 960 000 floating point additions per frame done on the CPU (NOT easily handled by our CPU) and also 219,7265625 MBs/sec of data transfered to our GPU at 60 frames per second.

By storing the vertex data on the GPU and then using glTranslatef() instead to position the whole set to where you want it, you can save so much work from the CPU for something that can be done free with the modelview matrix instead.

For example, if you want to “move the camera”, the proper way of doing this is to apply an offset to every tile in the game.

NO: Add this offset manually to everything you draw with glVertex2f().
YES: Add a single glTranslatef() to the beginning of the game loop and just draw everything with their world coordinates.

Matrices have the ability to move a coordinate from one coordinate system to another. The matrices in OpenGL are meant to transform coordinates from WORLD SPACE (= the positions the vertices have in the game world, not on the screen) all the way to “normalized device coordinates”. Normalized device coordinates are just coordinates which go from -1 to 1 for both X and Y, with (-1, -1) being the bottom left corner of the screen, an (1, 1) being the top right corner of the screen, regardless of the resolution game window. Since we have two matrices in OpenGL, we also get an intermediate coordinate system. For now I’ll stick to 2D to keep things simple.

Basically, you send your coordinates in world space. Send a tile’s position in the world. Never manually calculate where this tile would end up on the screen manually. OpenGL then transforms this by the modelview matrix. This matrix is supposed to take the coordinates from world space to view space. In the case of 2D, view space is most often simply the pixels of the screen, with the top left corner being (0, 0) and the bottom right corner being the (width, height) of the window in pixels. In other words, the modelview matrix takes your coordinates from where they are in the world to where they should be on the screen.

Now, by now we should be done, right? We have the coordinates on the screen of each vertex! But wait, OpenGL expects our coordinates in normalized device coordinates! This is a side-effect of OpenGL being made for 3D games and not primarily for 2D. Luckily there’s a function for this, and you most likely know it by heart: glOrtho(). This is what we put in the projection matrix.

Obviously, you can get the same visual result regardless of if you use glTranslatef() or not, but you get a lot better performance and you’re using OpenGL as you’re supposed to. For your code though, using glTranslatef() to position individual tiles make no sense whatsoever. There is no point in using glTranslatef() for anything that is static, since if it doesn’t move we can just reuse the same data over and over again.

Summary:

Use glTranslatef() when positioning large number of vertices dynamically each frame. In your case, use it for the movement of the camera.
Do not use glTranslatef() when position static data. Just precompute the data and store it on the GPU with a display list or VBO.

Your current code has a problem though: Each tile has its own texture. It is much better to have all tiles in a single texture, since the overhead of binding a new texture for each tile is pretty high, but more importantly it prevents you from being able to batch up your tiles. Preferably you’d want to draw as much as possible with the least number of OpenGL calls and state changes, and you’re doing over 2 calls per vertex at the moment. If your tiles were all in the same texture you could easily batch it into a display list (easy as pie, Google), and draw the whole map with only 2 OpenGL calls regardless of how many tiles you have, and with no state changes per tile. Note however that putting everything in one texture can make it possible to get bleeding between tiles if you don’t handle the texture correctly.

In the end, this might be completely overkill for you. If you’re drawing only drawing hundreds or even a few thousands of tiles per frame your current code will work perfectly fine as it is. If you realize that you need that CPU time for something heavy like physics or simply lots of objects, etc you could optimize it. I’m the guy who tries to make everyone make awesome graphics in their games with (correct) OpenGL, so I guess I’m the one to ask if you have any questions.

絶望した！言葉の壁で絶望した！
[spoiler]I’m in despair! The wall of words has left me in despair![/spoiler]

davedes · June 3, 2012, 4:34pm

theagentd
Say I’m moving toward the programmable pipeline, working on a 2D game. Right now I’m doing matrix calculation by hand on the CPU, but I’d like to move it to GPU to mimic more closely the benefits of glTranslate/glRotate/etc. Am I correct in assuming the best way to do this by passing the view matrix as a vertex attribute? (Passing down per-column.) I’m using a uniform for the ortho projection matrix in the shader since it doesn’t change per vertex.

theagentd · June 3, 2012, 5:38pm

You would not want a vertex attribute for that since that would force you to add the matrix for each VERTEX, which could force you to add 16 floats per vertex. It’s a very good idea to not duplicate data on the GPU. I assume you mean having matrices for each individual tile. Like I said, this is not needed. A matrix never changes per vertex, only per object, so they should almost always be uniforms. [spoiler](“Almost” since you can have matrix attributes if you’re doing instancing, in which case you use an attribute divisor to specify that this attribute is only moved to the next matrix for each instance instead of for each vertex.)[/spoiler]

[quote] - Do not use glTranslatef() when position static data. Just precompute the data and store it on the GPU with a display list or VBO.
[/quote]

marcuiulian13 · June 3, 2012, 7:12pm

Thank you very much theagentd. (incredible but) I really understanded what you explained to me. From now on, be sure that you will be first that i will ask when i will have any major problem. Thank you again.

davedes · June 3, 2012, 8:17pm

Here’s my dilemma: To reduce repetitive draw calls I’m “batching” tiles/sprites/polygons into the same list of triangles, similar to LibGDX’s sprite batch. So I can’t set a view matrix uniform per sprite, without ‘flushing’ the batch each sprite.

Instancing is not really viable since I’m targeting OpenGL 2.0.

How about this: Batch as many sprites as possible until the user requests a new matrix (i.e. rotate/translate/etc), then “flush” (glDrawArrays) and set the new uniform. Basic transformations like trying to draw a quad sprite with a rotation can be done on CPU, to avoid an unnecessary draw call. This seems better than setting a uniform per tile/sprite/poly.

What do you think? Or am I going in the wrong direction?

Probably sounds like premature optimization. I’m doing this mostly out of curiosity and a desire to learn different approaches.

theagentd · June 3, 2012, 9:37pm

davedes:

Here’s my dilemma: To reduce repetitive draw calls I’m “batching” tiles/sprites/polygons into the same list of triangles, similar to LibGDX’s sprite batch. So I can’t set a view matrix uniform per sprite, without ‘flushing’ the batch each sprite.

Instancing is not really viable since I’m targeting OpenGL 2.0.

How about this: Batch as many sprites as possible until the user requests a new matrix (i.e. rotate/translate/etc), then “flush” (glDrawArrays) and set the new uniform. Basic transformations like trying to draw a quad sprite with a rotation can be done on CPU, to avoid an unnecessary draw call. This seems better than setting a uniform per tile/sprite/poly.

What do you think? Or am I going in the wrong direction?

Probably sounds like premature optimization. I’m doing this mostly out of curiosity and a desire to learn different approaches.

This is a really hard problem to solve. If you’re getting enough performance with what you have you’re in luck. It’s really difficult to cram out enough performance out of OpenGL for 2D sprites since we require so fine-grained control over each sprite, so the amount of data we’re processing is huge.

Your idea sounds pretty good, but only if the matrix isn’t changed between every sprite. For example if you’re making a top-down game, 99% of the sprites will be rotated randomly. In this case, we can’t afford doing the rotation with matrices, so we can pretty much have to resort to doing the rotation manually on the CPU. This sounds like a step back from what I just said, but it’s the best solution so far. For this I recommend that you take a look at the source code for Revenge of the Titans, found here, and check out com.shavenpuppy.jglib.sprites.DefaultSpriteRenderer. Cas must have spent lots of time optimizing it, and I’ve pretty much concluded that that is the best algorithm possible. It’s really, really flexible, so you could probably get by with a much simpler version if you want to. I’ve pretty much dissected it completely, so if you have any questions about it just ask. It’s more complicated that it seems. =P

I had an idea of offloading the translation, rotation and scaling to the GPU via shader, but even though that eliminated a sin() and a cos() per sprite, the transformation data needed to be duplicated per sprite. Ouch. The increased memory usage pretty much made me end up at equal or worse performance than Cas’ one, so I dropped that idea.

There is an ultimate solution, which just like instancing requires OpenGL 3. It offloads everything to the GPU with the help of a geometry shader. We just have to give the data needed to construct the sprite as a point to the GPU, and then expand it to a quad in the geometry shader. Since every sprite is sent as a single vertex we won’t need to duplicate the data, and we won’t need to do anything on the CPU except for loading up the buffer. You might have heard that geometry shaders are slow when abused, but this is the main usage of them. Since you’re pretty much always CPU limited in 2D games anyway, offloading stuff to the GPU is the right thing to do.

sproingie · June 4, 2012, 4:03am

Am I the only one who thinks there’s also a theagentf, theagenti, and theagentb?

;D

ra4king · June 4, 2012, 8:48am

Nah there’s only theagentd, he needs as much precision as possible! ;D

theagentd · June 4, 2012, 1:03pm

This is the most epic thing I’ve ever read on this forum. Heck, maybe on the whole Internet. I thank thee for this laugh.

davedes · June 5, 2012, 3:38am

[quote=“theagentd,post:9,topic:38999”]
Last time I tested this it seemed to work well. Any real downsides to choosing this, other than the obvious shape limitation? (no polys) It seems to be exposed through extension even on older cards (MacOSX list), and works on all the computers I’ve tested.

theagentd · June 5, 2012, 6:16am

[quote=“davedes,post:13,topic:38999”]

The only real down side is the hardware requirements and possibly the inflexibility of it. Due to this you pretty much need to custom make your sprite renderer for the requirements of your own game but that’s about it.