What I did today

@theagentd you are smart as fuck. I have no clue what most of that meant at all lol.

Thanks, but you can see that as me being bad at explaining too. =P If you have any specific questions, I'd be glad to answer them.

I just had my yearly review and got a very significant raise! I just write acceptance tests with Codeception and owe this site a lot for learning so much more about coding in general from all of you guys... so thanks, all!

I appreciate it! But I need to figure out how to use tiles with a camera in LibGDX first... lol

Today I found out Iā€™m being replaced at work. Not a bad innings - 3 years out of this one.

Anyone need a programmer for a bit?

Cas :slight_smile:

Why the hell are they doing that?

Found 30 minutes to progress a new tutorial on simple 2D weapon movement:

Few quick fixes to a running calculator I made to help with race planning etc:

http://carelesslabs.co.uk/run/

Just shaking things up for the hell of it I think.

Cas :slight_smile:

This graphics abstraction sounds pretty similar in its goals to BGFX (which supports many more backends/platforms and has been thoroughly tested and used in high-profile projects), and considering LWJGL3 now has bindings for it, does it still make sense to roll your own? Or are there other reasons for doing so?

I've been working on an idea for an Android/iOS app in the music space that multiple people have already committed to buying, so I'm very excited. This is probably one of my longest-lasting personal projects (no surprise there!).

I haven't touched the iOS side yet, but I started learning Kotlin because I'm using it for the Android app. And oh man, it really is just what a "modern" version of Java should be. Such a fantastic language to work with! I really hope Java gains a type inference system in an upcoming release. I believe I heard rumblings of that happening in JDK 10?
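For reference, that rumbling became JEP 286, local-variable type inference with `var`, which targeted JDK 10. A minimal sketch of how it reads (my own example, not from the thread):

```java
import java.util.List;

public class VarDemo {
    public static void main(String[] args) {
        // The compiler infers List<Integer> and int from the initializers.
        var numbers = List.of(1, 2, 3);
        var total = 0;
        for (var n : numbers) {
            total += n;
        }
        System.out.println(total); // prints 6
    }
}
```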

I was aware of BGFX's existence to the extent that I knew LWJGL had a binding for it, but I wasn't entirely sure what its purpose was just from checking the Javadocs in LWJGL. I've looked it up a bit now and it does indeed seem to have the same purpose as the abstraction I've started on. I can't make any conclusive statements about it; I'll need to look into it more, but it doesn't seem like a perfect match from what I can tell.

So far, the most glaring missing feature is uniform buffers; it instead uses the old uniform system, which performs much worse. For example, with forward+ shading I'll be uploading a large amount of light data that is then read by every single rendering shader. With uniform buffers, I can lay out and upload the data much more efficiently, bind it once, and leave it bound for the entire render pass. Without them, I'll have to re-upload all the uniforms for each shader that needs them, with one call per variable.
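To make "lay out the data much more efficiently" concrete, here is a rough sketch (my own illustrative code, not from any engine) of hand-packing a vec3 array under std140 rules, where each array element occupies a 16-byte stride. The commented-out LWJGL calls and the `LIGHTS_BINDING` name are hypothetical placeholders showing where the one-time upload and bind would go:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class Std140Packer {
    // Packs an array of vec3s with std140 rules: each array element
    // is padded out to a 16-byte stride.
    public static ByteBuffer packVec3Array(float[] xyz) {
        int count = xyz.length / 3;
        ByteBuffer buf = ByteBuffer.allocateDirect(count * 16)
                                   .order(ByteOrder.nativeOrder());
        for (int i = 0; i < count; i++) {
            buf.putFloat(xyz[i * 3]);
            buf.putFloat(xyz[i * 3 + 1]);
            buf.putFloat(xyz[i * 3 + 2]);
            buf.putFloat(0f); // 4 bytes of padding to reach the 16-byte stride
        }
        buf.flip();
        // Upload once and leave bound for the whole pass, e.g. with LWJGL:
        //   glBindBuffer(GL_UNIFORM_BUFFER, ubo);
        //   glBufferData(GL_UNIFORM_BUFFER, buf, GL_DYNAMIC_DRAW);
        //   glBindBufferBase(GL_UNIFORM_BUFFER, LIGHTS_BINDING, ubo);
        return buf;
    }
}
```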

Secondly, it's unclear to what extent BGFX supports multithreading. It claims that rendering commands are submitted from a dedicated rendering thread (which I do as well) to improve performance, but this is essentially what the OpenGL driver already does. I can't tell from the research I did whether it's possible to construct efficient command buffers from multiple threads with BGFX. I suspect it is, but if not, that'd obviously hold back performance quite a bit.

The biggest problem is just the entire approach that BGFX has. The level of abstraction is at the level of OpenGL, meaning that a lot of the gritty details are hidden. When I started learning Vulkan and trying to find common points between Vulkan and OpenGL, I quickly came to the conclusion that emulating OpenGL on Vulkan is completely pointless. If you try to abstract away the gritty details like image layouts, descriptor sets, multithreading, uniform buffers, etc. like OpenGL does, you're setting yourself up for essentially writing a complete OpenGL driver. The problem is that it's going to be a generic OpenGL driver that has to manage everything in suboptimal ways, instead of an actual dedicated driver for the specific hardware you have, and seriously, you're not going to be able to write a better driver than Nvidia, for example.

The key thing I noticed was that it's a really bad idea to emulate OpenGL on top of Vulkan, but emulating Vulkan on top of OpenGL is actually super easy. Descriptor sets can be emulated on OpenGL very easily while keeping a lot of their benefits. Image layout transitions can just be completely ignored by the abstraction (the OpenGL driver will of course handle that for us under the hood). We can emulate command buffers and at least get some benefit by optimizing them on the thread that compiles them. In other words, writing a Vulkan implementation that just delegates to OpenGL is easy; writing an OpenGL implementation that delegates to Vulkan, oh, have fun. Hence, if you expose an API that requires the user to provide all the data needed to run it on Vulkan/DirectX12/Metal, adding support for anything older like OpenGL is trivial.
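As a sketch of what emulating descriptor sets on OpenGL could look like (purely illustrative; the class and method names are mine, and a real version would issue the GL calls directly instead of returning them as strings):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// A "descriptor set" emulated on OpenGL: an immutable-ish snapshot of
// binding-point -> buffer-object assignments. "Binding" the set just
// replays one glBindBufferBase per entry, which is cheap.
public class GLDescriptorSet {
    private final Map<Integer, Integer> uniformBuffers = new TreeMap<>();

    public GLDescriptorSet bindUniformBuffer(int bindingPoint, int bufferId) {
        uniformBuffers.put(bindingPoint, bufferId);
        return this;
    }

    // Returns the GL calls this set would issue; a real implementation
    // would call glBindBufferBase(GL_UNIFORM_BUFFER, binding, buffer).
    public List<String> apply() {
        List<String> calls = new ArrayList<>();
        uniformBuffers.forEach((binding, buffer) ->
            calls.add("glBindBufferBase(GL_UNIFORM_BUFFER, " + binding + ", " + buffer + ")"));
        return calls;
    }
}
```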

[quote="theagentd,post:5851,topic:49634"]
See #1231.

[quote="theagentd,post:5851,topic:49634"]
The build that ships with LWJGL supports up to 8 threads (the default) submitting draw calls. See the encoder API (bgfx_begin, bgfx_encoder_*).

[quote="theagentd,post:5851,topic:49634"]
Apparently, after MoltenVK, Khronos will be working on Vulkan emulation libraries on top of Direct3D 12 and OpenGL. So, if anyone's planning to learn Vulkan seriously, it will eventually be a pretty good investment. If you don't have the time for that and just want to get robust results quickly, bgfx is a very good choice for targeting GL/D3D12/Metal.

Imagine you want to draw 300 3D models in your game. The vertex and index data are all in shared buffers, and they all use the same shader and the same textures, but a single uniform vec3 holding each model's position offset is changed between draw calls to place each model at the right position. You also have a large number of uniforms for controlling lighting (essentially arrays of light colors, positions, radii, shadow map matrices, etc.), but these are of course set up once. You essentially have code that looks like this:


//Upload lighting data
glUniform3fv(lightColorsLocation, 100, ...);
glUniform3fv(lightPositionsLocation, 100, ...);
glUniform1fv(lightRadiiLocation, 100, ...);
glUniformMatrix4fv(shadowMatricesLocation, 100, ...);

for(Model m : models){
    glUniform3f(positionOffsetLocation, m.getX(), m.getY(), m.getZ());
    glDrawElementsBaseVertex(...);
}

This will perform so badly it's not even funny, and it's easy to explain why. GPUs do not have support for "uniform variables". They source uniform data from buffers, either in RAM or VRAM. This means that the driver will create a uniform buffer layout for us based on the uniforms defined in our shader and place our data in a buffer. Great, no problem. We end up with a buffer that has the light colors, positions, radii and shadow matrices... and then the position offset. Then we change the position offset between each draw call. The problem is that the GPU hasn't actually executed those commands yet, so we can't write over our previous uniform data. Because of that, the driver will create a full copy of the entire uniform buffer, including all the light data, giving each draw call its own version of the uniform buffer with only the position offset actually differing between them. This leads to a lot of hidden memory usage within the driver and horrendous CPU performance (spiking either on glUniform3f() as it allocates a copy each time, or on buffer swapping if a multithreaded driver is used). The fact that all the uniform variables are placed in the same buffer makes it impossible to change a few of them without making an entire copy.

This is what the Nvidia driver does, and the exact problem I hit working on Voxoid for Cas... except in my case, I already had my massive lighting data in a (manually created, separate) uniform buffer. The only things being copied around were the per-scene/per-camera attributes like fog uniforms and camera matrices, and even then, that copying was enough to completely kill CPU performance. The above example would probably drop below 60 FPS at around ~100 draw calls or fewer.

Sure, bgfx could try to add some heuristics to this whole thing to figure out the update rate of different uniforms and assign them to different uniform buffers that can be swapped individually... aaaaand before you know it you have an inefficient-as-f**k behemoth that uses heuristics for which uniforms to place in which uniform buffers, more heuristics to figure out whether to place the buffer in RAM or VRAM, and even more heuristics to detect arrays of the same size and group them into structs for memory locality. Now the user has to train the heuristics not to bork out and max out the RAM of your computer for anything more complex than the most trivial possible use of uniforms, the whole engine crashes when it tries to use more than the maximum number of uniform buffers, etc. You've just created an unholy mess that has much worse performance, due to the overhead of running the heuristics calculations even for well-written code, plus crazy amounts of bugs and spaghetti interaction between seemingly unrelated things. Hmm, what does that remind me of...? Right, OpenGL. None of these are exaggerations either, by the way. Do anything at all unconventional with buffer object mapping/updating on the Nvidia driver and you'll start to see all kinds of cool colors and shapes floating in front of you as the sweet LSD overdose kicks in, the one you took trying to deal with the fact that your buffer object test program performs completely differently depending on which order you test the different methods in, and one order semi-reliably causes a massive memory leak in the driver. But I digress. (Disclaimer: don't do drugs.)

All this just to avoid having the user lay out the data and upload it to a buffer themselves.

EDIT: In the above case, I solved it by completely getting rid of the glUniform3f() call, placing the data in a per-instance vertex attribute buffer and using the base instance to pick a value for each draw call. This would of course have solved the issue for bgfx as well in this case, but even then, the uniform buffer approach is vastly superior. If each of those models needed a different shader, you'd have to re-call the uniform setters after each shader switch, and OpenGL/bgfx would still need to duplicate the light data for each shader. With uniform buffers, you could just leave the uniform buffer (or the Vulkan descriptor set) bound and it all works out with absolutely no additional cost.
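A hedged sketch of that per-instance-attribute approach (illustrative names of my own; the actual GL calls are only shown in comments): pack one vec3 offset per instance into a buffer, set the attribute divisor to 1, and let the base instance select the right element per draw.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

public class InstanceOffsets {
    // Builds the per-instance vec3 offset stream: one {x, y, z} per model,
    // packed tightly. positions[i] = {x, y, z} of model i.
    public static FloatBuffer pack(float[][] positions) {
        FloatBuffer buf = ByteBuffer.allocateDirect(positions.length * 3 * 4)
                                    .order(ByteOrder.nativeOrder())
                                    .asFloatBuffer();
        for (float[] p : positions) {
            buf.put(p[0]).put(p[1]).put(p[2]);
        }
        buf.flip();
        // Upload once to a VBO, then configure the attribute so it advances
        // once per instance rather than once per vertex:
        //   glVertexAttribDivisor(offsetAttribLocation, 1);
        // and select the element per draw via the base instance, e.g.:
        //   glDrawElementsInstancedBaseVertexBaseInstance(..., baseInstance);
        // No glUniform3f() per model, so the driver copies nothing.
        return buf;
    }
}
```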

EDIT2: An additional benefit of uniform buffers is the ability to do arrays-of-structs instead of structs-of-arrays, which should have better memory locality.

EDIT3:

...And that's how NNGINE saved Christmas lol

[quote="theagentd,post:5853,topic:49634"]
This is a very specific rendering scenario; it's anything but typical. I'd never expect bgfx to be useful to you, but not everyone does deferred/forward+ shading with hundreds of lights.

[quote="theagentd,post:5853,topic:49634"]
I don't have a reason to dispute that; it sounds reasonable. But it also sounds like something a clever driver would trivially optimize (does it really have to use a single uniform buffer for everything internally?). Could you post a source that verifies the above?

[quote="theagentd,post:5853,topic:49634"]
Yeah, bgfx doesn't do anything like that internally; it's a fairly simple abstraction over the low-level rendering APIs. I wouldn't be surprised if graphics drivers used such heuristics though (see above).

[quote="theagentd,post:5853,topic:49634"]
Yes, that's what bgfx does for instancing. That's also what I was doing ~10 years ago; back then, UBOs/TBOs were generally a bad idea on all drivers/GPUs, IIRC.

Anyway, bgfx's developer has acknowledged that better support for UBOs is necessary and it's coming soon. There will be support for uniforms that are updated at different granularities (per-frame, per-view, per-draw), with the appropriate backend-specific implementations. Also, this all applies to Linux only; you wouldn't use the OpenGL backend on Windows/macOS (D3D11/12 and Metal respectively).

What do you mean, you wouldn't use the OpenGL backend on Windows?

[quote="abcdef,post:5856,topic:49634"]
bgfx defaults to using Direct3D 11 on Windows, but you can force it to use one of the D3D12/D3D9/GL/GLES backends. Vulkan will also be supported in the future. What I mean is that, given the choice, there's no good reason to prefer OpenGL on Windows over D3D/Vulkan.

I'd be very interested if you'd like to share that code! :slight_smile:

I rewrote my 'asset-pipeline' configuration. I used to have to write nasty JSON files by hand; now I use JSR-223 scripting.

This allows me to write simple things like this:

def hexOrigin(sprite){ sprite.setOrigin(sprite.width / 2, sprite.width / Math.sqrt(3))}

createTile("tiles/dirt"){
    sprite = createSprite("tiles/dirt-sprite", file("Dirt.png")){ hexOrigin(this) }.tag
}
createTile("tiles/dirt_full"){
    sprite = createSprite("tiles/dirt_full-sprite", file("Dirt_Full.png")){ hexOrigin(this) }.tag
}
createTile("tiles/grass"){
    sprite = createSprite("tiles/grass-sprite", file("Grass.png")){ hexOrigin(this) }.tag
}
createTile("tiles/grass_full"){
    sprite = createSprite("tiles/grass_full-sprite", file("Grass_Full.png")){ hexOrigin(this) }.tag
}

and automatically convert them into 'my old format':

[...]"tiles/dirt": {
		"class": "Tile",
		"sprite": "tiles/dirt-sprite"
	},
	"tiles/dirt-sprite": {
		"class": "Sprite",
		"width": 65,
		"height": 89,
		"originX": 32.5,
		"originY": 37.52776749732568,
		"ingameWidth": 1,
		"file": "tiles\\Dirt.png"
	},[...]

Everything with a neat Gradle plugin, dependency/reference checking, error messages when some constraints don't hold, and so on.
-ClaasJG

It's hard to give you access to all the code, but I can tell you the general principles:

  1. The triangulation for filling the shapes was based on "Sweep-line algorithm for constrained Delaunay triangulation" by V. Domiter and B. Zalik.
  2. The stroke drawing was just a basic LINE_LOOP over the various contours provided.
  3. The drawing of shapes was done using a similar API to existing ones, but adapted to be easier to use, in my opinion. You have a shape which can have many closed contours; you can line to somewhere, bezier to somewhere, create circles, ovals, rectangles, rounded rectangles, etc.
  4. This data feeds the algorithm and you get back a list of triangles that uses all your points (no more are created).
  5. When I get the triangles, I find a bounding box, then calculate texture coordinates for all the points and use these for filling with either a texture, a single colour, or a gradient.
  6. I don't have anti-aliasing yet, but will add it later once I have built all my UI components.
  7. My UI components can be drawn with the vector API.

That's basically it. It looks pretty cool, and you can have lots of holes in a shape and it textures perfectly around them.
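If I read step 5 right, the bounding-box texture mapping can be sketched like this (my own illustrative code with made-up names, not the poster's):

```java
public class BoundsUV {
    // Computes the bounding box of the triangulated points and maps each
    // point into [0,1] texture space, so a texture/gradient stretches
    // across the whole shape regardless of any holes in it.
    // Returns interleaved {u0, v0, u1, v1, ...}.
    public static float[] uvs(float[] xs, float[] ys) {
        float minX = Float.POSITIVE_INFINITY, maxX = Float.NEGATIVE_INFINITY;
        float minY = Float.POSITIVE_INFINITY, maxY = Float.NEGATIVE_INFINITY;
        for (int i = 0; i < xs.length; i++) {
            minX = Math.min(minX, xs[i]); maxX = Math.max(maxX, xs[i]);
            minY = Math.min(minY, ys[i]); maxY = Math.max(maxY, ys[i]);
        }
        float w = maxX - minX, h = maxY - minY;
        float[] uv = new float[xs.length * 2];
        for (int i = 0; i < xs.length; i++) {
            // Guard against degenerate (zero-area) bounding boxes.
            uv[i * 2]     = w == 0 ? 0 : (xs[i] - minX) / w;
            uv[i * 2 + 1] = h == 0 ? 0 : (ys[i] - minY) / h;
        }
        return uv;
    }
}
```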