I just had my yearly review and got a very significant raise! I just write acceptance tests with Codeception and owe this site a lot for learning so much more about coding in general from all of you guys… so thanks all!
This graphics abstraction sounds pretty similar in its goals to BGFX (which supports many more backends/platforms and has been thoroughly tested and used in high-profile projects), and considering that LWJGL 3 now has bindings for it, does it still make sense to roll your own? Or are there other reasons for doing so?
I've been working on an idea for an Android/iOS app in the music space that multiple people have already committed to buying, so I'm very excited. This is probably one of my longest-lasting personal projects (no surprise there!).
I haven't touched the iOS side yet, but I started learning Kotlin because I'm using it for the Android app. And oh man, it really is just what a "modern" version of Java should be. Such a fantastic language to work with! I really hope Java gains a type inference system in an upcoming release. I believe I heard rumblings of that happening in JDK 10?
I was aware of BGFX's existence to the extent that I knew that LWJGL had a binding for it, but I wasn't entirely sure what its purpose was just from checking the Javadocs in LWJGL. I've looked it up a bit now and it does indeed seem to have the same purpose as the abstraction I've started on. I can't make any conclusive statements about it; I'll need to look into it more, but it doesn't seem like a perfect match from what I can tell.
So far, the most glaring missing feature is uniform buffers; instead it uses the old uniform system, which performs much worse. For example, with forward+ shading I'll be uploading a large amount of light data to uniform buffers, which is then read in every single rendering shader. With uniform buffers, I can lay out and upload the data much more efficiently, bind the buffer once and leave it bound for the entire render pass. Without them, I'll have to re-upload all the uniforms for each shader that needs them, with one call per variable.
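To illustrate what I mean (block and member names are just examples, not from any particular engine): with a UBO, the light data lives in a single std140 block that every shader declares identically, so one buffer serves them all:

```glsl
// Shared light data, declared the same way in every shader that needs it.
// std140 fixes the memory layout, so the CPU side fills one buffer once.
layout(std140) uniform LightData {
    vec4 lightColors[256];         // rgb = color, a = intensity
    vec4 lightPositionsRadii[256]; // xyz = position, w = radius
    mat4 shadowMatrices[16];
};
```

Bind the buffer once with glBindBufferBase(GL_UNIFORM_BUFFER, bindingPoint, ubo) and it stays valid across shader switches for the whole render pass.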
Secondly, it's unclear to what extent BGFX supports multithreading. It claims that rendering commands are submitted from a dedicated rendering thread (as do I) to improve performance, but this is essentially what the OpenGL driver already does. I can't tell from the research I did whether it's possible to construct efficient command buffers from multiple threads with BGFX. I suspect it does support that, but if it doesn't, that'd obviously hold back performance quite a bit.
The biggest problem is BGFX's entire approach. Its level of abstraction is that of OpenGL, meaning that a lot of the gritty details are hidden. When I started learning Vulkan and trying to find common points between Vulkan and OpenGL, I quickly came to the conclusion that emulating OpenGL on Vulkan is completely pointless. If you try to abstract away gritty details like image layouts, descriptor sets, multithreading and uniform buffers the way OpenGL does, you're setting yourself up for essentially writing a complete OpenGL driver. The problem is that it's going to be a generic OpenGL driver that has to manage everything in suboptimal ways, instead of a dedicated driver for the specific hardware you have — and seriously, you're not going to write a better driver than Nvidia, for example.
The key thing I noticed was that while it's a really bad idea to emulate OpenGL on top of Vulkan, emulating Vulkan on top of OpenGL is actually super easy. Descriptor sets are easy to emulate on OpenGL, and you keep a lot of their benefits. Image layout transitions can simply be ignored by the abstraction (the OpenGL driver will of course handle them for us under the hood). We can emulate command buffers and at least get some benefit by optimizing them on the thread that compiles them. In other words, writing a Vulkan implementation that just delegates to OpenGL is easy; writing an OpenGL implementation that delegates to Vulkan — have fun. Hence, if you expose an API that requires the user to provide all the data needed to run it on Vulkan/DirectX 12/Metal, adding support for anything older like OpenGL is trivial.
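As a sketch of what I mean (all names hypothetical, not any real engine's API): on the GL backend, "binding a descriptor set" can simply decay into a handful of plain GL bind calls, while the Vulkan backend maps the same object to vkCmdBindDescriptorSets():

```
// Hypothetical abstraction: a "descriptor set" is just a recorded list of
// resources. The GL backend replays it as individual bind calls.
class GLDescriptorSet {
    UniformBufferBinding[] ubos; // (bindingPoint, buffer) pairs
    TextureBinding[] textures;   // (unit, texture) pairs

    void bind() {
        for (UniformBufferBinding b : ubos)
            glBindBufferBase(GL_UNIFORM_BUFFER, b.bindingPoint, b.buffer);
        for (TextureBinding t : textures)
            glBindTextureUnit(t.unit, t.texture); // GL 4.5 DSA
    }
}
```

The user still groups resources up front as Vulkan demands, and the GL path just pays a few redundant bind calls — cheap compared to what emulation in the other direction would cost.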
[quote="theagentd,post:5851,topic:49634"]
See #1231.
[quote="theagentd,post:5851,topic:49634"]
The build that ships with LWJGL supports up to 8 threads (the default) submitting draw calls. See the encoder API (bgfx_begin, bgfx_encoder_*).
[quote="theagentd,post:5851,topic:49634"]
Apparently, after MoltenVK, Khronos will be working on Vulkan emulation libraries on top of Direct3D 12 and OpenGL. So, for anyone planning to learn Vulkan seriously, it will eventually be a pretty good investment. If you don't have the time for that and just want robust results quickly, bgfx is a very good choice for targeting GL/D3D12/Metal.
Imagine you want to draw 300 3D models in your game. The vertex and index data are all in shared buffers, and they all use the same shader and the same textures, but a single uniform vec3 holding each model's position offset is changed in between draw calls to place each model at the right position. You also have a large number of uniforms for controlling lighting (essentially arrays of light colors, positions, radii, shadow map matrices, etc.), but these are of course set up only once. You essentially have code that looks like this:
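Something along these lines (names are illustrative):

```
// Lighting uniforms: set up once.
glUseProgram(modelShader);
glUniform4fv(lightColorsLocation, NUM_LIGHTS, lightColors);
glUniform4fv(lightPositionsLocation, NUM_LIGHTS, lightPositionsRadii);
glUniformMatrix4fv(shadowMatricesLocation, NUM_SHADOWS, false, shadowMatrices);

// Per-model position offset: changed between every single draw call.
for (Model model : models) { // 300 models
    glUniform3f(positionOffsetLocation, model.x, model.y, model.z);
    glDrawElements(GL_TRIANGLES, model.indexCount, GL_UNSIGNED_INT, model.indexOffset);
}
```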
This will perform so badly it's not even funny, and it's easy to explain why. GPUs do not have support for "uniform variables"; they source uniform data from buffers, either in RAM or VRAM. This means that the driver will create a uniform buffer layout for us based on the uniforms defined in our shader and place our data in a buffer. Great, no problem. We end up with a buffer that holds the light colors, positions, radii and shadow matrices… and then the position offset. Then we change the position offset in between each draw call. The problem is that the GPU hasn't actually executed those commands yet, so we can't write over our previous uniform data. Because of that, the driver will create a full copy of the entire uniform buffer, including all the light data, giving each draw call its own version of the uniform buffer, with only the position offset actually differing between them. This leads to a lot of hidden memory usage within the driver and horrendous CPU performance (spiking on either glUniform3f() as it allocates a copy each time, or on buffer swapping if a multithreaded driver is used). The fact that all the uniform variables are placed in the same buffer makes it impossible to change a few of them without making an entire copy.
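To put rough numbers on the copying (a back-of-the-envelope sketch — the sizes here are my assumptions, not measurements): with, say, 256 lights plus 16 shadow matrices in the implicit buffer, 300 draws each clone the whole thing just to change 12 bytes:

```java
public class UniformCopyChurn {
    // Assumed sizes: 256 lights * (vec4 color + vec4 position/radius)
    // plus 16 shadow matrices (64 bytes each), plus one vec3 offset.
    static final int LIGHT_DATA_BYTES = 256 * (16 + 16) + 16 * 64; // 9216
    static final int OFFSET_BYTES     = 12;
    static final int BUFFER_BYTES     = LIGHT_DATA_BYTES + OFFSET_BYTES;

    // Bytes the driver copies per frame if every glUniform3f() forces a
    // copy-on-write of the entire implicit uniform buffer.
    static long copiedBytesPerFrame(int drawCalls) {
        return (long) drawCalls * BUFFER_BYTES;
    }

    public static void main(String[] args) {
        System.out.println("Buffer size: " + BUFFER_BYTES + " bytes");
        System.out.println("Copied per frame (300 draws): "
                + copiedBytesPerFrame(300) + " bytes");
        System.out.println("Actual new data: " + (300 * OFFSET_BYTES) + " bytes");
    }
}
```

Almost 2.8 MB of driver-side copying per frame to deliver 3.6 KB of genuinely new data — which matches the kind of hidden memory usage and CPU spikes described above.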
This is what the Nvidia driver does, and the exact problem I ran into working on Voxoid for Cas… except in my case, I already had my massive lighting data in a (manually created, separate) uniform buffer. The only things being copied around were the per-scene/per-camera attributes like fog uniforms and camera matrices, and even then, that copying was enough to completely kill CPU performance. The above example would probably drop below 60 FPS at around ~100 draw calls or fewer.
Sure, bgfx could try to add some heuristics to this whole thing to figure out the update rate of different uniforms and assign them to different uniform buffers that can be swapped individually… aaaand before you know it you have an inefficient as f**k behemoth that uses heuristics for which uniforms to place in which uniform buffers, more heuristics to figure out whether to place each buffer in RAM or VRAM, and even more heuristics to detect arrays of the same size and group them into structs for memory locality. Now the user has to train the heuristics not to bork out and max out your computer's RAM for anything more complex than the most trivial possible use of uniforms, the whole engine crashes when it tries to use more than the maximum number of uniform buffers, etc. You've just created an unholy mess that has much worse performance even for well-written code, due to the overhead of running the heuristics, plus crazy amounts of bugs and spaghetti interaction between seemingly unrelated things. Hmm, what does that remind me of…? Right, OpenGL. None of these are exaggerations either, by the way. Do anything at all unconventional with buffer object mapping/updating on the Nvidia driver and you'll start to see all kinds of cool colors and shapes floating in front of you as the sweet LSD overdose kicks in — the one you took trying to deal with the fact that your buffer object test program performs completely differently depending on which order you test the different methods in, and one order semi-reliably causes a massive memory leak in the driver. But I digress. (Disclaimer: don't do drugs.)
All this just to avoid having the user lay out the data and upload it to a buffer themselves.
EDIT: In the above case, I solved it by completely getting rid of the glUniform3f() call: I placed the data in a per-instance vertex attribute buffer and used the base instance to pick a value for each draw call. That would of course have solved the issue for bgfx as well in this case, but even then, the uniform buffer approach is vastly superior. If each of those models needed a different shader, you'd have to re-run the uniform setters after each shader switch, and OpenGL/bgfx would still need to duplicate the light data for each shader. With uniform buffers, you can just leave the uniform buffer (or the Vulkan descriptor set) bound and it all works out with absolutely no additional cost.
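A sketch of that base-instance trick, with hypothetical names (requires GL 4.2+ for glDrawElementsInstancedBaseInstance):

```
// offsetBuffer holds one vec3 per model, uploaded once per frame.
glBindBuffer(GL_ARRAY_BUFFER, offsetBuffer);
glVertexAttribPointer(OFFSET_ATTRIB, 3, GL_FLOAT, false, 0, 0);
glVertexAttribDivisor(OFFSET_ATTRIB, 1); // advance once per *instance*

for (int i = 0; i < models.length; i++) {
    // instanceCount = 1, baseInstance = i: the divisor makes the attribute
    // fetch element i of offsetBuffer, so no glUniform3f() is needed at all.
    glDrawElementsInstancedBaseInstance(GL_TRIANGLES, models[i].indexCount,
            GL_UNSIGNED_INT, models[i].indexOffset, 1, i);
}
```

Since no uniforms change between draws, the driver never has to clone the implicit uniform buffer.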
EDIT2: An additional benefit of uniform buffers is the ability to do arrays-of-structs instead of structs-of-arrays, which should have better memory locality.
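For example (an std140 sketch with made-up names), the per-light fields can live together in one struct instead of in parallel arrays:

```glsl
struct Light {
    vec4 colorIntensity; // rgb = color, a = intensity
    vec4 positionRadius; // xyz = position, w = radius
};

// All fields of one light sit next to each other in memory, so a shader
// reading every field of light i touches one contiguous region.
layout(std140) uniform LightData {
    Light lights[256];
};
```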
[quote="theagentd,post:5853,topic:49634"]
This is a very specific rendering scenario; it's anything but typical. I'd never expect bgfx to be useful to you, but not everyone does deferred/forward+ shading with hundreds of lights.
[quote="theagentd,post:5853,topic:49634"]
I don't have a reason to dispute that; it sounds reasonable. But it also sounds like something that a clever driver could trivially optimize (does it really have to use a single uniform buffer for everything internally?). Could you post a source that verifies the above?
[quote="theagentd,post:5853,topic:49634"]
Yeah, bgfx doesn't do anything like that internally; it's a fairly simple abstraction over the low-level rendering APIs. I wouldn't be surprised if graphics drivers used such heuristics though (see above).
[quote="theagentd,post:5853,topic:49634"]
Yes, that's what bgfx does for instancing. That's also what I was doing ~10 years ago; back then, UBOs/TBOs were generally a bad idea on all drivers/GPUs, IIRC.
Anyway, bgfx's developer has acknowledged that better support for UBOs is necessary and it's coming soon. There will be support for uniforms that are updated at different granularities (per-frame, per-view, per-draw), with appropriate backend-specific implementations. Also, all of this applies to Linux only; you wouldn't use the OpenGL backend on Windows/macOS (D3D11/12 and Metal, respectively).
[quote="abcdef,post:5856,topic:49634"]
bgfx defaults to Direct3D 11 on Windows, but you can force it to use one of the D3D12/D3D9/GL/GLES backends. Vulkan will also be supported in the future. What I mean is that, given the choice, there's no good reason to prefer OpenGL on Windows over D3D/Vulkan.
It's hard to give you access to all the code, but I can tell you the general principles.
The triangulation for filling the shapes was based on "Sweep-line algorithm for constrained Delaunay triangulation" by V. Domiter and B. Zalik.
The stroke drawing was just a basic LINE_LOOP over the various contours provided.
The drawing of shapes was done using an API similar to the one Go uses, but adapted to be easier to use, in my opinion. You have a shape which can have many closed contours; you can line to somewhere, bézier to somewhere, create circles, ovals, rectangles, rounded rectangles, etc.
This data feeds the algorithm and you get back a list of triangles that uses all of your points (no more are created).
When I get the triangles, I find a bounding box, then calculate texture coordinates for all points, and then use this for filling with either a texture, a single colour or a gradient colour.
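A minimal sketch of that bounding-box UV mapping (my own names; the actual code may differ):

```java
public class BoundsUV {
    // Map each (x, y) point to texture coordinates in [0, 1] by normalizing
    // against the shape's axis-aligned bounding box.
    static float[][] texCoords(float[][] points) {
        float minX = Float.MAX_VALUE, minY = Float.MAX_VALUE;
        float maxX = -Float.MAX_VALUE, maxY = -Float.MAX_VALUE;
        for (float[] p : points) {
            minX = Math.min(minX, p[0]); maxX = Math.max(maxX, p[0]);
            minY = Math.min(minY, p[1]); maxY = Math.max(maxY, p[1]);
        }
        float w = maxX - minX, h = maxY - minY;
        float[][] uv = new float[points.length][2];
        for (int i = 0; i < points.length; i++) {
            uv[i][0] = (points[i][0] - minX) / w;
            uv[i][1] = (points[i][1] - minY) / h;
        }
        return uv;
    }

    public static void main(String[] args) {
        // A 20x40 rectangle: corners map to the corners of UV space.
        float[][] pts = { {10, 20}, {30, 20}, {30, 60}, {10, 60} };
        float[][] uv = texCoords(pts);
        System.out.println(uv[0][0] + "," + uv[0][1]); // prints 0.0,0.0
        System.out.println(uv[2][0] + "," + uv[2][1]); // prints 1.0,1.0
    }
}
```

The same UVs then work whether the fill is a texture sample, a flat colour, or a gradient evaluated from the coordinate.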
I don't have anti-aliasing yet, but I will add it later once I have built all my UI components.
My UI components can be drawn with the vector API.
That's basically it. It looks pretty cool; you can have lots of holes in a shape and it textures perfectly around them.