One generally does it for comparison, theadgentd; AFAIK there is no other way to get an exact rendering.
I decided to write a small test program to show off the visibility functions of AOIT and FOMOIT compared to a perfect visibility curve (= AOIT with infinite number of samples).
Gray: perfect visibility curve reference
Red: FOMOIT, 15 coefficients (what fits in 8xRGBA16F textures), 64 bytes per pixel
Green: AOIT, 16 nodes (packed into 4xRGBA32UI textures), 64 bytes per pixel
Note how FOMOIT actually follows the curve very closely. We do see some ringing at the sharp dip near the middle of the curve, but the biggest issue is that if the depth range of the scene is increased, the curve gets stretched to cover the whole depth range, so the errors get bigger.
It's clear that AOIT has removed a large number of nodes and is underestimating the visibility slightly in many cases, but the overall quality of the curve is very high, and it's possible to increase the number of nodes all the way up to 32. AOIT isn't actually completely order-independent, as nodes are merged and removed in the order they're rendered, so the exact shape of the curve DOES depend on the order the geometry is rendered in. In particular, geometry sorted perfectly from front to back seems to suffer from some big banding and precision issues at only 16 samples. Random order generally works best here, which could be a good thing to keep in mind.
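To make the comparison a bit more concrete, here is a minimal Java sketch of the two ideas: the exact visibility curve (product of (1 - alpha) over all nearer fragments) and a fixed-size step curve that drops the node with the smallest area error when it overflows. This is a simplified CPU illustration under my own assumptions, not the shader code from the post; the class and method names are hypothetical, and the merge rule is only one plausible AOIT-style choice. With this rule the compressed curve can only end up below the exact one, which matches the slight underestimation described above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: exact visibility vs. a fixed-size, AOIT-style step curve. */
class VisibilityCurveSketch {

    /** Exact transmittance in front of depth z: product of (1 - alpha) over all nearer fragments. */
    static double exactVisibility(double[] depths, double[] alphas, double z) {
        double t = 1.0;
        for (int i = 0; i < depths.length; i++) {
            if (depths[i] < z) {
                t *= 1.0 - alphas[i];
            }
        }
        return t;
    }

    /** Step function with at most maxNodes (>= 1) steps; trans.get(i) is the transmittance behind node i. */
    static final class CompressedCurve {
        private final int maxNodes;
        private final List<Double> depth = new ArrayList<>();
        private final List<Double> trans = new ArrayList<>();

        CompressedCurve(int maxNodes) {
            this.maxNodes = maxNodes;
        }

        /** Inserts one fragment in depth order and attenuates every node behind it. */
        void insert(double z, double alpha) {
            int pos = 0;
            while (pos < depth.size() && depth.get(pos) <= z) {
                pos++;
            }
            double before = (pos == 0) ? 1.0 : trans.get(pos - 1);
            depth.add(pos, z);
            trans.add(pos, before * (1.0 - alpha));
            for (int i = pos + 1; i < trans.size(); i++) {
                trans.set(i, trans.get(i) * (1.0 - alpha));
            }
            if (depth.size() > maxNodes) {
                removeCheapestNode();
            }
        }

        /** Merges node i into node i-1 where that changes the area under the curve the least. */
        private void removeCheapestNode() {
            int best = 1;
            double bestArea = Double.MAX_VALUE;
            for (int i = 1; i < depth.size(); i++) {
                double area = (depth.get(i) - depth.get(i - 1)) * (trans.get(i - 1) - trans.get(i));
                if (area < bestArea) {
                    bestArea = area;
                    best = i;
                }
            }
            // Keep the nearer depth but the lower transmittance: the drop moves forward,
            // so visibility is underestimated between the two merged nodes.
            depth.remove(best);
            trans.remove(best - 1);
        }

        /** Approximate transmittance in front of depth z. */
        double visibility(double z) {
            double t = 1.0;
            for (int i = 0; i < depth.size() && depth.get(i) < z; i++) {
                t = trans.get(i);
            }
            return t;
        }
    }

    public static void main(String[] args) {
        double[] depths = {0.5, 0.2, 0.8, 0.35}; // made-up fragments, unsorted on purpose
        double[] alphas = {0.5, 0.25, 0.5, 0.1};
        CompressedCurve curve = new CompressedCurve(2);
        for (int i = 0; i < depths.length; i++) {
            curve.insert(depths[i], alphas[i]);
        }
        for (double z = 0.1; z < 1.0; z += 0.2) {
            System.out.println(z + ": exact=" + exactVisibility(depths, alphas, z)
                    + " approx=" + curve.visibility(z));
        }
    }
}
```

Feeding the same fragments in a different order can leave different nodes in the compressed curve, which is the order dependence mentioned above; the exact curve does not care about order.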
Performance:
FOMOIT: 42ms
AOIT: 31ms
AOIT's advantage in performance comes from bandwidth reductions. FOMOIT first writes to all 8 textures (64 bytes per fragment) for every single particle fragment to construct the visibility curve, then in the second pass the entire curve needs to be read back from the textures for each particle fragment again. With so much data, the texture cache is thrashed pretty hard, so pretty much all of this data is reread for each fragment. Add a massive amount of overdraw and you get an insane amount of bandwidth used. AOIT's first pass only writes 12 bytes of data (alpha+offset, then update linked list texture), but also does atomic operations. Still, the linked list writing is around 5x faster than writing 64 bytes per pixel. The linked lists are only traversed once per pixel, not once per fragment (pixel*overdraw), so traversing the list and building the AOIT visibility curve is pretty fast too. The WBOIT pass at the end is almost identical performance-wise for both AOIT and FOMOIT as they read the same amount of data and do very little math.
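A rough back-of-the-envelope version of that bandwidth argument, as a small Java sketch. The 64-byte and 12-byte figures come from the post; the overdraw value and the resolution in `main` are made-up example inputs, the names are hypothetical, and the FOMOIT number assumes no cache hits at all, which is a simplification.

```java
/** Hypothetical sketch: per-pixel memory traffic for the construction passes. */
class OitBandwidthSketch {
    static final int FOMOIT_CURVE_BYTES = 64;    // 8 x RGBA16F, from the post
    static final int AOIT_FIRST_PASS_BYTES = 12; // alpha + offset + list head update, from the post

    /** FOMOIT: every fragment writes the full curve, then reads it back in the second pass. */
    static long fomoitBytesPerPixel(int overdraw) {
        return (long) overdraw * (FOMOIT_CURVE_BYTES + FOMOIT_CURVE_BYTES);
    }

    /** AOIT: every fragment only appends a small linked-list entry in the first pass. */
    static long aoitFirstPassBytesPerPixel(int overdraw) {
        return (long) overdraw * AOIT_FIRST_PASS_BYTES;
    }

    public static void main(String[] args) {
        int overdraw = 50;          // assumed example value
        long pixels = 1920L * 1080; // assumed example resolution
        System.out.println("FOMOIT bytes/frame: " + pixels * fomoitBytesPerPixel(overdraw));
        System.out.println("AOIT first-pass bytes/frame: " + pixels * aoitFirstPassBytesPerPixel(overdraw));
        System.out.println("Write size ratio: " + (double) FOMOIT_CURVE_BYTES / AOIT_FIRST_PASS_BYTES);
    }
}
```

The raw write-size ratio is 64/12, a bit over 5, which is in the same ballpark as the "around 5x" figure, although measured speed does not have to track byte counts exactly (atomics, caching and so on).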
Now, time to implement AOIT in a compute shader! =D
I test my algorithms on particles mainly, so I would simply sort all particles on the CPU.
Is it possible to take a look to your code?
Hah! Just managed to port AOIT to a single-pass compute shader. In worst-case scenarios it's almost twice as fast!!! So… No linked lists = bounded memory usage, no temporary textures so no memory overhead whatsoever for the entire algorithm, higher depth precision and higher performance, AND OGL3 support if ported to a fragment shader since it doesn't use any compute shader specific features! If anyone's sitting on an old OGL3 GPU, please contact me so I can have you test it!!!
I will try to release a couple of things pretty soon. Hopefully within a few days. x___x
EDIT: Ported the shader to a fragment shader. Had to emulate some GLSL packing instructions and use a uniform buffer instead of a shader storage buffer. I will probably use texture buffers to not be limited to a certain number of particles in the future. Anyway, the shader runs on OpenGL 3.1 (.1 needed for uniform buffers) and I've confirmed it works correctly on an Intel HD2500, which technically supports OGL4, but not compute shaders, atomic operations or image-load-store. The only catch is that it took a solid 300ms per frame when zoomed in on a cloud of particles… Still decently fast when compared to FOMOIT or stochastic OIT. A fun thing was that uniform buffers turned out to be faster than shader storage buffers by a solid 20% or so.
EDIT 2: It's worth noting that an HD2500 barely manages 100 GFLOPS, while just one of my desktop GTX 770s manages 3500 GFLOPS, so the numbers aren't unreasonable considering how unoptimized the whole thing is.
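On the "emulate some GLSL packing instructions" part of the first EDIT: the post doesn't say which functions had to be emulated, so the following is only a guess at the kind of arithmetic involved, written in Java for illustration. It mirrors the layout of GLSL's `packUnorm2x16`/`unpackUnorm2x16` (two [0, 1] values as 16-bit integers, first component in the low bits) using nothing but clamping, rounding, shifts and masks. The class name is hypothetical.

```java
/** Hypothetical sketch: packUnorm2x16-style packing done with plain integer math. */
class PackSketch {

    /** Packs two values in [0, 1] into one 32-bit word, x in the low 16 bits, y in the high 16 bits. */
    static int packUnorm2x16(float x, float y) {
        int lo = Math.round(clamp01(x) * 65535.0f);
        int hi = Math.round(clamp01(y) * 65535.0f);
        return (hi << 16) | lo;
    }

    /** Reverses packUnorm2x16; each component comes back with at most 16 bits of precision. */
    static float[] unpackUnorm2x16(int bits) {
        int lo = bits & 0xFFFF;
        int hi = bits >>> 16; // unsigned shift, Java ints are signed
        return new float[] {lo / 65535.0f, hi / 65535.0f};
    }

    private static float clamp01(float v) {
        return Math.max(0.0f, Math.min(1.0f, v));
    }

    public static void main(String[] args) {
        int packed = packUnorm2x16(0.25f, 0.75f);
        float[] back = unpackUnorm2x16(packed);
        System.out.println(Integer.toHexString(packed) + " -> " + back[0] + ", " + back[1]);
    }
}
```

The same shifts and masks should translate fairly directly to a shader with unsigned integer support, but which GLSL version and which packing functions were actually involved is not stated in the post.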
As of today, I have officially begun the development of a game (collaboration with two others) with my game engine. Wish me luck guys
That's normal. UBOs are, at worst, as fast as SSBOs, given their properties: read-only, fixed and smaller size, incoherent access and (usually) placement in local memory.
Added a third dimension of depth to the pseudo-3D orthographic thingy. I feel like this could be used in a Super Paper Mario-esque way where the scene appears flat but you can go into "3D" for a while.
Hey Coldstream! Could you provide a wee bit more information about that game engine of yours? Is it 2D, 3D? Pure Java?
@J0: it's a 2D Java game engine built from the ground up, integrating jBox2D as a physics engine.
I had thought about putting a thread in the WIP board about it, but then I saw the stance on engines and thought I'd better wait until I had something with a little more game in it. When I do get around to making a post there I'll go much further into specifics.
working on a strategy game, just with Swing.
My aim is to build a prototype, some mechanics. Almost forgot how easy it is to build a user interface with Swing these days!
Been posting a lot of the orthographic 3D thingy lately but I'm pretty happy with this one. I redid (some of) the cover screenshot from Mind the Gap with the new engine, with a fancy transition between the old view and the new view.
Before and after:
I've really realized how… bad my art was two years ago.
Shit I still have this game in my Downloads file :point:
Do you think you could make the coins rotate?
(Edit: wanted to post something here, realised I'm already the last post, so I'll just add that down here)
People! I think I discovered how to counter being bored with a project! You do know that moment when you're not that much involved in a personal project for a few days before getting back to it, right? (You know it, don't deny that).
Well! I have found the ultimate solution. (drum roll) Have two ongoing projects at once!
No, seriously, it works! For the time being, whenever I don't feel like coding or making the art for Alba (my game), I write my novel! Then as soon as I've had enough of doing that, I find new inspiration for Alba ;D This is effing magic.
J0
I was thinking of doing that, but because of how this isn't true 3D and is just the illusion of depth (stretched quads), rotations other than right angles are really difficult.
I'll try to figure this out and will edit this post with an animation if I manage to get it working.
Kinda working prototype here. The points basically are moving in a diamond shape inside the block's square top, so it's not a perfect circle (although I think spherical interpolation would help more). The quad also flips after the points cross each other, but I'm okay with that.
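For anyone curious what "diamond instead of circle" means in numbers, here is a small Java sketch. It is my own illustration, not the code from the prototype: it assumes the rotating point walks linearly between the four edge midpoints of a unit square top (the diamond) and compares that with a true circular path of the same radius. All names are hypothetical.

```java
/** Hypothetical sketch: a point moving along a diamond vs. along a circle. */
class CoinPathSketch {

    /** Position on the diamond |x| + |y| = 1 for t in turns (t = 1 is a full revolution). */
    static double[] diamondPoint(double t) {
        double s = (t - Math.floor(t)) * 4.0; // 0 <= s < 4, one unit per diamond edge
        int edge = (int) s;
        double f = s - edge;                  // progress along the current edge, 0..1
        switch (edge) {
            case 0:  return new double[] {1.0 - f, f};
            case 1:  return new double[] {-f, 1.0 - f};
            case 2:  return new double[] {f - 1.0, -f};
            default: return new double[] {f, f - 1.0};
        }
    }

    /** Position on the unit circle for the same parameter. */
    static double[] circlePoint(double t) {
        double angle = 2.0 * Math.PI * t;
        return new double[] {Math.cos(angle), Math.sin(angle)};
    }

    public static void main(String[] args) {
        for (int i = 0; i < 8; i++) {
            double t = i / 8.0;
            double[] d = diamondPoint(t);
            double[] c = circlePoint(t);
            System.out.println(t + ": diamond radius=" + Math.hypot(d[0], d[1])
                    + " circle radius=" + Math.hypot(c[0], c[1]));
        }
    }
}
```

Halfway along each diamond edge the point is only sqrt(0.5) of the way out from the center, so a coin driven by the diamond path appears to shrink and speed up between the corners. Interpolating the angle instead of the position (the circle variant) would keep the radius constant.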
i updated my interwebsite to use html-5.
@basil_ The pictures up there all look gorgeous. What did you use to put together the GUI in the pictures under /lws_shots?
link?
it's http://memleaks.net/ … but the html5 part is more ironic.
@boxsmith : thanks! itās plain swing with a modded https://github.com/Insubstantial look-and-feel using lwjgl 2.9.x and a gl-canvas.
I thought it was an error