What I did today

You guys should put this in another thread, or this whole What I did today topic is basically just a slightly more persistent version of the Discord…

Cas :slight_smile:

[quote=“princec,post:6121,topic:49634”]
That’s fine with me.
[/quote]
Btw.: I implemented some form of “irradiance caching”: a separate compute shader runs continuously, sampling over the hemisphere of every voxel face to gather ambient sky light. The rendering compute shader then just interpolates between the stored irradiances of the four vertices of the face hit by the eye ray. In terms of temporal stability and noise this is far better than pure path tracing, and I will probably keep it. Next up are point/torch light sources, which also scatter light onto nearby voxels that needs to be captured and cached as well.
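To make the interpolation step concrete, here is a minimal Java sketch (all names are hypothetical, not from the actual shader code) of bilinearly blending the four cached per-vertex irradiances of the face hit by the eye ray:

```java
// Hypothetical sketch of the irradiance-cache lookup: given the irradiance
// cached at the four vertices of the voxel face hit by the eye ray,
// bilinearly interpolate at the (u, v) hit coordinates inside the face.
public class IrradianceLookup {
    /** Bilinear interpolation of four per-vertex irradiance values. */
    static float interpolate(float i00, float i10, float i01, float i11,
                             float u, float v) {
        float bottom = i00 * (1.0f - u) + i10 * u; // along the bottom edge
        float top    = i01 * (1.0f - u) + i11 * u; // along the top edge
        return bottom * (1.0f - v) + top * v;      // blend between the edges
    }

    public static void main(String[] args) {
        // Hitting the exact center of the face averages all four samples.
        System.out.println(interpolate(0.0f, 1.0f, 1.0f, 2.0f, 0.5f, 0.5f)); // 1.0
    }
}
```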

6J7dMfPErRQ

(at t=2s, t=50s and t=1m11s the irradiance cache is invalidated)

GyxVPSnDAqE

This is cool as hell Kai. Can you make the voxels smaller for a higher resolution world? If you can, you should outright make some kind of small game with this tech.

@Spasi pointed me to http://www.jcgt.org/published/0007/03/04/ which rasterizes screen-aligned rectangular sprites (using gl_PointSize in the vertex shader to control the size) that encompass each voxel in screen space. (Actually, the rectangle encompasses the sphere that encloses the voxel, so it is always bigger than the actual screen-space bounding rectangle would be - but it is easier to compute.) A ray-box intersection test in the fragment shader then resolves the exact hit. This can be used for primary ray intersection with the scene. So of course I had to implement that as well (the GLSL source from the paper contained a few errors - maybe intentionally).
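For illustration, here is a rough Java sketch of one common conservative point-size computation (not necessarily the paper’s exact formula): bound the voxel by its enclosing sphere, then project that sphere’s diameter to pixels.

```java
// Sketch (assumed formula, not taken from the paper) of a conservative
// gl_PointSize: bound the voxel by its enclosing sphere
// (radius = halfExtent * sqrt(3)), then compute the sphere's projected
// diameter in pixels, which is always at least as large as the voxel's
// screen-space bounding rectangle.
public class PointSpriteSize {
    static float pointSizePixels(float halfExtent, float distance,
                                 float fovYRadians, float viewportHeight) {
        float radius = halfExtent * (float) Math.sqrt(3.0); // enclosing sphere
        // projected radius relative to the half-height of the viewport
        float ndcRadius = radius / (distance * (float) Math.tan(fovYRadians * 0.5));
        // diameter in pixels: ndcRadius * 2 * (viewportHeight / 2)
        return ndcRadius * viewportHeight;
    }

    public static void main(String[] args) {
        // A unit voxel (halfExtent 0.5) 10 units away, 60 degree fov, 1080p:
        System.out.println(pointSizePixels(0.5f, 10.0f,
                (float) Math.toRadians(60.0), 1080.0f)); // ~162 pixels
    }
}
```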

JhKmmSkl7IM

Screenshot of the sprites:

They claim it is faster than pure rasterization of an optimized mesh, though I am sceptical. The inherent problem with their approach is that early-z does not work, because they need to compute and write gl_FragDepth and use discard; in the fragment shader. So a lot of needless ray-box intersection tests happen in the fragment shader. Standard rasterization acceleration techniques still apply to their technique, though, such as frustum culling and, more importantly, occlusion culling. The big advantage of their algorithm then becomes that one no longer needs to build an optimized mesh. Since I already have a BVH tree, I am going to use it for frustum culling and for GPU occlusion culling with conditional rendering as described here: https://developer.nvidia.com/gpugems/GPUGems2/gpugems2_chapter06.html
This should give lightning fast primary ray intersections with even hundreds of millions of voxels.
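The CPU-side plane test such a frustum-culling pass needs can be sketched in a few lines of Java (names are illustrative; this uses the standard “positive vertex” trick, not necessarily the exact code in the demo):

```java
// Sketch of a frustum vs. AABB test for culling a BVH node: a box is
// outside the frustum if it lies entirely behind any one of the six planes.
// "Positive vertex" trick: pick the box corner farthest along the plane
// normal; if even that corner is behind the plane, the whole box is out.
public class FrustumCull {
    /** plane is (a, b, c, d) with the normal pointing into the frustum. */
    static boolean outsidePlane(float[] plane, float[] min, float[] max) {
        float px = plane[0] >= 0 ? max[0] : min[0];
        float py = plane[1] >= 0 ? max[1] : min[1];
        float pz = plane[2] >= 0 ? max[2] : min[2];
        return plane[0] * px + plane[1] * py + plane[2] * pz + plane[3] < 0;
    }

    static boolean inFrustum(float[][] planes, float[] min, float[] max) {
        for (float[] plane : planes)
            if (outsidePlane(plane, min, max)) return false;
        return true; // intersecting or fully inside (conservative)
    }

    public static void main(String[] args) {
        float[][] planes = { { 1, 0, 0, 0 } }; // single plane: keep x >= 0
        System.out.println(inFrustum(planes,
                new float[]{-2, -1, -1}, new float[]{-1, 1, 1})); // false
        System.out.println(inFrustum(planes,
                new float[]{-1, -1, -1}, new float[]{ 1, 1, 1})); // true
    }
}
```

The test is conservative: a node that merely intersects a plane is kept, which is exactly what culling needs.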

EDIT: Just added CPU frustum culling with the BVH tree, which was really trivial. The big advantage of sorting the voxels first (by z-order curve / Morton code) before building the BVH tree is that every BVH node contains a contiguous range of voxels, and, most importantly, so does each of its child/descendant nodes. So each node (including the leaf nodes) only stores a “first voxel” index and a “last voxel” index into the voxel list (which at that moment resides only in the GPU buffer). I can then choose how many clusters to test for frustum culling by selecting the respective BVH tree depth, iterating over all nodes up to that depth, and issuing glDrawArrays(…, first, last-first+1) calls for the visible clusters.
With that I just tried to render 300 million voxels, which also worked, until… I tried to look at all of them at once at which point the app froze and the driver reset. :smiley:
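For reference, a minimal Java sketch of the Morton encoding that makes the sorted-voxel ranges contiguous (this is the standard bit-interleaving trick, not necessarily the exact code used in the demo):

```java
// Sketch of z-order (Morton) encoding: interleave the bits of x, y, z
// (10 bits each here, so coordinates up to 1023) so that spatially nearby
// voxels end up adjacent in the sorted array. A BVH built over that sorted
// array can then store a contiguous [first, last] index range per node.
public class Morton {
    // Spread the lower 10 bits of v apart so there are two zero bits
    // between consecutive bits (classic "Part1By2" bit trick).
    static int part1By2(int v) {
        v &= 0x000003FF;
        v = (v ^ (v << 16)) & 0xFF0000FF;
        v = (v ^ (v <<  8)) & 0x0300F00F;
        v = (v ^ (v <<  4)) & 0x030C30C3;
        v = (v ^ (v <<  2)) & 0x09249249;
        return v;
    }

    static int encode(int x, int y, int z) {
        return (part1By2(z) << 2) | (part1By2(y) << 1) | part1By2(x);
    }

    public static void main(String[] args) {
        System.out.println(encode(1, 0, 0)); // 1
        System.out.println(encode(0, 1, 0)); // 2
        System.out.println(encode(0, 0, 1)); // 4
        System.out.println(encode(3, 3, 3)); // 63 -> a full 2x2x2 block
    }
}
```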

EDIT2: (25.09.) Implemented “tight fit” quad point sprites. This saves a lot of unnecessary ray/AABB intersection tests in the fragment shader:

(the yellow transparent quads are the sprites which generate fragments that are ultimately used to do a ray/aabb intersection test in the fragment shader).

Haven’t looked at the paper yet. Saw this note from IQ: https://twitter.com/iquilezles/status/1042873159605608451

Yes Íñigo, you are right, that paper is riddled with errors… :slight_smile:

A few weeks ago, I added a JavaScript backend to my Commodore BASIC V2 cross compiler (https://github.com/EgonOlsen71/basicv2), so that it can compile not only to native 6502 machine language but to JavaScript as well.
To render the output in the browser, I added a C64-style console output today. It’s far from perfect, but it supports simple PRINT output from the compiled program as well as some control codes for colors and such.
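A hypothetical Java sketch (not the actual basicv2 implementation) of how such a console can dispatch PETSCII control codes embedded in PRINT output — CHR$(147) clears the screen, CHR$(5) selects white text:

```java
// Hypothetical sketch of a C64-style console dispatching PETSCII control
// codes mixed into PRINT output. Only two codes are handled here for
// illustration; a real console would support many more.
public class PetsciiConsole {
    StringBuilder screen = new StringBuilder();
    int color = 14; // light blue, the C64 default text color

    void print(String s) {
        for (char c : s.toCharArray()) {
            switch (c) {
                case 147: screen.setLength(0); break; // CHR$(147): clear screen
                case 5:   color = 1;           break; // CHR$(5): white text
                default:  screen.append(c);           // printable character
            }
        }
    }

    public static void main(String[] args) {
        PetsciiConsole con = new PetsciiConsole();
        con.print("HELLO" + (char) 147 + "WORLD");
        System.out.println(con.screen); // prints WORLD
    }
}
```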

I’ve used it to compile this mandelbrot program:


5 PRINT CHR$(147)
7 TI$="000000"
8 DIMX$(16),Z$(11),V,U,A,B,I,R,H,A$,E
10 X1=38:Y1=22:E=15:V=48:FORX=.TOE+1:X$(X)=CHR$(V-X):NEXT:I1=-1:I2=1:R1=-2:R2=1
30 H=(R2-R1)/X1:S2=(I2-I1)/Y1:YH=Y1/2
60 FORY=.TOYH:I=I1+Y*S2:R=R1:A$="":FORX=.TOX1:V=R:U=I
110 FORN=.TOE:A=V*V:B=U*U:IFA+B<=4THENU=U*V*2+I:V=A-B+R:NEXT
120 A$=A$+X$(N):R=R+H:NEXTX:PRINTA$:Z$(YH-Y)=A$:NEXT
130 FORY=1TOYH:PRINTZ$(Y):NEXT
10000 PRINT"TIME: ";TI$;

into this: https://jpct.de/js/index.html

and this prime number generator:


10 W=500:DIM F(500):P=1:A=3
20 PRINTA:F(P)=A:P=P+1:IF P>W THEN STOP
30 A=A+2:X=1
40 S=A/F(X):IF S=INT(S) GOTO 30
50 X=X+1:IF X<P AND F(X)*F(X)<=A GOTO 40
60 GOTO 20

into this: https://jpct.de/js/index2.html

The purpose of all this is…I’ve no idea, I guess I did it for the lulz.

Played with water today:

EgSzL17-rgU

Kai is out here putting Minecraft to shame with a nonchalant side-project lol

It’s scarily competent and I’m worried it’s even putting Dan’s state-of-the-art Voxoid engine to shame! Surely there’s gotta be a catch somewhere?

Cas :slight_smile:

Although it’s a side project, I’ve invested a rather significant amount of spare time in it so far. However, 80% of that time went into discovering papers/articles/blog posts, reading them, re-reading them, finding more articles and reading those; only the remaining 20% went into implementing a bit of it.
Over the last couple of weeks, computer graphics became a synonym for “reading papers” and trying things out. I’ve accumulated a rather large collection of tiny demos evaluating different techniques and comparing their performance, and the TODO list of techniques applicable to this cube-world rendering that I still need to evaluate is growing rather than shrinking.
Just yesterday I was researching occlusion culling and found this excellent blog: https://interplayoflight.wordpress.com/2017/11/15/experiments-in-gpu-based-occlusion-culling/ which itself contains a dozen more links to articles/presentations to read and watch.

The catch is that I surely will never get this to a playable state. It’s a tech demo to satisfy my need to stay up-to-date and play with CG techniques. So I hugely respect and admire you guys for building and releasing actual games. :slight_smile:

Well, that’s the funny thing: only last night we were lamenting that we’ve spent a whole year (albeit very part-time) building a voxel rendering engine in Java - and it does look absolutely marvellous, make no mistake - but it’s really complex and really compute-intensive on both the GPU and CPU, and at the end of the day we can only just render a 4096x4096x256 voxel scene at 60fps on the latest hardware. Admittedly it’s bump mapped, has arbitrary lighting and shadows, supports transparency, has a particle engine and fancy water too, but it currently suffers from being a static scene - that is, the voxels themselves are not easily modifiable once they’re ready to render.

Cas :slight_smile:

To me it sounds like you don’t actually have much to lament about. I mean, having all those features in your engine is great and seems like a rather huge achievement to me. Also, please note that my little project is not an engine. It is 700 lines of Java code in a single source file with a handful of GLSL shaders, and it is nowhere near being usable by, or useful to, actual games/projects. The world size, camera settings and noise parameters are just magic numbers hardcoded somewhere in a method, so its flexibility is absolutely zero.
Developing an actual engine with a usable interface and cramming in all the features you mentioned is a whole other story.
That’s probably where most of your effort went. You would be just as capable of building a simple small demo in a very short time as I am, if it weren’t for everything an engine usually brings with it: supporting a content creation pipeline, providing tooling/previews for artists, longevity, maintainability, reliability and flexibility.

Today I created a boomerang sprite:

also created shuriken, hammer and potion icons:

Looks like this ingame:

Got back into programming again after having no power all winter. Nearly finished with my terrain rendering.

Well, look at Atomontage - the guy has basically been doing that for at least the last 10 years… so one year is probably not so bad :wink:

Looks good, but make it fly in a curve - it’s a boomerang :stuck_out_tongue: :wink:

I also found this some time ago, after I implemented GPU-based occlusion culling, which I think I shared here (https://www.youtube.com/watch?v=383EKvaU2vE). His blog is a very interesting read :slight_smile:

Nice to hear that your code is sometimes totally hacked together too, even though your projects certainly don’t look like it.

Appreciated :smiley: ! I’ll keep you updated :stuck_out_tongue:

Began adding hierarchical-Z occlusion culling. The first step was creating the Hi-Z mip chain (first 5 levels):
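A minimal CPU-side Java sketch of one such downsample step (the GPU version would do the same per-texel max in a shader; names are illustrative):

```java
// Sketch of one Hi-Z downsample step: each texel of mip level N+1 takes the
// maximum depth of the corresponding 2x2 block in level N. With a standard
// depth convention (larger = farther), max is the conservative choice: an
// occludee is culled only if it is farther than the farthest occluder.
public class HiZDownsample {
    static float[] downsample(float[] depth, int width, int height) {
        int w = width / 2, h = height / 2;
        float[] out = new float[w * h];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                float d00 = depth[(2 * y)     * width + 2 * x];
                float d10 = depth[(2 * y)     * width + 2 * x + 1];
                float d01 = depth[(2 * y + 1) * width + 2 * x];
                float d11 = depth[(2 * y + 1) * width + 2 * x + 1];
                out[y * w + x] = Math.max(Math.max(d00, d10), Math.max(d01, d11));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        float[] mip0 = { 0.1f, 0.2f,
                         0.9f, 0.3f };
        System.out.println(downsample(mip0, 2, 2)[0]); // 0.9
    }
}
```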