Over the past few days I’ve been researching and implementing algorithms for efficient chunk/voxel generation and rendering, including:
- Iterating chunks from front to back based on distance to camera
- Computing “connectivity” between a chunk’s faces (can a face X see a face Y?) by flood filling starting from one of the chunk’s faces and checking whether the flood reaches any of the other faces (a small sketch follows below this list). This will be used for CPU-side occlusion culling. Article about the algorithm: https://tomcc.github.io/2014/08/31/visibility-1.html
- “Dense” Octree based on “mipmapped” arrays for storing chunks
- Greedy meshing
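Below is a minimal sketch of the flood-fill connectivity computation (not the article’s exact code): one common variant floods from every not-yet-visited empty voxel and records which of the six chunk faces each flood touches; any two faces touched by the same flood can see each other. The SIZE constant and the boolean[][][] "opaque" representation are just assumptions for illustration.

```java
// Sketch of per-face connectivity via flood fill. Assumes a chunk represented
// as a boolean[SIZE][SIZE][SIZE] "opaque" array; SIZE is a placeholder.
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

public class FaceConnectivity {
    static final int SIZE = 16;

    /** Returns a 6-bit mask of the faces (-x,+x,-y,+y,-z,+z) reachable through air from voxel (x,y,z). */
    static int flood(boolean[][][] opaque, boolean[][][] visited, int x, int y, int z) {
        int faces = 0;
        ArrayDeque<int[]> stack = new ArrayDeque<>();
        stack.push(new int[]{x, y, z});
        visited[x][y][z] = true;
        while (!stack.isEmpty()) {
            int[] p = stack.pop();
            int px = p[0], py = p[1], pz = p[2];
            // record which chunk faces this flood touches
            if (px == 0)        faces |= 1;
            if (px == SIZE - 1) faces |= 2;
            if (py == 0)        faces |= 4;
            if (py == SIZE - 1) faces |= 8;
            if (pz == 0)        faces |= 16;
            if (pz == SIZE - 1) faces |= 32;
            // push the six neighbors that are inside the chunk, empty and not yet visited
            int[][] n = {{px-1,py,pz},{px+1,py,pz},{px,py-1,pz},{px,py+1,pz},{px,py,pz-1},{px,py,pz+1}};
            for (int[] q : n) {
                int qx = q[0], qy = q[1], qz = q[2];
                if (qx < 0 || qy < 0 || qz < 0 || qx >= SIZE || qy >= SIZE || qz >= SIZE) continue;
                if (opaque[qx][qy][qz] || visited[qx][qy][qz]) continue;
                visited[qx][qy][qz] = true;
                stack.push(new int[]{qx, qy, qz});
            }
        }
        return faces;
    }

    /** Collects the chunk's face-to-face connectivity as a set of 6-bit masks, one per connected air region. */
    static int[] connectivity(boolean[][][] opaque) {
        boolean[][][] visited = new boolean[SIZE][SIZE][SIZE];
        List<Integer> regions = new ArrayList<>();
        for (int x = 0; x < SIZE; x++)
            for (int y = 0; y < SIZE; y++)
                for (int z = 0; z < SIZE; z++)
                    if (!opaque[x][y][z] && !visited[x][y][z])
                        regions.add(flood(opaque, visited, x, y, z));
        return regions.stream().mapToInt(Integer::intValue).toArray();
    }
}
```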
I’ve read a few articles about chunk/voxel management and rendering, and arrived at the following design:
- Use a “rolling”/sliding 3D array representing the maximum visible world
- Each array item stores a tree of chunks as a dense “array-mipmapped” octree, so no pointers, just array index computations, because we expect all of the octree’s leaf nodes to be filled eventually, either with a non-empty or an empty chunk (see the indexing sketch below this list)
- This octree stores 1 chunk per leaf node
- The octree is used for combined frustum and occlusion culling as well as determining the next chunk to generate
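To make the “array-mipmapped” octree of the second item concrete, here is a minimal indexing sketch under the assumption of a fixed number of levels with one flat array per level; the LEVELS constant and the int payload are placeholders. Level L is a dense (2^L)^3 grid, so a node is addressed purely by arithmetic, and parent/child indices follow from halving/doubling the coordinates.

```java
// Sketch of dense, pointer-free octree addressing: one flat array per level,
// level L holds (1<<L)^3 entries; parents/children are found by coordinate arithmetic.
public class DenseOctree {
    static final int LEVELS = 5;           // assumption: 5 levels -> 16^3 = 4096 leaf chunks
    final int[][] nodes = new int[LEVELS][];

    public DenseOctree() {
        for (int level = 0; level < LEVELS; level++) {
            int res = 1 << level;          // resolution of this level per axis
            nodes[level] = new int[res * res * res];
        }
    }

    /** Flat index of the node at (x,y,z) on the given level. */
    static int index(int level, int x, int y, int z) {
        int res = 1 << level;
        return (z * res + y) * res + x;
    }

    /** Parent of (x,y,z) on 'level' sits at (x>>1, y>>1, z>>1) on level-1. */
    static int parentIndex(int level, int x, int y, int z) {
        return index(level - 1, x >> 1, y >> 1, z >> 1);
    }

    /** First of the eight children of (x,y,z); the others are reached by adding 0/1 to each doubled coordinate. */
    static int firstChildIndex(int level, int x, int y, int z) {
        return index(level + 1, x << 1, y << 1, z << 1);
    }
}
```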
Especially number 4 requires some thought: what we want is a very efficient algorithm to drive chunk generation. It needs to cooperate with CPU-side frustum and occlusion culling in order to avoid generating chunks which we know will not be visible, and it needs to generate chunks in front-to-back order starting from the camera’s position, so that the best potential occluders are generated first.
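Just to illustrate the requirement (this is not the hierarchical traversal I am after, only the simplest possible ordering): keep candidate chunk coordinates in a priority queue keyed by squared distance to the camera and skip candidates that fail the culling tests. The names isInFrustum(), isOccluded() and generate() below are placeholders for the culling and generation described above.

```java
// Illustrative only: front-to-back chunk generation driven by a priority queue.
// isInFrustum() / isOccluded() / generate() stand in for the tests and work described in the text.
import java.util.Comparator;
import java.util.PriorityQueue;

public class GenerationQueue {
    static final class ChunkPos {
        final int x, y, z;
        ChunkPos(int x, int y, int z) { this.x = x; this.y = y; this.z = z; }
    }

    static long sqDist(ChunkPos p, int cx, int cy, int cz) {
        long dx = p.x - cx, dy = p.y - cy, dz = p.z - cz;
        return dx * dx + dy * dy + dz * dz;
    }

    static void generateVisibleChunks(Iterable<ChunkPos> candidates, int camX, int camY, int camZ) {
        PriorityQueue<ChunkPos> queue = new PriorityQueue<>(
                Comparator.comparingLong((ChunkPos p) -> sqDist(p, camX, camY, camZ)));
        for (ChunkPos p : candidates)
            queue.add(p);
        while (!queue.isEmpty()) {
            ChunkPos p = queue.poll();           // nearest first -> best potential occluders first
            if (!isInFrustum(p) || isOccluded(p))
                continue;                        // never generate chunks we know we cannot see
            generate(p);
        }
    }

    // placeholders for the culling/generation described in the text
    static boolean isInFrustum(ChunkPos p) { return true; }
    static boolean isOccluded(ChunkPos p)  { return false; }
    static void generate(ChunkPos p)       { /* schedule terrain generation/meshing here */ }
}
```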
About point 1: The purpose of this rolling array is for the managed chunks to “move” along with the player, so we can always generate chunks around the player. An alternative that has been proposed is a simple hashmap keyed by world position. I’ll go with the rolling array for now.
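For reference, a minimal sketch of what I mean by a rolling/sliding array, with made-up names and sizes: world coordinates are mapped into the fixed-size array with a floor-modulo, so “moving” the window only invalidates the slots whose world coordinate changed.

```java
// Sketch of the rolling/sliding 3D array: a fixed-size grid indexed by wrapped world coordinates.
// SIZE and the Region payload (in the actual design, an octree of chunks) are placeholders.
public class RollingGrid {
    static final int SIZE = 32;                        // assumption: 32 regions per axis of managed world
    final Region[] regions = new Region[SIZE * SIZE * SIZE];
    final int[] worldX = new int[SIZE * SIZE * SIZE];  // which world coordinate currently occupies each slot
    final int[] worldY = new int[SIZE * SIZE * SIZE];
    final int[] worldZ = new int[SIZE * SIZE * SIZE];

    static int wrap(int c) {
        int m = c % SIZE;
        return m < 0 ? m + SIZE : m;                   // floor-mod so negative world coordinates wrap correctly
    }

    static int slot(int cx, int cy, int cz) {
        return (wrap(cz) * SIZE + wrap(cy)) * SIZE + wrap(cx);
    }

    /** Returns the region at world coordinate (cx,cy,cz), or null if the slot was "slid over" and holds stale data. */
    Region get(int cx, int cy, int cz) {
        int s = slot(cx, cy, cz);
        if (regions[s] == null || worldX[s] != cx || worldY[s] != cy || worldZ[s] != cz)
            return null;
        return regions[s];
    }

    void put(int cx, int cy, int cz, Region region) {
        int s = slot(cx, cy, cz);
        regions[s] = region;                           // overwrites whatever the window slid away from
        worldX[s] = cx; worldY[s] = cy; worldZ[s] = cz;
    }

    static class Region { /* octree of chunks, omitted */ }
}
```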
About the octree: We mainly need a data structure with a spatial hierarchy to accelerate frustum and occlusion culling, plus efficient k-nearest-neighbor queries to determine the chunks to generate and render next. We could also just use an array/grid here, because chunks are basically everywhere and k-nearest-neighbor in that case simply means iterating in a spiral around the player. But this is only true for initial chunk generation: once a chunk has been generated (and is possibly empty because it consists only of air), we can and should use hierarchical culling instead.
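As an example of the hierarchical culling this enables, here is a sketch of a recursive descent with JOML’s FrustumIntersection. The pointer-based Node type is only a stand-in for the dense octree above (where children would be found by index arithmetic): if a node’s AABB is completely outside the frustum, the whole subtree is skipped.

```java
// Sketch of hierarchical frustum culling over the per-region octree using JOML.
// The Node layout (explicit children, AABB per node) is a placeholder for the dense octree.
import org.joml.FrustumIntersection;
import org.joml.Matrix4f;
import java.util.ArrayList;
import java.util.List;

public class OctreeCulling {
    static class Node {
        float minX, minY, minZ, maxX, maxY, maxZ;   // world-space AABB of this node
        Node[] children;                            // null for leaf nodes (single chunks)
        boolean empty;                              // true if the subtree contains only air
    }

    static void collectVisible(Node node, FrustumIntersection frustum, List<Node> out) {
        if (node == null || node.empty)
            return;                                  // empty subtrees never contribute geometry
        if (!frustum.testAab(node.minX, node.minY, node.minZ, node.maxX, node.maxY, node.maxZ))
            return;                                  // whole subtree is outside the frustum
        if (node.children == null) {
            out.add(node);                           // visible leaf chunk
            return;
        }
        for (Node child : node.children)
            collectVisible(child, frustum, out);
    }

    static List<Node> visibleChunks(Node root, Matrix4f viewProj) {
        FrustumIntersection frustum = new FrustumIntersection(viewProj);
        List<Node> result = new ArrayList<>();
        collectVisible(root, frustum, result);
        return result;
    }
}
```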
Also, the CPU-based occlusion culling with the “is face X visible from face Y” algorithm is very conservative and only good for an initial estimate of the potentially visible set of chunks, so we would really like to combine it with GPU-side hierarchical Z-buffer occlusion culling/queries.
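For reference, the plain (non-hierarchical) occlusion-query route would look roughly like this with LWJGL; this is only a sketch of the principle, drawBoundingBox() is a placeholder, and a real implementation would batch queries and reuse the previous frame’s results instead of reading back immediately.

```java
// Rough sketch of GPU occlusion queries with LWJGL (not the hierarchical-Z variant),
// just to show the principle; drawBoundingBox() is a placeholder draw call.
import static org.lwjgl.opengl.GL11.*;
import static org.lwjgl.opengl.GL15.*;
import static org.lwjgl.opengl.GL33.GL_ANY_SAMPLES_PASSED;

public class OcclusionQuerySketch {
    static boolean isChunkVisible(int queryId) {
        glColorMask(false, false, false, false);   // only the depth test matters for the proxy geometry
        glDepthMask(false);
        glBeginQuery(GL_ANY_SAMPLES_PASSED, queryId);
        drawBoundingBox();                         // placeholder: render the chunk's AABB
        glEndQuery(GL_ANY_SAMPLES_PASSED);
        glColorMask(true, true, true, true);
        glDepthMask(true);
        // Note: reading the result immediately stalls the pipeline; a real implementation
        // would poll GL_QUERY_RESULT_AVAILABLE and use the previous frame's result instead.
        return glGetQueryObjecti(queryId, GL_QUERY_RESULT) != 0;
    }

    static void drawBoundingBox() { /* issue the AABB draw call here */ }
}
```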
The hierarchy of the rolling array and the contained octrees is also necessary because I haven’t found an efficient way to “slide” an octree along with the player, other than removing and re-inserting chunks.
There’s still a lot to do, such as a potentially-visible-set algorithm for the CPU-side occlusion culling, which goes like this: whenever it is determined that the view enters a chunk through one face and exits through another face of that same chunk, we want to “narrow” the set of chunks visited after it by clamping the current camera frustum with the frustum formed by those chunk faces. A glimpse of this idea is also demonstrated here: https://tomcc.github.io/frustum_clamping.html
EDIT: What this little JavaScript demo doesn’t show, however (because its single frustum is updated immediately after placing a block), is that we don’t have just a single frustum which we narrow and filter chunks by; we actually need multiple such frusta. Imagine the initial view/camera frustum and one completely opaque chunk rendered in front of the camera, blocking the view in the center, while on the left and right side of the view we can still see further into the world. In this case we actually have two frusta by which we filter further visited chunks. Since this can amount to possibly thousands of sub-frusta, an optimal solution to this problem would be an interval tree: we simply compute the left and right end of each frustum along the screen-space X and Y directions by determining the distance of the view frustum’s planes to a chunk. This can either narrow a single interval down or, if the chunk is opaque and culls everything behind it, split one interval into two.
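A tiny 1D sketch of that interval bookkeeping (purely illustrative, with a plain list instead of an interval tree): visible spans along one screen-space axis are kept in a list, and the range covered by an opaque chunk is subtracted, which narrows a span or splits it into two.

```java
// Illustrative 1D version of the screen-space interval idea: keep the currently visible
// spans along one axis and subtract the range covered by an opaque chunk, which can
// narrow a span or split it into two. An interval tree would replace the linear list.
import java.util.ArrayList;
import java.util.List;

public class VisibleIntervals {
    static final class Span {
        float min, max;
        Span(float min, float max) { this.min = min; this.max = max; }
    }

    final List<Span> spans = new ArrayList<>();

    VisibleIntervals(float min, float max) {
        spans.add(new Span(min, max));            // start with the full view frustum extent
    }

    /** Removes [covMin, covMax] (the range hidden by an opaque chunk) from all visible spans. */
    void subtract(float covMin, float covMax) {
        List<Span> result = new ArrayList<>();
        for (Span s : spans) {
            if (covMax <= s.min || covMin >= s.max) {
                result.add(s);                    // no overlap, span unchanged
            } else {
                if (covMin > s.min) result.add(new Span(s.min, covMin)); // left remainder
                if (covMax < s.max) result.add(new Span(covMax, s.max)); // right remainder
                // fully covered spans simply disappear
            }
        }
        spans.clear();
        spans.addAll(result);
    }

    /** A chunk is potentially visible if its projected range overlaps any remaining span. */
    boolean overlapsAny(float min, float max) {
        for (Span s : spans)
            if (max > s.min && min < s.max)
                return true;
        return false;
    }
}
```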
EDIT2: I think I will go with a software rasterization approach for occlusion culling instead.
Here is a simple depth-only rasterizer without vertex attribute interpolation, which I am going to use for CPU-side occlusion culling: https://github.com/LWJGL/lwjgl3-demos/blob/master/src/org/lwjgl/demo/util/Rasterizer.java
It is tailored for my vertex format (unsigned 8-bit vertex positions and 16-bit indices).
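The visibility test on top of such a depth buffer can then stay very simple. The following is not taken from the linked Rasterizer.java but shows the conservative idea, assuming the chunk’s screen-space bounding rectangle and its nearest depth value have already been computed.

```java
// Conservative visibility test against a CPU depth buffer (not part of the linked Rasterizer.java):
// the chunk is treated as visible if any covered pixel of the occluder buffer lies behind the
// chunk's nearest depth value.
public class DepthBufferTest {
    // depth:           depth buffer produced by the software rasterizer, row-major, width*height
    // minX..maxY:      screen-space bounding rectangle of the chunk, already clamped to the buffer
    // chunkNearDepth:  nearest (smallest) depth value of the chunk's bounding box
    static boolean isPotentiallyVisible(float[] depth, int width,
                                        int minX, int minY, int maxX, int maxY,
                                        float chunkNearDepth) {
        for (int y = minY; y <= maxY; y++) {
            int row = y * width;
            for (int x = minX; x <= maxX; x++) {
                if (chunkNearDepth < depth[row + x])
                    return true;   // some covered pixel is farther away than the chunk -> visible
            }
        }
        return false;              // every covered pixel is already closer -> chunk is occluded
    }
}
```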
Here are two images (the left/first shows linear depth rasterized with OpenGL, the right/second shows the same linear depth rasterized with the software rasterizer):
(the difference between the two images is exactly black/zero at the actual rasterized triangles)