Optimisation of continuous terrain engine

StrideColossus · May 3, 2013, 10:55am

I now have a working terrain application, which currently consists of a single terrain mesh, and I’m thinking through the design for extending it to a continuous terrain engine.

The structure of each terrain ‘chunk’ is as follows:

512 x 512 grid of vertices stored in an interleaved VBO.
Each vertex consists of a 3-float vertex position (X-Z grid, height in Y direction), 3-float normal and 2-float texture coordinate.
Index buffer specifying the terrain grid as a single triangle strip, with degenerate triangles at the end of each row.
Single VAO covering the terrain segment.
Vertex and fragment shaders for rendering, fog and lighting.
One or more textures used in the fragment shader to render sand, grass, rock, snow, etc., depending on height or slope.
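To make the layout above concrete, here is a rough sketch of the vertex structure and the strip index generation. The names `TerrainVertex` and `buildStripIndices` are made up for illustration, and the row-major vertex ordering (index = z * n + x) is an assumption; the actual application may order things differently.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Interleaved vertex as described above: 8 floats, 32 bytes in total.
struct TerrainVertex {
    float position[3];  // X, Y (height), Z
    float normal[3];
    float texCoord[2];
};
static_assert(sizeof(TerrainVertex) == 32, "expected a tightly packed 32-byte vertex");

// Builds a single triangle strip covering an n x n vertex grid.
// Assumes vertices are stored row-major, i.e. index = z * n + x.
std::vector<std::uint32_t> buildStripIndices(std::uint32_t n) {
    std::vector<std::uint32_t> indices;
    if (n < 2) {
        return indices;  // a grid needs at least 2 x 2 vertices to form a triangle
    }
    // 2n indices per row of quads, plus 2 stitching indices between rows.
    indices.reserve(static_cast<std::size_t>(n) * n * 2 - 4);
    for (std::uint32_t z = 0; z + 1 < n; ++z) {
        for (std::uint32_t x = 0; x < n; ++x) {
            indices.push_back(z * n + x);        // vertex on the current row
            indices.push_back((z + 1) * n + x);  // vertex directly below it
        }
        if (z + 2 < n) {
            // Degenerate stitch: repeat the last index of this row and the
            // first index of the next row, which yields zero-area triangles
            // and keeps the winding order unchanged.
            indices.push_back((z + 1) * n + (n - 1));
            indices.push_back((z + 1) * n);
        }
    }
    return indices;
}
```

For a 512 x 512 grid this produces 524,284 indices, and since the grid has 262,144 vertices the indices need to be 32-bit rather than 16-bit.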

The application loads and creates terrain chunks on the fly on a background thread and adds them to the scene as needed depending on where the camera is in the ‘world’.

Now this works fine for the relatively trivial case of a single chunk of terrain, but will clearly not scale up to a terrain extending to the ‘horizon’ consisting of (perhaps) around a couple of hundred chunks. The first naive implementation simply uses multiple instances of these chunks, but even with a handful it’s clear that rendering performance will be poor.

So I need to start optimising the design, and before I dive in I would appreciate any thoughts or ideas from anyone who has tried a similar project.

Some obvious optimisations spring to mind:

Collapse the terrain vertex data into ‘compound’ VBOs (or maybe even just one VBO). This would reduce the number of VBO/VAO state changes, at the cost of additional complexity. But might it also reduce rendering performance?
Consider using ‘smaller’ data types for the position, normal and texture coordinates, rather than the default 4-byte float. Is this viable, or even sensible?
Use smaller terrain chunk sizes? 512 x 512 equates to around 8 MB of vertex data per chunk (512 x 512 vertices at 32 bytes each), which I suspect is quite a lot.
Share the index buffer across all terrain chunks.
Create lower LOD terrain chunks for more distant terrain (in addition to rendering LOD) and replace these with higher LODs as required (and discard higher LOD chunks as they become redundant).
Investigate geometry and tessellation shaders as a possible means of generating the X-Z and texture coordinates using the vertex index?
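As a back-of-envelope check on the size question, the sketch below compares the current float layout against one possible packed layout. `PackedVertex` is purely hypothetical: it assumes the X-Z position and texture coordinates can be dropped from the per-vertex data, the height can be quantised to 16 bits, and the normal can be stored as signed normalised bytes. Whether that precision is good enough is something I would have to test.

```cpp
#include <cstddef>
#include <cstdint>

// Current layout: 8 floats of 4 bytes each.
struct FloatVertex {
    float position[3];
    float normal[3];
    float texCoord[2];
};

// Hypothetical packed layout (an assumption, not what the application uses).
struct PackedVertex {
    std::uint16_t height;    // quantised height, rescaled in the vertex shader
    std::uint16_t reserved;  // padding to keep the vertex a multiple of 4 bytes
    std::int8_t normal[4];   // x, y, z as signed normalised bytes, plus padding
};

static_assert(sizeof(FloatVertex) == 32, "float vertex should be 32 bytes");
static_assert(sizeof(PackedVertex) == 8, "packed vertex should be 8 bytes");

// Vertex data size in bytes for one square chunk of gridSize x gridSize vertices.
constexpr std::size_t chunkBytes(std::size_t gridSize, std::size_t vertexBytes) {
    return gridSize * gridSize * vertexBytes;
}

// 512 x 512 float vertices: 8 MiB per chunk, as estimated above.
static_assert(chunkBytes(512, sizeof(FloatVertex)) == 8388608, "8 MiB");
// Same grid with the packed layout: 2 MiB per chunk.
static_assert(chunkBytes(512, sizeof(PackedVertex)) == 2097152, "2 MiB");
// Halving the grid to 256 x 256 with float vertices also gives 2 MiB.
static_assert(chunkBytes(256, sizeof(FloatVertex)) == 2097152, "2 MiB");
```

At 8 MiB per chunk, a couple of hundred chunks would need roughly 1.6 GiB of vertex data alone, which is why the chunk size and the vertex format both look worth revisiting.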

However, the biggest bottleneck seems to me to be the amount of redundant data, namely the X-Z grid positions and texture coordinates, which are the same for every terrain chunk. DirectX supports (AFAIK) the idea of vertex streams, i.e. a mesh can be composed of multiple vertex buffers. In this case we would have one buffer for the static data that is shared by every chunk, and an individual buffer per chunk for the height data. Does OpenGL have an equivalent mechanism? I can’t find it if it does.
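To illustrate the split I have in mind, here is a CPU-side sketch only; it makes no graphics API calls and does not answer the question of how the two streams would be bound. The names `GridCoord`, `gridFromIndex`, `buildSharedStream` and the two byte-count helpers are hypothetical. It assumes a row-major grid of at least 2 x 2 vertices, and that each chunk keeps its own height and normal (4 floats) while the X-Z position and texture coordinates (4 floats) are stored once.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Static per-vertex data that is identical for every chunk.
struct GridCoord {
    float x, z;  // position on the X-Z grid
    float u, v;  // texture coordinates in [0, 1]
};

// Recovers the shared data from a row-major vertex index (requires n >= 2).
// The same arithmetic could in principle run in a shader from the vertex index.
GridCoord gridFromIndex(std::uint32_t index, std::uint32_t n, float spacing) {
    const float col = static_cast<float>(index % n);
    const float row = static_cast<float>(index / n);
    const float last = static_cast<float>(n - 1);
    return GridCoord{col * spacing, row * spacing, col / last, row / last};
}

// Builds the shared stream once; every chunk would reuse it.
std::vector<GridCoord> buildSharedStream(std::uint32_t n, float spacing) {
    std::vector<GridCoord> shared;
    shared.reserve(static_cast<std::size_t>(n) * n);
    for (std::uint32_t i = 0; i < n * n; ++i) {
        shared.push_back(gridFromIndex(i, n, spacing));
    }
    return shared;
}

// Total vertex bytes with everything interleaved per chunk (32 bytes per vertex).
std::size_t interleavedBytes(std::size_t chunks, std::size_t n) {
    return chunks * n * n * 32;
}

// Total vertex bytes with one shared 16-byte stream plus a 16-byte
// per-chunk stream holding the height and the normal.
std::size_t splitBytes(std::size_t chunks, std::size_t n) {
    return n * n * sizeof(GridCoord) + chunks * n * n * 16;
}
```

Under these assumptions the split roughly halves the vertex data: for 200 chunks of 512 x 512 vertices it comes to about 804 MiB instead of 1600 MiB.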

Any opinions, thoughts or suggestions on the above would be much appreciated.

stride