This. The texture is both small (1 byte per texel for a total of roughly 2.67 MB) and has extremely good cache locality thanks to mipmaps. I intended to use compression as well to halve the memory usage, but it turns out it's not supported for 3D textures.
This isn't true either. Texture sample dependencies are hardly a bottleneck at all on modern GPUs. They're almost never costly enough to justify working around them, and you won't ever get very deep texture dependencies in the first place. Latency hiding works really well.
Let's assume for a moment that texture sample dependencies were slow. Usually shaders get compiled to something like this:
- sample all textures
- wait for all texture samples to arrive
- execute shader
The point of latency hiding is (as you know) to let the GPU queue up multiple texture sample requests and execute other invocations of the shader during the wait (the second step above). A shader with a texture dependency simply adds another sample-wait-execute round:
- sample all independent textures
- wait for texture samples to arrive
- execute shader as far as we can
- sample all dependent textures
- wait for texture samples to arrive
- execute the rest of the shader
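To put rough numbers on the two step lists above, here is a toy scheduling model in Python. Everything in it is an assumption made for illustration: a single ALU shared by all invocation groups, free sample issue, and made-up costs of 100 cycles per texture wait and 20 cycles of math per shader. Real GPUs schedule very differently, so treat it as a sketch of the idea, not a measurement.

```python
import heapq

def simulate(num_groups, phases):
    """Toy model: return the cycle at which all groups have finished.

    phases is a list of (sample_latency, alu_cycles) pairs, one pair per
    sample-wait-execute round. Waits overlap freely between groups, but
    only one group can use the (single, hypothetical) ALU at a time.
    """
    # Every group issues its first samples at cycle 0.
    ready = [(phases[0][0], g, 0) for g in range(num_groups)]
    heapq.heapify(ready)
    alu_free = 0
    while ready:
        t, g, p = heapq.heappop(ready)      # group whose samples arrived first
        start = max(t, alu_free)            # wait for the ALU if it is busy
        alu_free = start + phases[p][1]     # run this round's math
        if p + 1 < len(phases):
            # Issue the dependent samples; they arrive one latency later.
            heapq.heappush(ready, (alu_free + phases[p + 1][0], g, p + 1))
    return alu_free

# Assumed costs: 100-cycle texture latency, 20 ALU cycles per shader in total.
independent = [(100, 20)]            # sample, wait, execute
dependent = [(100, 10), (100, 10)]   # same math split around a dependent read

# One group in flight: nothing to hide the second wait behind.
print(simulate(1, independent), simulate(1, dependent))    # 120 220
# 32 groups in flight: the second wait overlaps with other groups' math.
print(simulate(32, independent), simulate(32, dependent))  # 740 740
```

In this model the dependent version is almost twice as slow with a single group in flight, but with enough groups to keep the ALU busy both versions finish at the same time.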
It's easy to see that if latency hiding wasn't there, the shader would become roughly half as fast, since the two waits would completely bottleneck it, but this doesn't happen at all. For example, I used dependent texture reads in my single-quad tile renderer, where a tile ID texture was sampled first and the result was used to sample the tilemap texture. Performance wasn't noticeably lower than simply rendering a plain textured fullscreen quad.
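For anyone unfamiliar with the pattern, this is roughly what a dependent read looks like in a tile renderer, written as a CPU-side Python stand-in for the fragment shader. It is not the actual shader: `shade_pixel`, `tile_ids`, `tilemap` and the tiny character "textures" are hypothetical names and data chosen for illustration.

```python
def shade_pixel(x, y, tile_ids, tilemap, tile_size):
    """Emulate one fragment: a first read, then a read that depends on it."""
    # First "sample": which tile covers this pixel?
    tile_id = tile_ids[y // tile_size][x // tile_size]
    # Dependent "sample": the address into the tilemap needs tile_id,
    # so it cannot be issued before the first read has returned.
    return tilemap[tile_id][y % tile_size][x % tile_size]

def render(tile_ids, tilemap, tile_size):
    """Shade every pixel of the 'fullscreen quad' and return rows of text."""
    height = len(tile_ids) * tile_size
    width = len(tile_ids[0]) * tile_size
    return ["".join(shade_pixel(x, y, tile_ids, tilemap, tile_size)
                    for x in range(width))
            for y in range(height)]

# Two hypothetical 2x2 tiles, with characters standing in for texels.
tilemap = {0: ["ab", "cd"], 1: ["12", "34"]}
# A 2x2 grid of tile IDs, standing in for the tile ID texture.
tile_ids = [[0, 1],
            [1, 0]]

for row in render(tile_ids, tilemap, tile_size=2):
    print(row)
# ab12
# cd34
# 12ab
# 34cd
```

The second lookup is the "sample all dependent textures" step from the list: its address is computed from the result of the first one.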