God rays

I OMGed. I came. I completely blew my own mind. I AM AWESOME.



That’s right! I just implemented real time ACCURATE god rays!!!

I was reading around on the internet about this effect, and many people seem to use a rough approximation based on a radial blur. It looks good in the most important cases (for example in Crysis when you’re looking at the sun through palm leaves), but completely fails for overlapping geometry. Reading up on the technique I realized they were taking LOTS of samples to do the radial blur. So I just thought “why not use those samples for ray tracing instead?”. So I did. The image above is generated in realtime, using 64 samples, but 32 samples show only very limited banding and are probably the best option for performance.

This is a post-processing algorithm, but performance-wise it doesn’t run completely independently of the scene. First I create a shadow map for the scene. Then I render the scene normally with lighting and shadows. Finally, for each pixel I ray trace through the shadow map to find out how much of the view ray is shadowed. Only about 32 samples are needed to reduce banding to acceptable levels. Compared to the 50+ samples used in the radial blur of the fake algorithm, this performs similarly. The advantages are correct lighting for overlapping light shafts/shadow shafts, and that it works independently of the screen orientation and even for objects that aren’t on screen (VERY different results from the fake algorithm). It works with any shadow map! It looks AWESOME!
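To make the tracing step a bit more concrete, here is a minimal CPU-side sketch in Java. It is not the actual shader from the demo, just an illustration of the idea under some assumptions: the ray end points are already transformed into light space (x and y index the shadow map, z is depth from the light), every sample is weighted equally, and the names `ShadowRayMarch`, `ShadowMap` and `litFraction` are made up for this example.

```java
/** CPU-side sketch of marching a view ray through a shadow map. All names are hypothetical. */
final class ShadowRayMarch {

    /** Shadow map lookup: depth of the nearest occluder as seen from the light at (x, y). */
    interface ShadowMap {
        double depthAt(double x, double y);
    }

    /**
     * Returns the fraction of samples on the segment start..end that are lit.
     * Both points are assumed to be in light space already: {x, y, depth from light}.
     */
    static double litFraction(double[] start, double[] end, int samples, ShadowMap map) {
        int lit = 0;
        for (int i = 0; i < samples; i++) {
            double t = (i + 0.5) / samples; // sample the middle of each step
            double x = start[0] + t * (end[0] - start[0]);
            double y = start[1] + t * (end[1] - start[1]);
            double z = start[2] + t * (end[2] - start[2]);
            // A sample is lit if nothing sits between it and the light.
            if (z <= map.depthAt(x, y)) {
                lit++;
            }
        }
        return (double) lit / samples;
    }

    public static void main(String[] args) {
        // Toy shadow map: an occluder at depth 0.5 covers the left half, nothing on the right.
        ShadowMap map = (x, y) -> x < 0.5 ? 0.5 : 1.0;
        double[] eye = {0.0, 0.0, 0.75};
        double[] surface = {1.0, 0.0, 0.75};
        System.out.println(litFraction(eye, surface, 64, map));
    }
}
```

In a real fragment shader the same loop would run per pixel, with the shadow map sampled as a depth texture and the resulting fraction used to scale the in-scattered light.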

The FPS is kind of low, only about 40 FPS at 1080p on my GTX 460M. However, considering the blurry nature of the effect and the built-in “anti-aliasing”, rendering it at half resolution or even 1/4th or less should look practically indistinguishable. Increasing the samples while reducing the resolution, together with bilinear interpolation when upscaling, should reduce banding and increase performance. Shadow map resolution also plays a small role.

Sorry for my nerdgasm, but this is without a doubt THE most awesome thing I’ve ever programmed. I just HAD to make a dedicated thread for it!

Any more screen shots?

What about a test program? xD
I’m having some problems with the tracing going behind the shadow map, giving me a huge floating shadow of the cube behind the light when looking towards the light. I need to clamp the end point to the shadow map plane somehow… Jeez, I’m soooo bad at this kind of math…
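One possible way to clamp the end point (not necessarily how the demo ends up handling it) is to clip the view ray against the volume the shadow map covers before marching. The sketch below assumes that volume is an axis-aligned box in light space, as it would be with an orthographic light projection, and that both end points are already in that space. `SegmentClip` and `clipToBox` are hypothetical names.

```java
/** Sketch: clip a ray segment to the box covered by the shadow map. Names are hypothetical. */
final class SegmentClip {

    /**
     * Clips the segment a..b against the axis-aligned box [min, max] on all three axes.
     * Returns {tEnter, tExit} with 0 <= tEnter <= tExit <= 1, where a point on the
     * segment is a + t * (b - a), or null if the segment misses the box entirely.
     */
    static double[] clipToBox(double[] a, double[] b, double min, double max) {
        double tEnter = 0.0;
        double tExit = 1.0;
        for (int axis = 0; axis < 3; axis++) {
            double d = b[axis] - a[axis];
            if (d == 0.0) {
                // Segment is parallel to this pair of planes: either always inside or never.
                if (a[axis] < min || a[axis] > max) {
                    return null;
                }
                continue;
            }
            double tMin = (min - a[axis]) / d;
            double tMax = (max - a[axis]) / d;
            tEnter = Math.max(tEnter, Math.min(tMin, tMax));
            tExit = Math.min(tExit, Math.max(tMin, tMax));
            if (tEnter > tExit) {
                return null;
            }
        }
        return new double[] {tEnter, tExit};
    }

    public static void main(String[] args) {
        // Ray starts inside the volume and runs out through the near plane (z = 0).
        double[] range = clipToBox(new double[] {0.5, 0.5, 0.5}, new double[] {0.5, 0.5, -1.5}, 0.0, 1.0);
        System.out.println(range[0] + " .. " + range[1]);
    }
}
```

Marching only between `tEnter` and `tExit`, and treating everything outside that range as lit, keeps the samples from reading shadow map values for points that lie behind the near plane.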

Anyhow, screenshot dump commencing… Settings: 1024x1024 shadow map, 64 samples tracing, 1920x1080.



No cube on the screen, but still a shadow in the air!


Inside the shadow! The green tint is the weird blending of the yellow plane and the orange sun light. xD


The looming cube of doom that floats in the air. If you’re good at this kind of math, I’d appreciate some help in getting rid of it. Shit, it reminds me of the Ghasts from Minecraft…

Oooh can I try this? I’m running at 1080p on GTX 570 :slight_smile:

It does look nice, good job 8)

Geh, when trying to solve the floating box of doom problem, I kinda gave up as I would have to remake lots of stuff… Anyway, I can just send you the test I used here. I changed the light color to white though… Give me an hour and I’ll have it ready.

EDIT: An abomination, but at least it starts (for me at least…): http://www.mediafire.com/?qpgtdo1iim5xh8z


  • OpenGL 3.3 compatible GFX card.
  • JRE 7.


  • Automatically fullscreens to the desktop resolution.
  • To change shadow map resolution, edit it in settings.txt
  • Changing the number of god ray samples is a little bit more tricky. Go to shaders/volumetric.frag and change the line <#define SAMPLES 64> to the number of samples you want. Over 256 is a complete waste of processing power. 128 should be high enough quality. 64 would be a “low” setting.
  • NO ERROR HANDLING! Will crash and burn in spectacularly silent ways if you do something wrong.
  • The floating cube of doom has been hacked away.

EDIT: You’ll need Fraps or something to see the FPS.

Gah, I’m not installing Java 7 yet. It doesn’t work with 6u27 and I’m not updating til 7u2. Any chance for us old timers?

You do know you can have several JREs installed at the same time? :slight_smile:

“You will disarm all of your weapons and escort us to Sector 001”

JRE 6 version: http://www.mediafire.com/?f22suczarua96ij

I have no idea what you’re talking about.

Yeah too lazy to do that ;D

EDIT: @theagentd
HOLY CRAP THAT IS AWESOME!! It ran really smooth on my computer, I wish I could know the FPS :confused:
But WOW, that looked awesome! Somehow, I got caught in the black horizon of doom and I couldn’t get back :stuck_out_tongue:

Hehe! Just my reaction, and I created it! The black horizon would be the shadow map near plane. xD Don’t wander needlessly in a buggy application! >_<

One thing I noticed was that there was a LOT of screen tearing! Also, is there no way I can see the FPS?

Screw you, making me work this hard on this! xD

Version 3: http://www.mediafire.com/?d8211zy2tgx3l1l

  • Fullscreen setting in settings.txt
  • Windowed screen size (ignored if fullscreen = 1)
  • VSync setting (you could’ve just forced this in your drivers… T__T)

FPS is System.out.println()ed to the console, run it with the bat file to open it with a console window. If you’re not on Windows, just start it with a terminal or something. (java -jar GodRays3.jar) I couldn’t get my ancient font renderer to work with OGL 3.3… T___T
Remember that you can change the god ray samples in the shader too!

EDIT: Forgot to rename the jar in the bat file. Change “DAM.jar” to “GodRays3.jar”… ._.

With VSync and FullScreen on and a shadowMapResolution of 2048 and 5096, I get a constant 61 FPS. It goes down to 41 FPS when I bring it up to 10192.

I hate my weak laptop!!! >_<

Do you mind testing some settings so I can predict the performance of it? It would be really nice of you!

  1. Change the number of god ray samples in shaders\volumetric.frag (the #define SAMPLES 64) to 128.
  2. Set shadow map resolution to 2048.
  3. Fullscreen on.
  4. VSync OFF.

That should be a common scenario for the highest graphics settings in a game. Please tell me the FPS. Sorry for failing with the FPS rendering… ._.

The algorithm benefits heavily from texture caching, so the angle between the camera frustum and the light frustum affects FPS a lot (at least on my computer). In a worst case scenario (90 degrees angle) it would actually need 128 texture lookups for 128 samples. As the angle gets smaller, the same depth buffer value will be tested multiple times against different sample depths, resulting in fewer actual lookups. Looking at objects close to the screen also puts the samples closer together, so it will be faster too.
Therefore, it would be nice if you could test some extreme scenarios too.

  • What was the highest FPS you got and how were you looking to get it?
  • What was the lowest FPS you got and how were you looking to get it?
  • What was the average FPS approximately?
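The caching argument above can be illustrated with a rough Java sketch that counts how many distinct shadow map texels the samples of one ray land on. This is only a toy model with several assumptions: the ray’s projection onto the shadow map is laid along a single texture axis, `rayLength` is measured in shadow map UV units (1.0 spans the whole map), and real GPU texture caches are considerably more complicated. `TexelReuse` and `distinctTexels` are made-up names.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model: how many different shadow map texels does one marched ray touch? Names are hypothetical. */
final class TexelReuse {

    /**
     * @param angleDegrees angle between the view ray and the light direction
     * @param rayLength    ray length in shadow map UV units (1.0 spans the whole map)
     * @param samples      number of samples taken along the ray
     * @param mapSize      shadow map resolution along one axis, in texels
     */
    static int distinctTexels(double angleDegrees, double rayLength, int samples, int mapSize) {
        // Only the part of the ray perpendicular to the light direction moves across the map.
        double projected = rayLength * Math.sin(Math.toRadians(angleDegrees));
        Set<Integer> texels = new HashSet<>();
        for (int i = 0; i < samples; i++) {
            double u = projected * (i + 0.5) / samples;
            int texel = Math.min(mapSize - 1, (int) (u * mapSize));
            texels.add(texel);
        }
        return texels.size();
    }

    public static void main(String[] args) {
        // Same short ray, looked at from a few different angles relative to the light.
        for (double angle : new double[] {0.0, 5.0, 45.0, 90.0}) {
            System.out.println(angle + " degrees: " + distinctTexels(angle, 0.05, 128, 2048) + " texels");
        }
    }
}
```

In this model a ray running along the light direction keeps hitting the same texel, while a long ray at 90 degrees hits a new texel for every sample, which matches the worst case described above.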

Remember to turn off VSync, as it limits your FPS to your refresh rate, and I think your graphics card can definitely handle more than 60 FPS. xD And if the FPS is disappointing (128 samples is quite a lot), remember that it is very unoptimized, and that I can basically make it 4x as fast with a little work.

PS: If someone’s interested in how I’m doing this, I found this article which is very similar (except they draw geometry, and I do everything with a shader): http://developer.amd.com/media/gpu_assets/Mitchell_LightShafts.pdf
The funny thing is that I found this AFTER I implemented my idea. Now I’ve come up with the idea for both depth peeling AND this volumetric light algorithm by myself, only to realize they already exist. Frustrating. xD

I ran it for about 40 seconds:
Highest FPS: 133
Lowest FPS: 62
Average: 99.325 from 40 readings.

I couldn’t tell what I was looking at since it was in full screen mode but I just circled around the cube a bit and went in and out of the shadow.

EDIT: hehe I set the SAMPLES to 256 and it was running at a smooth 65-80 FPS, 512 brought it down to the 30-40. ;D

Thanks a lot for the test results! This is definitely gonna get implemented in my next project, so having some performance numbers is awesome!
As I’m rendering to a normal RGB8 render target here, using 512 samples would produce 512 different shades, but the render target can only handle 256 different shades. Complete waste of processing power. xD 256 samples is kind of overkill too, I doubt anyone will see the difference with HDR and bloom thrown in. Rendering to a lower resolution render target is probably the only way to get good frame rates. For me the scene takes about 2ms to render, including lighting and creating the shadow map. However, with volumetric lighting with the settings I listed in my last post I only get about 30 FPS. That means that my current implementation takes about 30ms, just for volumetric lighting. Completely unacceptable, of course. xD Your card is about 3-4x as fast as mine though… ._.
Again, thanks for testing it out!
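For anyone who wants to redo the arithmetic from this post, here is a small Java sketch. It assumes every sample is a plain lit/shadowed test with equal weight, so N samples can produce at most N + 1 different averages, and it ignores dithering, blending and HDR targets. `FrameBudget` and its methods are hypothetical helpers, not part of the demo.

```java
/** Back-of-the-envelope helpers for the numbers in this thread. Names are hypothetical. */
final class FrameBudget {

    /** Converts frames per second to milliseconds per frame. */
    static double frameTimeMs(double fps) {
        return 1000.0 / fps;
    }

    /**
     * Number of shades that can actually end up in the render target: an equal-weight
     * average of N binary samples has N + 1 possible values, and the target can store
     * 2^bitsPerChannel levels per channel.
     */
    static int representableShades(int samples, int bitsPerChannel) {
        int targetLevels = 1 << bitsPerChannel;
        return Math.min(samples + 1, targetLevels);
    }

    public static void main(String[] args) {
        // About 30 FPS in total with about 2 ms spent on the scene itself.
        double volumetricMs = frameTimeMs(30.0) - 2.0;
        System.out.printf("volumetric pass: about %.1f ms%n", volumetricMs);

        // 512 samples cannot show more shades than an 8-bit channel can hold.
        System.out.println(representableShades(512, 8));
        System.out.println(representableShades(64, 8));
    }
}
```

At 30 FPS a frame takes roughly 33 ms, so after subtracting the 2 ms scene the volumetric pass accounts for about 31 ms, in line with the "about 30ms" estimate above.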

Glad to help :slight_smile: