Yet another particle engine update!

I did GPU sorting a while back and radix sort is the way to go; there is an open-source OpenCL implementation which is really fast.
The fastest at the time was the CUDA implementation in the Nvidia SDK.

But for that many particles, try some iterative sorting algorithm (e.g. bubble sort) and do only a few passes each frame. You won’t have perfect sorting at every frame, but it is a very good 99% solution.
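A minimal sketch of that amortized approach, assuming a CPU-side particle list with a cached depth (the struct and names here are just placeholders):

```cpp
// Sketch: amortize sorting over frames by running only a few
// bubble-sort passes per frame instead of a full sort each frame.
#include <utility>
#include <vector>

struct Particle {
    float depth;   // cached distance to camera, updated elsewhere
    // ... position, color, etc.
};

// Run 'passes' sweeps of bubble sort; the ordering improves a little
// every frame and converges over a few frames.
void partialSort(std::vector<Particle>& p, int passes)
{
    for (int k = 0; k < passes; ++k) {
        bool swapped = false;
        for (size_t i = 1; i < p.size(); ++i) {
            if (p[i - 1].depth < p[i].depth) {   // back-to-front: larger depth first
                std::swap(p[i - 1], p[i]);
                swapped = true;
            }
        }
        if (!swapped) break;                     // already sorted, stop early
    }
}
```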

Also, if your particles are static then you have only a fixed nummber of orderings (2D: 4, 3D:12)

Well, they’re separate problems. I currently compute depth the same way depth is usually stored, so it indeed depends heavily on the orientation of the camera. It’ll be easy to change it to use eye-space distance instead, which would remain stable no matter the camera’s orientation. How I render them to make them look good is a different problem. =S
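To illustrate the two sort keys (GLM is used here just for the vector math, as an assumption):

```cpp
// Sketch: two possible sort keys for a particle.
#include <glm/glm.hpp>

// Screen-style depth: -z in eye space. It changes when the camera
// rotates, because it is measured along the view direction.
float depthKey(const glm::mat4& view, const glm::vec3& worldPos)
{
    glm::vec4 eye = view * glm::vec4(worldPos, 1.0f);
    return -eye.z;
}

// Eye-space distance: stable under camera rotation, it only depends on
// how far the particle is from the camera position.
float distanceKey(const glm::vec3& cameraPos, const glm::vec3& worldPos)
{
    return glm::length(worldPos - cameraPos);
}
```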

Link? =D

No idea if this is of any interest…but I was just glancing at this: http://timothylottes.blogspot.fr/2012/12/storing-objects-on-gpu.html

Ah, yes, that site. Funny that he posted that just when I did my sorting stuff. =S I’ve been monitoring his blog for TXAA info and he recently deleted everything concerning TXAA (10+ posts), so I was worried that he had been fired or something. ^^’ We’ll see what he comes up with…

he deleted that stuff wtf?

Sounds really cool. Any possibility to share the sorting shader?
These raw peak performance particle engines are quite interesting but sadly hardly ever translate directly to game usage.

How about adding features like environment lighting via cubemaps, dynamic lighting, self-shadowing, casting shadows, and dynamic force fields? Then you get to the point where the number of particles that is feasible is much lower than what you currently use, and that changes some things radically.
Also, you can usually cull whole emitters first and sort particles locally. So instead of sorting all particles, you first sort the emitters and after that sort the particles per emitter.
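A rough sketch of that two-level sort; all the types here are made up for illustration:

```cpp
// Sketch: sort emitters by distance first, then sort particles inside
// each emitter. Blending between overlapping emitters is only
// approximate, but it is much cheaper than one global sort.
#include <algorithm>
#include <vector>

struct Particle { float depth; /* ... */ };

struct Emitter {
    float depth;                     // distance from camera to emitter center
    bool visible;                    // result of per-emitter frustum culling
    std::vector<Particle> particles;
};

void sortForRendering(std::vector<Emitter>& emitters)
{
    // Sort the emitters back to front.
    std::sort(emitters.begin(), emitters.end(),
              [](const Emitter& a, const Emitter& b) { return a.depth > b.depth; });

    for (Emitter& e : emitters) {
        if (!e.visible) continue;    // culled emitters are skipped entirely
        // Local sort: far fewer elements than one global sort.
        std::sort(e.particles.begin(), e.particles.end(),
                  [](const Particle& a, const Particle& b) { return a.depth > b.depth; });
    }
}
```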

Currently trying to figure out how to get environment + dynamic lighting and simulate vortices with a couple thousand particles in a mobile title, so only GLES 2.0. Shadows are a no-go, but luckily the game doesn’t even need those.

Any interesting particle papers to share? Currently trying to get ideas from this http://www.bitsquid.se/presentations/practical-particle-lighting.pdf

My idea is to just dump all kinds of particles into a huge list and update them on the GPU. The update shader is an uber-shader which allows for lots of particle types, including emitter particles etc. Since the particles are sorted by distance, they will be somewhat grouped together by type since they’re emitted from the same place, so the branching will be relatively cheap. It’s also worth noting that transform feedback has a relatively high bandwidth cost for each vertex processed, so adding more work to the update shader hasn’t affected performance at all in my tests. If I’m right, multiple vertex streams will also allow me to do frustum culling (just 6 dot products) for multiple lights in one pass and output the indices to other buffers in the same pass.
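For reference, the "6 dot-products" point test could look roughly like this (the plane layout and GLM usage are my assumptions; the per-particle shader version would be the same math):

```cpp
// Sketch: frustum test for a particle as 6 plane dot products.
// Each plane is stored as (nx, ny, nz, d) with the normal pointing
// inward, so a point p is inside when dot(n, p) + d >= 0 for all six.
#include <glm/glm.hpp>

bool insideFrustum(const glm::vec4 planes[6], const glm::vec3& p)
{
    for (int i = 0; i < 6; ++i)
        if (glm::dot(glm::vec3(planes[i]), p) + planes[i].w < 0.0f)
            return false;          // behind this plane -> outside the frustum
    return true;
}
```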

My current sorting algorithm is an abomination and I really need a more optimized one. To still do this with transform feedback (instead of, for example, OpenCL or CUDA) I really need support for multiple vertex streams (= OGL4) to direct particles into buckets. I’m currently forced to do one pass over all visible particles for each bucket, which means that I have to do twice as many passes as the bit precision of the depth. With multiple vertex streams, I could sort 4 bits per pass using 16 buckets and reduce the number of passes from 48 to 6 for 24 bits of depth, or use just 4 buckets and get it done in 12 passes (if VRAM is a limitation). Like I said before, transform feedback is very bandwidth-limited, so this is the main bottleneck at the moment.
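The pass counts in that paragraph come from ceil(keyBits / log2(bucketCount)); a tiny sketch of the arithmetic:

```cpp
// Sketch of the pass-count arithmetic. With multiple vertex streams one
// pass can scatter into all buckets at once, so an LSD radix sort needs
// ceil(keyBits / log2(bucketCount)) passes. Without them, every bucket
// needs its own pass over the particles, which is where the
// 2 buckets * 24 bits = 48 passes figure above comes from.
#include <cstdio>

int radixPasses(int keyBits, int bucketCount)
{
    int bitsPerPass = 0;
    while ((1 << (bitsPerPass + 1)) <= bucketCount) ++bitsPerPass;   // floor(log2(bucketCount))
    return (keyBits + bitsPerPass - 1) / bitsPerPass;                // round up
}

int main()
{
    std::printf("16 buckets: %d passes\n", radixPasses(24, 16)); // 6
    std::printf(" 4 buckets: %d passes\n", radixPasses(24, 4));  // 12
}
```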

I’ve taken a look at Fourier opacity mapping, and it seems to be an excellent way of doing particle shadowing and self-shadowing. Performance seems good since the resolution of the map can be kept very low while still giving a very good look thanks to the blurry nature of particles. The particles also do not have to be sorted when rendering the opacity map. My only problem is that I have absolutely no idea how it works. That kind of math goes waaaaay over my head. It’s definitely somewhere on my todo list though.

Since my current particles are meant to simulate smoke, I also had a look at the fragment (fill-rate) limitations. To get good-looking smoke you need a lot of overdraw, and with the current 2-megapixel screens that becomes very expensive. Some games render the particles at half resolution to reduce the number of pixels drastically. Using a special upsampling filter they can preserve sharp edges. Although the particles get slightly blurry, there’s not much of a difference since particle effects are inherently blurry. The only possible artifacts are single-pixel errors that won’t be visible at all. For a 4x reduction in overdraw cost, I’d say it’s definitely worth it.
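One common flavor of such an upsampling filter is nearest-depth upsampling; a rough sketch of the per-pixel decision (all names are illustrative, not from any particular engine):

```cpp
// Sketch of a nearest-depth upsampling filter, one way to keep edges
// sharp when compositing half-resolution particles over a full-res
// scene. For each full-res pixel we look at the 4 nearest half-res
// samples: if their depths all roughly match the full-res depth we can
// use the smooth bilinear result, otherwise we pick the sample whose
// depth is closest, which avoids halos at depth discontinuities.
#include <cmath>
#include <glm/glm.hpp>

glm::vec4 upsample(const glm::vec4  lowResColor[4],  // 4 nearest half-res samples
                   const float      lowResDepth[4],  // their (downsampled) depths
                   float            fullResDepth,    // depth of the pixel being shaded
                   const glm::vec4& bilinear)        // plain bilinear blend of the 4 samples
{
    const float threshold = 0.01f;  // discontinuity threshold, tuned per scene
    float maxDiff = 0.0f;
    int best = 0;
    for (int i = 0; i < 4; ++i) {
        float diff = std::fabs(lowResDepth[i] - fullResDepth);
        if (diff > maxDiff) maxDiff = diff;
        if (diff < std::fabs(lowResDepth[best] - fullResDepth)) best = i;
    }
    return (maxDiff < threshold) ? bilinear : lowResColor[best];
}
```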

http://www.bungie.net/Inside/publications.aspx
In the “Blowing S#!t Up the Bungie Way” paper they present some nice gfx stuff that was used in Halo 3. There is a nice tiling plate texture animation trick that brings more life to particles and can help you reduce particle counts. Basically there is a bigger tiling texture that has some shape in it, and on top of that they swim the actual particle texture by animating the UVs. It seems so simple yet effective.

Another good trick is to use a grayscale texture and palettize it with a 1x256 texture. This can save some bandwidth, reduce texture packing artifacts and give a lot more variation than simply using a tint color. An addition to this technique would be to pack albedo, spec mask and alpha into one texture. This should give enough variation that you could render liquids and gasses with the same shader. Using world-space normal maps also works like a charm with cube mapping.
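A tiny sketch of the palette lookup, written as plain C++ standing in for the dependent texture fetch in the shader:

```cpp
// Sketch: palettizing a grayscale particle texture with a 1x256 lookup
// table (the 1x256 texture mentioned above). In the shader this is one
// extra dependent texture fetch; shown here as plain C++ for clarity.
#include <glm/glm.hpp>

// 'gray' is the 0..1 value sampled from the grayscale particle texture,
// 'palette' holds the 256 colors of the ramp.
glm::vec4 palettize(float gray, const glm::vec4 palette[256])
{
    int index = int(gray * 255.0f + 0.5f);
    if (index < 0)   index = 0;
    if (index > 255) index = 255;
    return palette[index];
}
```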

Quick dumb question: why exactly are you sorting your particles?

For correct blending. I have to sort them or the blending won’t be applied in the correct order. I can’t use normal z-buffering either.
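To illustrate why the order matters: the "over" blend operator isn't commutative, so swapping the draw order of two translucent particles changes the result. A quick numeric sketch (GLM just for the vector math):

```cpp
// Sketch: standard alpha blending (the "over" operator):
//   dst = src.rgb * src.a + dst.rgb * (1 - src.a)
// Blending red over green gives a different result than green over red,
// which is why transparent particles have to be drawn back to front.
#include <cstdio>
#include <glm/glm.hpp>

glm::vec3 over(const glm::vec3& srcRgb, float srcA, const glm::vec3& dstRgb)
{
    return srcRgb * srcA + dstRgb * (1.0f - srcA);
}

int main()
{
    glm::vec3 red(1, 0, 0), green(0, 1, 0), background(0, 0, 0);

    glm::vec3 a = over(red,   0.5f, over(green, 0.5f, background)); // green drawn first
    glm::vec3 b = over(green, 0.5f, over(red,   0.5f, background)); // red drawn first

    std::printf("%.2f %.2f %.2f\n", a.x, a.y, a.z); // 0.50 0.25 0.00
    std::printf("%.2f %.2f %.2f\n", b.x, b.y, b.z); // 0.25 0.50 0.00
}
```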

Just noticed that this stuff is supposed to be coming up at:
http://www.geforce.com/landing-page/txaa
https://developer.nvidia.com/content/welcome-game-graphics-technology-blog

from comments found here: http://timothylottes.blogspot.fr/2013/01/toward-practical-real-time-photon.html