Quick 'n Dirty Verlet Fluids v0.1

Riven · August 9, 2009, 12:31pm

The collision is both squares and spheres.

Every tick() this happens:
[x] a simple (empty!) rectangular grid (cellSize is determined) is created/reused
[x] the grid is filled, by taking the center of each drop, calculating the cell[ x ][ y ] where the drop should be added
[x] another cell2[ w ][ h ] is filled, and for each cell, it adds the current cell and all neighbouring cells
==> now cell2[ x ][ y ] has roughly ~9 times the cell[ x ][ y ] drop count
==> for each cell[ x ][ y ], we now have all potential colliding drops in cell2[ x ][ y ]
[x] traverse the grid, and for each cell[ x ][ y ] collide each drop with every drop in cell2[ x ][ y ], except itself
==> only this part is multi-threaded
[x] clear all drops from both cell[ x ][ y ] and cell2[ x ][ y ]
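
The steps above can be sketched roughly like this. It is a minimal 2D sketch under assumptions, not the demo's actual code: the names GridSketch, Drop, fill, gather, candidatePairs and clear are made up for illustration, threading is left out, and the collision itself is replaced by a counter.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the per-tick grid steps; names are not from the demo.
class GridSketch {
    static final class Drop {
        float x, y;
        Drop(float x, float y) { this.x = x; this.y = y; }
    }

    final float cellSize;
    final int w, h;
    final List<Drop>[][] cell;  // drops whose center lies in this cell
    final List<Drop>[][] cell2; // this cell plus its (up to 8) neighbours

    @SuppressWarnings("unchecked")
    GridSketch(float width, float height, float cellSize) {
        this.cellSize = cellSize;
        this.w = (int) Math.ceil(width / cellSize);
        this.h = (int) Math.ceil(height / cellSize);
        cell = new List[w][h];
        cell2 = new List[w][h];
        for (int x = 0; x < w; x++) {
            for (int y = 0; y < h; y++) {
                cell[x][y] = new ArrayList<Drop>();
                cell2[x][y] = new ArrayList<Drop>();
            }
        }
    }

    static int clamp(int v, int max) { return Math.max(0, Math.min(max, v)); }

    // Fill the grid: bin every drop by its center (out-of-bounds drops are clamped).
    void fill(List<Drop> drops) {
        for (Drop d : drops) {
            int cx = clamp((int) (d.x / cellSize), w - 1);
            int cy = clamp((int) (d.y / cellSize), h - 1);
            cell[cx][cy].add(d);
        }
    }

    // Fill cell2: each entry gets the current cell and all neighbouring cells.
    void gather() {
        for (int x = 0; x < w; x++)
            for (int y = 0; y < h; y++)
                for (int nx = Math.max(0, x - 1); nx <= Math.min(w - 1, x + 1); nx++)
                    for (int ny = Math.max(0, y - 1); ny <= Math.min(h - 1, y + 1); ny++)
                        cell2[x][y].addAll(cell[nx][ny]);
    }

    // Traverse the grid: pair each drop in cell with every drop in cell2, except itself.
    // A real tick would call the collision kernel here; this sketch only counts the pairs.
    long candidatePairs() {
        long n = 0;
        for (int x = 0; x < w; x++)
            for (int y = 0; y < h; y++)
                for (Drop a : cell[x][y])
                    for (Drop b : cell2[x][y])
                        if (a != b) n++;
        return n;
    }

    // Clear both grids so the lists can be reused next tick.
    void clear() {
        for (int x = 0; x < w; x++)
            for (int y = 0; y < h; y++) {
                cell[x][y].clear();
                cell2[x][y].clear();
            }
    }
}
```

Note that in this sketch a pair of nearby drops is visited once from each side, so the count is of ordered pairs.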

The grid.cellSize is adjusted automatically (in steps of 10%), by measuring its own performance. It basically does some trial-and-error and after N iterations, approaches an optimum (hopefully not a local optimum) and kinda stays there. When large groups of particles move elsewhere, the grid will likely converge to a different optimal cellSize.
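
One plausible way to do this kind of trial-and-error tuning is a simple hill climb. The tuning code itself was not posted, so everything here is an assumption: the class name CellSizeTuner, the record method, the averaging window and the min/max bounds are all hypothetical.

```java
// Hypothetical tuner: a guess at the general idea, not the demo's code.
class CellSizeTuner {
    private float cellSize;
    private float factor = 1.10f;              // 10% step; inverted to change direction
    private double lastCost = Double.MAX_VALUE;
    private double costSum = 0.0;
    private int samples = 0;
    private final int window;                  // ticks averaged per trial (the "N")
    private final float min, max;              // assumed sanity bounds for cellSize

    CellSizeTuner(float initial, float min, float max, int window) {
        this.cellSize = initial;
        this.min = min;
        this.max = max;
        this.window = window;
    }

    float cellSize() { return cellSize; }

    // Call once per tick with the measured cost of that tick (e.g. nanoseconds).
    void record(double tickCost) {
        costSum += tickCost;
        if (++samples < window) return;        // keep measuring the current size

        double cost = costSum / samples;
        costSum = 0.0;
        samples = 0;

        // If the last step made things slower, turn around.
        if (cost > lastCost) factor = 1.0f / factor;
        lastCost = cost;

        cellSize = Math.max(min, Math.min(max, cellSize * factor));
    }
}
```

With this scheme the size never settles on one value; it keeps stepping back and forth around the best value found, which also lets it follow the optimum when the scene changes.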

CommanderKeith · August 9, 2009, 12:59pm

Interesting, so how big is the grid, 1 cell per pixel? It appears that the droplets compress and end up in the same cell so I guess the cells are smaller than a pixel? How do you chuck the big boat in there which is obviously bigger than a cell?

Riven · August 9, 2009, 1:17pm

Well, in each cell2[ x ][ y ], the following drops will be added:
cell[ x - 1 ][ y - 1] cell[ x + 0 ][ y - 1] cell[ x + 1 ][ y - 1] cell[ x - 1 ][ y + 0] cell[ x + 0 ][ y + 0] cell[ x + 1 ][ y + 0] cell[ x - 1 ][ y + 1] cell[ x + 0 ][ y + 1] cell[ x + 1 ][ y + 1]

The drops in cell[ x ][ y ] can only collide with the drops in cell2[ x ][ y ], so naturally, the cellSize must be greater than the dropRadius, because otherwise the potential colliders wouldn’t be in cell2[ x ][ y ]. The current dropRadius is 2.75. Typically, the cellSize ends up at 12…16 pixels (on my CPU, in this scene), which basically means that cell2[ x ][ y ] gathers the drops found in a (12…16)*3 = 36…48 pixel square.

Regarding the boat, it is not interacting with the liquid grid (cells) directly. I calculate the bounding sphere (rendered as a grey circle) of the boat. Then I query the grid for all cells that intersect that bounding sphere. I sorta took a (slower) shortcut: I calculate the bounding box of the bounding sphere (it is too big, I know), then the minimal x,y and maximal x,y cell coordinates in the grid corresponding with that bounding box. Then I subtract 1,1 cell from the minimum and add 1,1 to the maximum cell (adding the neighbouring cells), and grab all drops in that rectangular group of cells. Then I collide (pure Verlet) all drops (basically spheres) with all spheres in the boat.
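
The cell lookup for the boat could look something like this. It is only a sketch of the bounding-box shortcut described above: the class CellRange, the method around and the clamping to the grid edges are assumptions, not the demo's code.

```java
// Hypothetical helper: the rectangle of cells covering a bounding circle,
// padded by one cell on every side and clamped to the grid.
class CellRange {
    final int minX, minY, maxX, maxY;

    CellRange(int minX, int minY, int maxX, int maxY) {
        this.minX = minX;
        this.minY = minY;
        this.maxX = maxX;
        this.maxY = maxY;
    }

    static CellRange around(float cx, float cy, float radius,
                            float cellSize, int gridW, int gridH) {
        // Bounding box of the bounding circle, in cell coordinates,
        // then minus 1 on the minimum and plus 1 on the maximum.
        int minX = (int) Math.floor((cx - radius) / cellSize) - 1;
        int minY = (int) Math.floor((cy - radius) / cellSize) - 1;
        int maxX = (int) Math.floor((cx + radius) / cellSize) + 1;
        int maxY = (int) Math.floor((cy + radius) / cellSize) + 1;

        // Clamp to the grid so the range can be used as array indices.
        return new CellRange(Math.max(0, minX), Math.max(0, minY),
                             Math.min(gridW - 1, maxX), Math.min(gridH - 1, maxY));
    }

    // Number of cells in the range (0 if the circle is entirely off the grid).
    int cellCount() {
        return Math.max(0, maxX - minX + 1) * Math.max(0, maxY - minY + 1);
    }
}
```

All drops in cell[ x ][ y ] for x in minX..maxX and y in minY..maxY would then be collided with the boat's spheres.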

markus.borbely · August 9, 2009, 6:35pm

I have a 5-year-old laptop. It has an Intel Celeron M 1.4 GHz CPU.
Framerate: 10-12

I also have a year-old desktop. It has an Intel Core 2 Duo 3.16 GHz.
Framerate: 35-39

Spasi · August 10, 2009, 9:29am

Win7, JRE v1.6.0_15, Intel Q6600 (quad core, 2.4GHz), getting 50fps @ ~60% CPU usage.

swpalmer · August 11, 2009, 1:26am

OpenSolaris 2009.06 64-bit, Intel Core i7 (4 cores with Hyperthreading = 8 CPU), nVidia GTS 250 GPU

50fps at about 21% CPU

Thanks for the WebStart link

Bonbon-Chan · August 11, 2009, 8:59am

Between 35 and 50 fps, averaging around 42

Core 2 Duo E7400 @ 2.8GHz, Windows XP

ManaSink · August 11, 2009, 2:38pm

About 40 - 45fps
CPU% = ~ 120% (linux style, looks like about 60% per core)

Linux CentOS 5.1
Intel® Core™2 Duo CPU E6550 @ 2.33GHz
NVIDIA GeForce 7600 GS
Java™ SE Runtime Environment (build 1.6.0_12-b04)
Java HotSpot™ 64-Bit Server VM (build 11.2-b01, mixed mode)

Demonpants · August 11, 2009, 2:39pm

Intel Core 2 Duo 2.4 ghz, around 30 fps, about 95% of cpu. At one point I randomly was getting 2 or 3 fps, I think it was when the raft was buried.

ewjordan · August 12, 2009, 1:06am

Looking very good, I’m impressed you’re getting that many particles going at once without too much slowdown.

Going to release code, too?

fermixx · August 12, 2009, 4:05am

asus laptop
intel core2Duo (2 cores) t7250 @ 2ghz
2gb ram
windows vista home premium

fps:
average = 33
minimum = 28

the weird thing is that even with these fps, i dont see it smooth. it looks laggy even if its always +28fps
my gpu is not top notch (ati hd 2400). cant play crysis but i can play oblivion and gears of war at decent fps (+30)

CPU : 70~90 % (thats a lot dude!!)

hope it helps

Riven · August 12, 2009, 11:44am

Well, it’s certainly a lot! There are 8K particles, so in theory, that’s 64 million collisions per frame. Due to the spatial algorithm, that gets reduced to roughly 1% (still 640K collisions or collision checks). You won’t see true fluid simulation in any modern game, because it’s just too darn heavy to compute. Even a glass of water is harder to calculate realistically (not to mention render) than Crysis.
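
For reference, the arithmetic behind those numbers, as a tiny sketch. The class and method names are made up, and the 1% is just the rough figure quoted above, not a measurement.

```java
// Hypothetical helper to show where the numbers above come from.
class PairCount {
    // Every drop tested against every drop: n * n checks.
    static long orderedPairs(long n) { return n * n; }

    // Each pair counted once, no self-pairs: n * (n - 1) / 2.
    static long unorderedPairs(long n) { return n * (n - 1) / 2; }

    // Checks left when the grid keeps only a given fraction of them.
    static long afterBroadphase(long n, double fraction) {
        return Math.round(orderedPairs(n) * fraction);
    }

    public static void main(String[] args) {
        long n = 8000;
        System.out.println(orderedPairs(n));          // 64000000
        System.out.println(unorderedPairs(n));        // 31996000
        System.out.println(afterBroadphase(n, 0.01)); // 640000
    }
}
```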

I’m probably going to find some more shortcuts, maybe I can speed it up like 2-5 times, but it won’t get much better than that. At that time, I’m probably going to make the drops smaller anyway.

The gfx card is not used BTW, maybe by Java2D, but I’m mainly drawing rects and there isn’t exactly much room for improvement there, and the GPU won’t really accelerate that much. It’s like OpenGL immediate mode with 1 quad in every begin/end at best, as every quad has its own color.

Riven · August 12, 2009, 1:14pm

Maybe, the code is very simple actually, it’s just not really nicely coded.

Yesterday I optimized a fair bit - without touching the algorithms, and it seems to be 20% faster. Will upload later.

Riven · August 12, 2009, 6:16pm

for those interested, this is the whacky ‘kernel’:


   public static final void collideLiquid(VerletSphere a, VerletSphere b, float viscosity)
   {
      final float rad = a.radius; // 'b' is just as big...
      final Vec3 aNow = a.p.now;
      final Vec3 bNow = b.p.now;

      final float dx = bNow.x - aNow.x;
      final float dy = bNow.y - aNow.y;
      final float dz = bNow.z - aNow.z;
      final float d2 = dx * dx + dy * dy + dz * dz;

      final float outer = (rad * 3.0f);
      if (d2 > outer * outer)
      {
         return;
      }

      float force = viscosity / d2; // decreases like gravity

      final float rad2_doub = rad * rad * 4.00f; // (rad*2)^2
      final float rad2_half = rad * rad * 0.25f; // (rad/2)^2

      if (d2 < rad2_doub) // d < rad*2
      {
         // prevent extreme collisions
         local: if (d2 > rad2_half) // d > rad/2
         {
            float diam = rad * 2.0f;
            if (d2 > (diam * diam))
               break local;

            float d = (float) Math.sqrt(d2);
            float f = (d - diam) / diam * 0.25f; // spring with half stiffness = 0.5*0.5

            aNow.x += dx * f;
            aNow.y += dy * f;
            aNow.z += dz * f;

            bNow.x -= dx * f;
            bNow.y -= dy * f;
            bNow.z -= dz * f;
         }

         // this is a collision, flip the viscosity force (makes no sense, but it works)
         force *= -0.5f;
      }
      else
      {
         // these drops are so far away, that viscosity shouldn't really have such a big influence...
         force *= 0.5f;
      }

      final Vec3 aOld = a.p.old;
      final Vec3 bOld = b.p.old;

      aOld.x -= dx * force;
      aOld.y -= dy * force;
      aOld.z -= dz * force;

      bOld.x += dx * force;
      bOld.y += dy * force;
      bOld.z += dz * force;
   }

Hansdampf · August 12, 2009, 6:29pm

I implemented something similar and sped it up by:

- precalculating the inverse radius
- using a lookup table for sqrt (interesting values are always in radius range)

could help you too
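
A rough sketch of what those two tweaks might look like. This is not code from either poster: the class SqrtTable, the table size and the nearest-entry lookup are assumptions, and whether a table actually beats Math.sqrt on a given JVM would have to be measured.

```java
// Hypothetical sketch: sqrt lookup table over the range of squared distances
// where the kernel needs a sqrt, plus a precalculated inverse diameter.
class SqrtTable {
    private final float[] table;
    private final float scale;   // maps d2 in [0, maxD2] to a table index
    final float invDiam;         // 1 / (2 * radius), replaces a division per collision

    SqrtTable(float radius, int size) {
        float diam = 2.0f * radius;
        float maxD2 = diam * diam;           // the spring only acts for d < diameter
        this.scale = (size - 1) / maxD2;
        this.table = new float[size];
        for (int i = 0; i < size; i++) {
            table[i] = (float) Math.sqrt(i / scale);
        }
        this.invDiam = 1.0f / diam;
    }

    // Approximate sqrt(d2) using the nearest table entry.
    // Values outside [0, maxD2] are clamped to the ends of the table.
    float sqrt(float d2) {
        int i = (int) (d2 * scale + 0.5f);
        if (i < 0) i = 0;
        if (i >= table.length) i = table.length - 1;
        return table[i];
    }
}
```

In the kernel, (d - diam) / diam would then become (table.sqrt(d2) - diam) * table.invDiam. The approximation is coarsest close to d2 = 0, where sqrt is steepest.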

Riven · August 12, 2009, 6:44pm

well, those are probably going to gain me a few percent, if any at all. not really worth the effort at the moment, as it’s not the bottleneck.

the spatial algorithms take 25%, the above code takes 45%, everything else takes the remainder.

Demonpants · August 12, 2009, 9:13pm

I actually didn’t know you could use break and block labels like that in Java. Good to know.

Riven · August 12, 2009, 9:23pm

This is the 2nd time I did that in over a decade. It stinks. Don’t do it.
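
For anyone curious, here is the labeled-block trick in isolation, next to a version without it. The thresholds mirror the kernel's rad/2 and rad*2 checks, but the class and method names are made up for this sketch.

```java
// Hypothetical demo of a labeled block with break, and a plain equivalent.
class LabeledBreak {
    // 'break local' jumps to the end of the labeled if-statement.
    static String withLabel(float d2, float rad) {
        String result = "skipped";
        local: if (d2 > rad * rad * 0.25f) {
            if (d2 > rad * rad * 4.0f)
                break local;
            result = "spring applied";
        }
        return result;
    }

    // Same logic as one combined condition, no label needed.
    static String withoutLabel(float d2, float rad) {
        boolean inRange = d2 > rad * rad * 0.25f && d2 <= rad * rad * 4.0f;
        return inRange ? "spring applied" : "skipped";
    }
}
```

Both return the same result for any input; the second form is usually easier to read.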

Abuse · August 12, 2009, 10:24pm

what’s the min. required screen resolution?
Tried to give it a spin on my netbook, but can’t see much of the window area. (1024x600)

Of what I could see, it ran @ 8-10fps.

Riven · August 12, 2009, 10:40pm

1024x768, but it is really meant for 1280x1024.

At the moment I don’t have the time to change much. I removed the raft for some obscure reason, so if I’d upload it now it would be kind of a boring demo, for everybody else