Immediate mode rendering is dead

Riven · February 4, 2010, 3:45pm

I know it’s not correct. It returns the wrong result for every negative floating-point representation of an integer. Fixing that makes it much slower.


// This method is a *lot* faster than using (int)Math.floor(x)
private static int fastfloor(float x) {
  int i = (int)x;
  return x>=0.0f ? i : ((float)i==x ? i : i-1);
}
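To make the difference concrete, here is a minimal sketch comparing a plain cast, the ternary version above, and `Math.floor()` on a few values. The class name `CastVsFloor` and the helper names are made up for illustration; only the body of `safeFloor` comes from the post.

```java
// Illustrative comparison; CastVsFloor, castOnly and safeFloor are hypothetical names.
final class CastVsFloor {

    // A plain cast truncates toward zero, so negative non-integers come out one too high.
    static int castOnly(float x) {
        return (int) x;
    }

    // Same logic as the ternary fastfloor above: subtract one only for
    // negative inputs that are not already whole numbers.
    static int safeFloor(float x) {
        int i = (int) x;
        return x >= 0.0f ? i : ((float) i == x ? i : i - 1);
    }

    public static void main(String[] args) {
        float[] samples = { -1.5f, -1.0f, -0.5f, 0.5f, 1.0f };
        for (float x : samples) {
            System.out.println(x + ": cast=" + castOnly(x)
                    + " safe=" + safeFloor(x)
                    + " Math.floor=" + (int) Math.floor(x));
        }
    }
}
```

The cast and the floor only disagree for negative inputs with a fractional part, which is the case the extra comparison pays for.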

princec · February 4, 2010, 4:06pm

Bah. Not so useful then.

Cas

Demonpants · February 4, 2010, 4:38pm

Well even the safe second one that Riven posted should probably be faster.

Riven · February 4, 2010, 5:00pm

Depends what the floor() is for… maybe you don’t feed it negative numbers? Then you can just cast to int. Or if it means your sprite is rendered 1 pixel off, nobody will see it.

princec · February 4, 2010, 5:17pm

I specifically used floor() because I had to deal with negatives. And surprisingly, 1 pixel off is highly noticeable in the world of 2D pixel-retro graphics.

Cas

Riven · February 4, 2010, 5:29pm

So how does the safe version of fastfloor() compare to StrictMath.floor() ?

Riven · February 4, 2010, 5:35pm

or maybe…


public class FastFloorTest // class name is a placeholder
{
   private static final int   BIG_ENOUGH_INT   = 64 * 1024;
   private static final float BIG_ENOUGH_FLOAT = BIG_ENOUGH_INT;

   // Shift x into the positive range, truncate, then shift back.
   // Only meaningful for x >= -BIG_ENOUGH_INT.
   public static int fastFloor(float x)
   {
      return (int) (x + BIG_ENOUGH_FLOAT) - BIG_ENOUGH_INT;
   }

   public static void main(String[] args)
   {
      System.out.println(fastFloor(-1.1f));
      System.out.println(fastFloor(-1.0f));
      System.out.println(fastFloor(-0.9f));

      System.out.println(fastFloor(+0.9f));
      System.out.println(fastFloor(+1.0f));
      System.out.println(fastFloor(+1.1f));
   }
}

-2 -1 -1 0 1 1
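The trick works because adding the offset makes the value non-negative, and for non-negative values the cast's truncation toward zero is the same as a floor. A small sketch of where that stops holding, assuming the same constants as above (the class name `OffsetFloorDomain` is hypothetical):

```java
// Sketch of the valid input range for the offset trick; class name is illustrative.
final class OffsetFloorDomain {
    static final int   BIG_ENOUGH_INT   = 64 * 1024;
    static final float BIG_ENOUGH_FLOAT = BIG_ENOUGH_INT;

    static int fastFloor(float x) {
        return (int) (x + BIG_ENOUGH_FLOAT) - BIG_ENOUGH_INT;
    }

    public static void main(String[] args) {
        // Inside the domain (x >= -65536): the shifted value is positive,
        // so truncation and floor agree.
        System.out.println(fastFloor(-1.5f) + " vs " + (int) Math.floor(-1.5f));

        // Outside the domain: the shifted value is still negative, the cast
        // truncates toward zero again, and the result is one too high.
        System.out.println(fastFloor(-70000.5f) + " vs " + (int) Math.floor(-70000.5f));
    }
}
```

So the offset has to be at least as large as the most negative input you expect.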

Riven · February 4, 2010, 5:46pm

32768 runs:

output[k] = (int) StrictMath.floor(input[k]);

1.290ms

output[k] = (int) Math.floor(input[k]);

1.310ms

output[k] = fastFloor(input[k]);

0.094ms

~13x faster.
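For anyone who wants to repeat the measurement, a rough timing loop could look like the sketch below. This is not the harness used for the numbers above: the input range, the seed, the pass count and the class name `FloorTiming` are all assumptions, and timings depend heavily on JVM version and hardware, so the ratio may not reproduce.

```java
import java.util.Random;

// Rough timing sketch, not the original benchmark; all parameters are assumptions.
final class FloorTiming {
    static final int   BIG_ENOUGH_INT   = 64 * 1024;
    static final float BIG_ENOUGH_FLOAT = BIG_ENOUGH_INT;

    static int fastFloor(float x) {
        return (int) (x + BIG_ENOUGH_FLOAT) - BIG_ENOUGH_INT;
    }

    // Fills an array with pseudo-random values in [-1000, 1000).
    static float[] makeInput(int n, long seed) {
        Random rnd = new Random(seed);
        float[] in = new float[n];
        for (int k = 0; k < n; k++) {
            in[k] = (rnd.nextFloat() - 0.5f) * 2000f;
        }
        return in;
    }

    public static void main(String[] args) {
        float[] input = makeInput(32768, 42L);
        int[] output = new int[input.length];
        // Repeat the passes so the JIT has a chance to compile both loops.
        for (int pass = 0; pass < 20; pass++) {
            long t0 = System.nanoTime();
            for (int k = 0; k < input.length; k++) output[k] = (int) Math.floor(input[k]);
            long t1 = System.nanoTime();
            for (int k = 0; k < input.length; k++) output[k] = fastFloor(input[k]);
            long t2 = System.nanoTime();
            System.out.println("Math.floor: " + (t1 - t0) / 1000 + "us, fastFloor: "
                    + (t2 - t1) / 1000 + "us");
        }
    }
}
```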

princec · February 5, 2010, 10:32am

Seems good enough to me. I’ve hacked it into LWJGL’s Color class (HSB conversion method pinched from AWT) - will probably commit that to LWJGL as it is perfectly adequate for our purposes. Our particles interpolate HSB over time, causing a surprising number of calls to Math.floor(). The other call to floor() that took a lot of time was a rather foolish bit of laziness on my part - the results are now cached and only updated in certain circumstances. End result: a big speed up.

Cas

Riven · February 5, 2010, 10:36am

Yay. How much?

Markus_Persson · February 5, 2010, 10:37am

Depending on your usage, the fastFloor that turns -1.0 into -2 isn’t really that horrible.
It turns -0.99999999 to -1, and -1.00000001 to -2, so the error zone is infinitesimal. The wrong results occur exactly when you pass whole-number values to the fastFloor method, which can cause some problems, but if you’re purely in float land, you might never notice the error.

I use it (and have for a while) in minecraft to deal with the player leaving the map to negative coordinates. Sure, it means there’s an EXTREMELY THIN SLIVER of all blocks out there that get calculated as belonging to the wrong blocks, but I can live with that.
The error margin in my fast sqrt is worse.
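The thread does not show the code for that variant, so the sketch below is an assumption about its shape: subtract one for every negative input, with no check for whole numbers. The names `SloppyFloor` and `sloppyFloor` are hypothetical.

```java
// Assumed form of the "-1.0 becomes -2" variant; not taken from any posted code.
final class SloppyFloor {

    // Always subtracts one for negative inputs, including exact integers.
    static int sloppyFloor(float x) {
        return x < 0.0f ? (int) x - 1 : (int) x;
    }

    public static void main(String[] args) {
        System.out.println(sloppyFloor(-0.5f)); // agrees with floor
        System.out.println(sloppyFloor(-1.0f)); // one too low: exact negative integer
        System.out.println(sloppyFloor(-1.5f)); // agrees with floor
    }
}
```

Under that assumption the only wrong answers are for inputs that are exactly a negative whole number, which matches the "thin sliver" description.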

But what Riven just posted is even better, as long as you know the size of your negative domain.
It does lose some precision in the positive extremes, so don’t make BIG_ENOUGH_INT too large, or this happens:

    private static final int BIG_ENOUGH_INT = Integer.MIN_VALUE / 4;
    private static final float BIG_ENOUGH_FLOAT = BIG_ENOUGH_INT;

    public static int fastFloor(float x) {
        return (int) (x + BIG_ENOUGH_FLOAT) - BIG_ENOUGH_INT;
    }

    public static void main(String[] args)
    {
        int from = Integer.MAX_VALUE/4;
        int rounded = fastFloor(from);
        System.out.println("Expected "+from+", got "+rounded);
    }

Expected 536870911, got 536870912
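Part of that result comes from the call itself: `fastFloor` takes a float, so the int `from` is converted before the method body runs, and a float only carries 24 significant bits. A minimal sketch of that conversion alone (the class name `IntToFloatRounding` and the helper `roundTrip` are hypothetical):

```java
// Shows int -> float -> int rounding on its own; names are illustrative.
final class IntToFloatRounding {

    // Converts to float and back, which rounds to 24 significant bits.
    static int roundTrip(int value) {
        return (int) (float) value;
    }

    public static void main(String[] args) {
        int from = Integer.MAX_VALUE / 4;          // 536870911 = 2^29 - 1
        System.out.println(roundTrip(from));       // 536870912
        System.out.println(roundTrip(16777216));   // 2^24 survives unchanged
        System.out.println(roundTrip(16777217));   // 2^24 + 1 does not
    }
}
```

Any int above 2^24 that is not representable in 24 bits is already rounded before the offset is added.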

princec · February 5, 2010, 10:49am

Funny thing is I only had to use floor() because of a bit of an edge case - literally: the gidrahs start off just outside the edge of the tile map, and because of a quirk of rounding when just casting to (int), I’d get an OOBE. floor() rounded their coordinates correctly in this case - the other 99.999% of the time they’re moved away from their spawn point and it wasn’t a problem.

@Riven - it may sound piddly but about 5% of my total execution time was spent in Math.floor()! When you’re doing everything you can to just achieve 60fps, that’s a significant saving in just one method call. I’ve made even bigger gains elsewhere caching colours - nearly 8-9% of my CPU time was just calculating colours every frame - so now I don’t ;))

Cas

Markus_Persson · February 5, 2010, 10:50am

With BIG_ENOUGH_INT=1024, this happens:
fastFloor(8.99995f) = 9
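The reason is the spacing between adjacent floats near the shifted value. Around 1024 that spacing is about 0.000122, so the sum 1032.99995 has to round to a neighbour and the nearest one is 1033.0. A short sketch, with the offset passed as a parameter purely for illustration (`SmallOffsetRounding` and `floorWithOffset` are hypothetical names):

```java
// Demonstrates the rounding of the float addition; names are illustrative.
final class SmallOffsetRounding {

    // Same formula as fastFloor, but with the offset as an argument.
    static int floorWithOffset(float x, int offset) {
        return (int) (x + (float) offset) - offset;
    }

    public static void main(String[] args) {
        System.out.println(Math.ulp(1024f));                 // spacing of floats near 1024
        System.out.println(8.99995f + 1024f);                // already rounded up to 1033.0
        System.out.println(floorWithOffset(8.99995f, 1024)); // 9
    }
}
```

A larger offset makes the spacing coarser, so the band of inputs just below an integer that round up gets wider.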

The ternary one’s error window (that I claimed was infinitesimal in my last post ;)) increases rapidly with smaller negative values, but it’s not as extreme:

Mth.floor(-20000.0009f) = -20000 (wrong)
Mth.floor(-20000.0010f) = -20001 (correct)

Riven · February 5, 2010, 11:33am

By changing BIG_ENOUGH_FLOAT to a double, it seems ‘good enough’ for any float input…


   private static final int    BIG_ENOUGH_INT   = 16*1024; // this is rather big...
   private static final double BIG_ENOUGH_FLOOR = BIG_ENOUGH_INT;
   private static final double BIG_ENOUGH_CEIL  = BIG_ENOUGH_INT + 0.5;

   public static int fastFloor(float x)
   {
      return (int) (x + BIG_ENOUGH_FLOOR) - BIG_ENOUGH_INT;
   }

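   // Note: adding 0.5 before truncating rounds to the nearest integer
   // (fastCeil(9.1f) gives 9), so this behaves as a round, not a true ceil.
   public static int fastCeil(float x)
   {
      return (int) (x + BIG_ENOUGH_CEIL) - BIG_ENOUGH_INT;
   }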


      System.out.println(fastFloor(+8.99995f)); // 8
      System.out.println(fastFloor(+9.00000f)); // 9
      System.out.println(fastFloor(+9.00005f)); // 9
      
      System.out.println(fastFloor(-8.99995f)); // -9
      System.out.println(fastFloor(-9.00000f)); // -9
      System.out.println(fastFloor(-9.00005f)); // -10

Performance with float BIG_ENOUGH: 96us
Performance with double BIG_ENOUGH: 146us

Takes ~50% longer (~33% slower), but still beats the crap out of StrictMath.floor() (1290us)
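A ceiling can also be derived from the floor through the identity ceil(x) = -floor(-x), which avoids choosing a separate offset constant. The sketch below assumes the double-offset `fastFloor` from this post; `DoubleOffsetFloor` and `ceilViaFloor` are hypothetical names and not something posted in the thread.

```java
// Ceil built on the double-offset floor; ceilViaFloor is an illustrative helper.
final class DoubleOffsetFloor {
    static final int    BIG_ENOUGH_INT   = 16 * 1024;
    static final double BIG_ENOUGH_FLOOR = BIG_ENOUGH_INT;

    // The float argument is widened to double, so the addition keeps far more precision.
    static int fastFloor(float x) {
        return (int) (x + BIG_ENOUGH_FLOOR) - BIG_ENOUGH_INT;
    }

    // ceil(x) == -floor(-x); valid for |x| < BIG_ENOUGH_INT.
    static int ceilViaFloor(float x) {
        return -fastFloor(-x);
    }

    public static void main(String[] args) {
        float[] samples = { -9.1f, -9.0f, 9.0f, 9.1f };
        for (float x : samples) {
            System.out.println(x + ": ceilViaFloor=" + ceilViaFloor(x)
                    + " Math.ceil=" + (int) Math.ceil(x));
        }
    }
}
```

Even with a double offset the result is "good enough" rather than exact: an input far smaller in magnitude than the double spacing near 16384, such as -1e-30f, is absorbed by the addition and floors to 0 instead of -1.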

pjt33 · February 5, 2010, 11:58am

Markus, the BIG_ENOUGH_INT in your example there is strongly negative, and it needs to be positive for the method to work even in theory and ignoring the mantissa size of floats.

Note that the AWT code is buggy. I filed a report and a fix in 2002*, but it’s too low priority. I suggest you patch it before committing to LWJGL.

http://bugs.sun.com/view_bug.do?bug_id=4759386

Riven · February 5, 2010, 12:11pm

This is way more accuracy than float can handle:

// just printing the values:
System.out.println(-20000.0009f); // -20000.0
System.out.println(-20000.0010f); // -20000.002

princec · February 5, 2010, 12:20pm

Good call on the hue bug.

Cas

Riven · February 5, 2010, 12:24pm

Why not use HSL? It’s so much better.

Markus_Persson · February 5, 2010, 12:57pm

I typed MIN_VALUE, didn’t I?

facepalm

Either way, the example with BIG_ENOUGH_INT at 1024 still shows the problem.

Riven · February 5, 2010, 1:28pm

With doubles as static fields and floats as parameters, the problem is solved.