Smart & Faster cos/sin (no lookup table)

public class MathX
{
     
    // Taylor coefficients for cos: f2k = (-1)^k / (2k)!
    private static double f2=-0.5;
    private static double f4=-f2/(3.0*4.0);
    private static double f6=-f4/(5.0*6.0);
    private static double f8=-f6/(7.0*8.0);
    private static double f10=-f8/(9.0*10.0);
    private static double f12=-f10/(11.0*12.0);
    private static double f14=-f12/(13.0*14.0);
    private static double f16=-f14/(15.0*16.0);
    private static double f18=-f16/(17.0*18.0);
    private static double f20=-f18/(19.0*20.0);
    private static double PI=Math.PI;
    private static double PI2=2.0*PI;
    private static double PI05=0.5*PI;
 
    /**
    * Compute and return the sine of its parameter using a Taylor series.
    * @param x angle in radians
    * @return sine value for the given parameter
    */
    public static double sin(double x)
    {
        return cos(x-PI05);
    }
 
    /**
    * Compute and return the cosine of its parameter using a Taylor series.
    * @param x angle in radians
    * @return cosine value for the given parameter
    */ 
    public static double cos(double x)
    {
        // cosine is even, so fold negative arguments
        if(x<0.0) x=-x;
        if(x<PI2) 
        {
            if(x<PI)
            {   
                // x in [0, PI): evaluate the even Taylor polynomial directly (Horner form)
                double x2=x*x;
                return 1.0+x2*(f2+x2*(f4+x2*(f6+x2*(f8+x2*(f10+x2*(f12+x2*(f14+x2*(f16+x2*(f18+x2*f20)))))))));
            }
            else
            {
                // x in [PI, 2*PI): use cos(x) = -cos(x - PI)
                x-=PI;
                double x2=x*x;
                return -(1.0+x2*(f2+x2*(f4+x2*(f6+x2*(f8+x2*(f10+x2*(f12+x2*(f14+x2*(f16+x2*(f18+x2*f20))))))))));
            }
        }
        // general case: reduce modulo 2*PI, then shift by PI so the argument lands in [-PI, PI)
        x%=PI2;
        x-=PI;
        double x2=x*x;
        return -(1.0+x2*(f2+x2*(f4+x2*(f6+x2*(f8+x2*(f10+x2*(f12+x2*(f14+x2*(f16+x2*(f18+x2*f20))))))))));
    }   
         
}

Power series are interesting in theory and for knowing the rate of convergence, but not really useful for direct implementation. Wikipedia has a basic list of sound methods here. It’s best to use a computer algebra system or a specialized DSL to construct approximations (without too much hair loss).

In this particular case (cos/sin) the implementation becomes very interesting in terms of speed: the final expression can be factorized a lot, so it requires few operations (a handful of multiplications and additions) and is computed very fast. Knowing the input range of values, one can sometimes even inline it; with enough factors it is still faster than the original Math.cos, without noticeable error.
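
A quick way to sanity-check both claims (accuracy against Math.cos and raw speed) is a throwaway harness like the sketch below; it assumes the MathX class above is on the classpath, and the naive timing loop is only an indication, not a proper micro-benchmark:

public class MathXCheck
{
    public static void main(String[] args)
    {
        // worst absolute difference against Math.cos over a few periods
        double maxErr = 0.0;
        for (double x = -20.0; x <= 20.0; x += 1e-4)
            maxErr = Math.max(maxErr, Math.abs(MathX.cos(x) - Math.cos(x)));
        System.out.println("max |MathX.cos - Math.cos| = " + maxErr);

        // rough timing; keep and print 'sum' so the JIT cannot discard the loops
        double sum = 0.0;
        long t0 = System.nanoTime();
        for (int i = 0; i < 10000000; i++) sum += MathX.cos(i * 1e-3);
        long t1 = System.nanoTime();
        for (int i = 0; i < 10000000; i++) sum += Math.cos(i * 1e-3);
        long t2 = System.nanoTime();
        System.out.println("MathX.cos: " + (t1 - t0) / 1e6 + " ms, Math.cos: " + (t2 - t1) / 1e6 + " ms (sum=" + sum + ")");
    }
}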

I’ve always used Mathcad (for so long that I am still using V5.0, and I just saw that nowadays it is up to V15!). It is very nice software and easy to use.

The big problems with truncated power series are:

  1. They are centered around a point (whereas something like Remez is fitted over the whole desired range)
  2. The constants are reals, which are rounded to floats, and these errors compound through the polynomial evaluation. Numerical methods take FP usage into account.

As a simple example, here’s code for a 5th-order Remez-based approximation vs. a 5th-order truncated power series:


  // On [-Pi/2, Pi/2]
  // max abs error = 0x1.588p-13     @ 0x1.6a91e2p0  (~.000164)
  // max rel error = 0x1.61c6558p-13 @ 0x1.86ec88p-1 (~.000169)
  // f(0)    = 0
  // f(Pi/2) = 1 - 0x1.38p-19 (~.999998)
  public static float sin_n0(float x)
  {
    float x2 = x*x;
    
    return x * (1 + x2 * (x2 * 0.00761f - 0.16605f));
  }
  
  // On [-Pi/2, Pi/2]
  // max abs error = (at Pi/2)
  // max rel error = (at Pi/2)
  // f(0)    = 0
  // f(Pi/2) = 0x1.01288ap0 (~1.004525)
  public static float sin_r0(float x)
  {
    float x2 = x*x;
    
    return x * (1 + x2 * (x2 * (1.f/120.f) - (1.f/6.f)));
  }

[quote]1) They are centered around a point (whereas something like Remez is fitted over the whole desired range)
[/quote]
yes, that’s why I mentioned that cosine is a special case: symmetrical and periodic (the reduced argument is never very far from the central point)
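
As a rough bound on that: the series in the first post is truncated after the x^20/20! term, and since it is an alternating series the truncation error is at most the first omitted term, so on the reduced range [-PI, PI] the worst case is about

    PI^22 / 22!  ≈  8.7e10 / 1.1e21  ≈  7.7e-11

which is why staying close to the center of expansion keeps the error far below anything noticeable (double rounding in the coefficients and in the Horner evaluation adds a little on top).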

[quote]2) The constants are reals, which are rounded to floats, and these errors compound through the polynomial evaluation. Numerical methods take FP usage into account.
[/quote]
hehe, that’s where double becomes your friend… and for some unknown reason you’re forcing float, WTF? 32 extra bits of precision are not negligible in such a computation, just try

NB: division is very slow compared to multiplication/addition/subtraction, so leaving the divisions in instead of using precomputed constants is not an option, for obvious speed reasons

EDIT : Let me explain …

static double f1 = 1.0/120.0;
static double f2 = 1.0/6.0;

// Taylor form with precomputed constants, in double
public static double sin_r0d(double x)
{
    double x2 = x*x;
    return x * (1.0 + x2 * (x2 * f1 - f2));
}

// same polynomial written naively with Math.pow and divisions
public static double sin_r0d2(double x)
{
    return x + Math.pow(x,5)/120.0 - Math.pow(x,3)/6.0;
}

// same polynomial with the divisions written inline as constant expressions
public static double sin_r0d3(double x)
{
    double x2 = x*x;
    return x * (1 + x2 * (x2 * (1.0/120.0) - (1.0/6.0)));
}
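
For reference, a minimal driver along these lines (a sketch, assuming sin_n0 and sin_r0 from the earlier post and the three methods above all sit in the same class) produces the output quoted below:

public static void main(String[] args)
{
    double x = Math.PI * 0.5;
    System.out.println("sin_n0(Math.PI*0.5)=" + sin_n0((float) x));
    System.out.println("sin_r0(Math.PI*0.5)=" + sin_r0((float) x));
    System.out.println("sin_r0d(Math.PI*0.5)=" + sin_r0d(x));
    System.out.println("sin_r0d2(Math.PI*0.5)=" + sin_r0d2(x));
    System.out.println("sin_r0d3(Math.PI*0.5)=" + sin_r0d3(x));
}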

Output :
sin_n0(Math.PI*0.5)=0.9999977
sin_r0(Math.PI*0.5)=1.0045248
sin_r0d(Math.PI*0.5)=1.0045248555348174
sin_r0d2(Math.PI*0.5)=1.0045248555348174
sin_r0d3(Math.PI*0.5)=1.0045248555348174

the error stays very low across all the double computations, and for your sample code all the double versions give identical results and are more accurate

I chose a low-order polynomial and 32-bit floats for comparison purposes. The same issues hold if you move to doubles. Unless my subtraction skills are on the fritz today, you’re helping me prove my point.

Your double version is off by ~0x1.29p-8 (~.0045).
My float version is off by 0x1.38p-19 (~.000003)

So I have over twice as many valid digits, even though I have fewer of them. :stuck_out_tongue:
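
For anyone wanting to reproduce those figures, a sketch along these lines should do it (assuming the sin_n0 and sin_r0d methods above are in scope; Double.toHexString prints the hex-float form used above):

public static void main(String[] args)
{
    double x  = Math.PI * 0.5;                                // reference: sin(PI/2) = 1
    double eN = Math.abs((double) sin_n0((float) x) - 1.0);   // float Remez version vs sin
    double eD = Math.abs(sin_r0d(x) - 1.0);                   // double Taylor version vs sin
    System.out.println("sin_n0  off by " + Double.toHexString(eN) + " (~" + eN + ")");
    System.out.println("sin_r0d off by " + Double.toHexString(eD) + " (~" + eD + ")");
}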

No need to prove anything, phew… I won’t have to type any code this time!

you’re measuring error against the wrong equation: we are computing x + (x^5)/120 - (x^3)/6, not sine; sine has nothing to do with it. It is only a coincidence that the error introduced by float’s low precision lands the result closer to sin, while getting farther from x + (x^5)/120 - (x^3)/6; the expected result is closer to 1.0045… than to 1.00
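
To put numbers on that, a small sketch (again assuming the methods above are in scope) that measures the float and double Taylor versions against the polynomial x + (x^5)/120 - (x^3)/6 evaluated in double at PI/2, rather than against Math.sin:

public static void main(String[] args)
{
    double x    = Math.PI * 0.5;
    // the polynomial itself, evaluated in double: ~1.0045248555...
    double poly = x + Math.pow(x, 5)/120.0 - Math.pow(x, 3)/6.0;
    System.out.println("poly value               = " + poly);
    System.out.println("sin_r0  (float)  vs poly = " + Math.abs((double) sin_r0((float) x) - poly));
    System.out.println("sin_r0d (double) vs poly = " + Math.abs(sin_r0d(x) - poly));
}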

Can you two get a room? :point:

For a lot of cases an error margin of 0.004 is perfectly acceptable, as we’re probably not able to see the cumulative divergence any time soon, and once it’s there, it might actually not be that important for the gameplay. It’s always a tradeoff: if you want accuracy, simply use Math.sin(…) directly.

Hehe: Wrong error (points to the title of the thread).

@Riven. Room? You have no idea of domestic train/plane prices.

The complexity (assuming same precision) is the same, so why not get more “bang for your buck”??