JOML 1.8.0 Release

To note: it does work pretty well, with a maximum absolute error of <= 1.97E-8 within the range -800PI…+800PI (and beyond) compared to java.lang.Math.sin(), and it is faster. That’s all I care about.

Like I said. They have no clue what they’re talking about.

@theagentd: You would want to walk the other way… https://en.wikipedia.org/wiki/Horner's_method

Is that a graph of the error compared to Math.sin of the different approximations ‘order-9, order-11’, etc?
What is the ‘n’ on the y-axis?

junk = the approximation from the linked thread; order-9, order-11 and order-15 are 9th, 11th and 15th degree polynomial approximations from the software package I provided a link to. These were designed to minimize relative error and were not required to be correct at sin(+/- pi/2) or to meet any other constraints. Call each of the approximations p(x); the plot is then the abs. error expressed as p(x)-sin(x), where sin was computed at 165 bits. The ‘n’ is nano. I should have changed that output but it didn’t seem worth the time. So the 15th degree poly from the linked thread has lower error than a proper 9th degree poly for about 1/2 the domain. By all other error measures it’s worse: peak, average, RMS. Average error is directly related to the unsigned area under the error curve.
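For anyone who wants to reproduce the peak / average / RMS comparison on their own approximation, here is a minimal sketch. Assumptions: the polynomial is a plain degree-7 Taylor series used as a stand-in (not one of the plotted polynomials), the reference is Math.sin rather than a 165-bit evaluation, and the class and method names are made up for the example.

```java
import java.util.function.DoubleUnaryOperator;

public class ErrorMetricsSketch {
    /** Stand-in approximation (an assumption): degree-7 Taylor polynomial of sin, Horner form. */
    public static double taylor7(double x) {
        double x2 = x * x;
        double p = -1.0 / 5040.0;
        p = p * x2 + 1.0 / 120.0;
        p = p * x2 - 1.0 / 6.0;
        return x + x * x2 * p;
    }

    /** Returns {peak, average, rms} of |p(x) - sin(x)| over n+1 evenly spaced samples in [lo, hi]. */
    public static double[] metrics(DoubleUnaryOperator p, double lo, double hi, int n) {
        double peak = 0, sum = 0, sumSq = 0;
        for (int i = 0; i <= n; i++) {
            double x = lo + (hi - lo) * i / n;
            double e = Math.abs(p.applyAsDouble(x) - Math.sin(x));
            peak = Math.max(peak, e);
            sum += e;       // average error ~ unsigned area under the error curve / width
            sumSq += e * e;
        }
        return new double[] { peak, sum / (n + 1), Math.sqrt(sumSq / (n + 1)) };
    }

    public static void main(String[] args) {
        double[] m = metrics(ErrorMetricsSketch::taylor7, -Math.PI / 2, Math.PI / 2, 100_000);
        System.out.println("peak=" + m[0] + " avg=" + m[1] + " rms=" + m[2]);
    }
}
```

Uniform sampling only estimates the three numbers; it is enough to rank approximations against each other, not to certify an error bound.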

Thanks, interesting to read about that software package Sollya. It looks difficult to use for non-mathematicians, and I couldn’t see any ready-made sin approximation function there. I assume that you used it to make one yourself somehow for those nth degree polynomials?

Yeah I just did a REPL session.

I tossed together a script example: http://pastebin.java-gaming.org/aee29618e4d11
The output of the script looks like this: http://pastebin.java-gaming.org/ee2917e8d411b

Impressive, thank you. I’ll try to run your code and do a test to see how much faster this approximation is.
By the way, what is your job or study area?

No…I just need to approximate a function once in a blue moon.

And since I forgot above, here are the Robin Green papers. These assume minimal background in any of the topics.

Use Horner’s form: fewer instructions issued and more accurate. If you wanted more speed (call-site dependent) you could convert to Estrin’s method at a very small reduction in accuracy. Rewritten in Horner’s form:


private static double sin_theagentd_arith(double x)
{
  double xi = Math.floor((x + Math.PI/4.0) * 1.0/Math.PI);
  double x_ = x - xi * Math.PI;
  double sign = ((int)xi & 1) * -2 + 1;
  double x2 = x_ * x_;

// double sin = x_;
// double tx = x_ * x2;
// sin += tx * c1; tx *= x2;
// sin += tx * c2; tx *= x2;
// sin += tx * c3; tx *= x2;
// sin += tx * c4; tx *= x2;
// sin += tx * c5; tx *= x2;
// sin += tx * c6; tx *= x2;
// sin += tx * c7;
// return sign * sin;

  double sin;
  
  x_  = sign*x_;
  sin =          c7;
  sin = sin*x2 + c6;
  sin = sin*x2 + c5;
  sin = sin*x2 + c4;
  sin = sin*x2 + c3;
  sin = sin*x2 + c2;
  sin = sin*x2 + c1;
  return x_ + x_*x2*sin;
}
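For comparison, here is a rough sketch of what the Estrin variant of the same seven-coefficient polynomial could look like. Assumptions: the coefficients are plain Taylor terms standing in for c1..c7 (the real values are not shown in this thread), range reduction is left out so the input should already be within about +/- PI/2, and the class name is made up.

```java
public class EstrinSketch {
    // Placeholder coefficients (an assumption): Taylor terms (-1)^k / (2k+1)!,
    // NOT the c1..c7 values used in the thread.
    static final double C1 = -1.0 / 6.0;
    static final double C2 =  1.0 / 120.0;
    static final double C3 = -1.0 / 5040.0;
    static final double C4 =  1.0 / 362880.0;
    static final double C5 = -1.0 / 39916800.0;
    static final double C6 =  1.0 / 6227020800.0;
    static final double C7 = -1.0 / 1307674368000.0;

    /** Horner: one long dependency chain of multiply-adds. */
    public static double horner(double x) {
        double z = x * x;
        double q = C7;
        q = q * z + C6;
        q = q * z + C5;
        q = q * z + C4;
        q = q * z + C3;
        q = q * z + C2;
        q = q * z + C1;
        return x + x * z * q;
    }

    /** Estrin: pairs of terms are independent, so they can overlap in the pipeline. */
    public static double estrin(double x) {
        double z  = x * x;
        double z2 = z * z;
        double z4 = z2 * z2;
        double a = C1 + C2 * z;   // c1 + c2 z
        double b = C3 + C4 * z;   // c3 + c4 z
        double c = C5 + C6 * z;   // c5 + c6 z
        double q = (a + b * z2) + (c + C7 * z2) * z4;
        return x + x * z * q;
    }

    public static void main(String[] args) {
        double x = 1.0;
        System.out.println(horner(x) + " " + estrin(x) + " " + Math.sin(x));
    }
}
```

Estrin does a few more multiplies in total but shortens the dependency chain, which is why any gain depends on the call site; the different order of operations is also where the small accuracy difference comes from.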

Tested Roquen’s version for error:

TheAgentD:
1.2978632316690104E-16
Roquen:
1.2978632799710485E-16

Precision is mostly unaffected. I’ll leave the benchmarking to KaiHH. =P


private static void brutex()
{
  float  v  = .1f;  // walk every float from 0.1 up to PI
  double e  = 0.0;  // largest abs. error seen so far
  double ev = v;    // input at which that error occurred
  do {
    double r1 = sin_theagentd_arith(v);
    double r2 = Math.sin(v);
    double d  = Math.abs(r1-r2);
    
    if (d > e) {
      e  = d;
      ev = v;
    }
    
    v = Math.nextUp(v);
  } while(v <= Math.PI);
  System.out.println(e + " @ " + Double.toHexString(ev) + " :" + sin_theagentd_arith(ev) + " " + Math.sin(ev));
}

5.881720110956223E-9 @ 0x1.2d97c6p1 :0.7071069396761204 0.7071069455578405
5.881719777889316E-9 @ 0x1.2d97c6p1 :0.7071069396761207 0.7071069455578405

Since there are no comments: doesn’t anyone notice anything “odd” about these results?

Is it that the error ‘e’ actually got smaller on the second print rather than bigger, so 5.881719777889316E-9 < 5.881720110956223E-9 ?

Thinking two things really: the actual error is more than theagentd found AND they are both at the same input value.
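One possible reading of that shared input, which is not spelled out in the thread: 0x1.2d97c6p1 is the largest float below 3*PI/4, and with the PI/4 offset used in sin_theagentd_arith that is the last value before the reduction wraps, so the polynomial is evaluated at its largest |x_|. The sketch below only reproduces the reduction step; the PI/2 variant is an assumption about what a symmetric reduction would look like, and the class and method names are made up.

```java
public class RangeReductionSketch {
    /** Reduced argument with the PI/4 offset, as in sin_theagentd_arith: lands in [-PI/4, 3PI/4). */
    public static double reduceQuarter(double x) {
        double xi = Math.floor((x + Math.PI / 4.0) / Math.PI);
        return x - xi * Math.PI;
    }

    /** Reduced argument with a PI/2 offset (an assumption): lands in [-PI/2, PI/2). */
    public static double reduceHalf(double x) {
        double xi = Math.floor((x + Math.PI / 2.0) / Math.PI);
        return x - xi * Math.PI;
    }

    public static void main(String[] args) {
        float v = 0x1.2d97c6p1f;     // worst-case input reported by brutex()
        float next = Math.nextUp(v); // first float past the wrap point
        System.out.println("PI/4 offset: " + reduceQuarter(v) + " then " + reduceQuarter(next));
        System.out.println("PI/2 offset: " + reduceHalf(v) + " then " + reduceHalf(next));
    }
}
```

With the PI/4 offset the reduced value sits near 3*PI/4 for the first input and jumps to near -PI/4 for the next float; with the PI/2 offset both land near -PI/4, well inside the interval.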

Hi

KaiHH, I’m almost sure it won’t compile with Java 1.9:

Have you ever tried to compile JOML with an early build of Java 1.9?

Yes, I did. It compiles fine on 1.9 Build 140. Java 9 will not remove sun.misc.Unsafe. Probably 10 will.
And even if it did, JOML could keep producing its CI builds on JDK8 for a very long time, and the MemUtilUnsafe class will not be loaded at runtime if sun.misc.Unsafe is detected to be unavailable. So, running JOML on any Java 1.4 compatible JVM is fine, even if it does not support sun.misc.Unsafe.
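To illustrate the kind of runtime check meant here, a generic sketch follows. This is not JOML's actual detection code; the class and method names are hypothetical and it only shows the usual reflection-based probe.

```java
public class UnsafeProbeSketch {
    /** Returns true if sun.misc.Unsafe can be loaded on this JVM (generic probe, not JOML's code). */
    public static boolean unsafeAvailable() {
        try {
            Class.forName("sun.misc.Unsafe");
            return true;
        } catch (Throwable t) {
            // ClassNotFoundException, SecurityException, LinkageError, ...
            return false;
        }
    }

    public static void main(String[] args) {
        // A library would pick its Unsafe-based or plain implementation based on this flag.
        System.out.println("sun.misc.Unsafe available: " + unsafeAvailable());
    }
}
```

Because the Unsafe-based class is only referenced after the probe succeeds, a JVM without sun.misc.Unsafe never tries to load it.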

http://pastebin.java-gaming.org/e291e8d814b10

`
func   @                         : rel-error                      : ulp-dif  bits
orig   @ 2.356194 (0x1.2d97c6p1) : 5.881720e-09 (0x1.9430508p-28) : 52977825 27
horner @ 2.356194 (0x1.2d97c6p1) : 5.881720e-09 (0x1.94304fp-28)  : 52977822 27
range  @ 1.570795 (0x1.921f9cp0) : 6.017409e-12 (0x1.a77p-38)     : 54200    37
sollya @ 1.274948 (0x1.4662fap0) : 4.440892e-16 (0x1.0p-51)       : 4        50

reduce @ 0.651724 (0x1.4daecp-1) : 3.338118e-09 (0x1.cac992p-29)  : 30067090 28
`

Very interesting.

I made some small additions to Roquen’s code to do a little micro-benchmark of the different Math.sin approximation methods referenced in this and other threads between zero and +Math.PI. Thanks to Roquen, Kai, theagentd and Riven for contributing their different fast sin methods.

Here are the micro-benchmark results:


Math.sin nanos per operation: 92.64692352
original nanos per operation: 15.45626018
horner nanos per operation: 12.8026165
range nanos per operation: 9.38279252
newk nanos per operation: 9.22229814
sin_9 nanos per operation: 7.35209479
sin_theagentd_lookup nanos per operation: 9.73315193
sinHalf nanos per operation: 7.10601857
sinFull nanos per operation: 6.19737146

I have little confidence in my own benchmarking abilities, but I did notice that these results are similar to Riven’s given here (http://www.java-gaming.org/topics/are-sin-cos-lookup-tables-still-relevant-today-with-regards-to-performance/29853/msg/274983/view.html#msg274983). Riven’s sinHalf and sinFull methods take about 7% of the time needed for the Math.sin function in his test and in mine here.
Riven’s lookup table methods (sinHalf and sinFull) are the fastest. Roquen’s interesting sin_9 method is only marginally slower than sinHalf but I suspect it has far greater accuracy. The speed of Roquen’s newk method given its level of accuracy is pretty incredible.

I wanted to include the error table that Roquen showed, with Riven’s methods added, but I did something wrong when measuring the error of Riven’s lookup tables (it came out negative), so I won’t post that error output until I have time to investigate.

Here is the code that I threw together to do these benchmarks. Apologies for the poor organisation. You may have to run it with the -Xmx6000M VM option to avoid running out of memory or else change the line “int numTestValues = 100000000;” to something smaller.

http://www.java-gaming.org/?action=pastebin&id=1489

A big thank you to Roquen, Kai, theagentd and Riven for their code.

I forgot to note in the pastebin code that brute force error checking isn’t needed. For polynomial methods, say, you can compute where the error should be zero and verify that the sign of the error does change within a couple of ULPs of those points. Likewise, compute where the error should be at a maximum and find the peaks there. This is much faster and gives exact error measures.
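A small sketch of that idea, under loud assumptions: the "approximation" is just the chord (2/PI)*x of sin on [0, PI/2], chosen only because its error zero (x = PI/2) and error peak (x = acos(2/PI)) can be written down by hand. For a real minimax polynomial those points would come from the tool that generated it. Inputs are stepped in float ULPs, the reference is Math.sin, and all names are made up.

```java
public class ErrorProbeSketch {
    /** Stand-in approximation (an assumption): the chord of sin on [0, PI/2]. */
    public static double approx(double x) {
        return (2.0 / Math.PI) * x;
    }

    /** Signed error against Math.sin. */
    public static double error(double x) {
        return approx(x) - Math.sin(x);
    }

    /** True if the error takes both signs within 'ulps' float steps either side of x0. */
    public static boolean signChangesNear(float x0, int ulps) {
        float x = x0;
        for (int i = 0; i < ulps; i++) x = Math.nextDown(x);
        boolean neg = false, pos = false;
        for (int i = 0; i <= 2 * ulps; i++) {
            double e = error(x);
            if (e < 0) neg = true;
            if (e > 0) pos = true;
            x = Math.nextUp(x);
        }
        return neg && pos;
    }

    /** Largest |error| within 'ulps' float steps either side of x0. */
    public static double peakNear(float x0, int ulps) {
        float x = x0;
        for (int i = 0; i < ulps; i++) x = Math.nextDown(x);
        double peak = 0;
        for (int i = 0; i <= 2 * ulps; i++) {
            peak = Math.max(peak, Math.abs(error(x)));
            x = Math.nextUp(x);
        }
        return peak;
    }

    public static void main(String[] args) {
        float zero = (float) (Math.PI / 2);               // error should cross zero here
        float peak = (float) Math.acos(2.0 / Math.PI);    // error derivative is zero here
        System.out.println("sign change near zero: " + signChangesNear(zero, 2));
        System.out.println("peak |error|: " + peakNear(peak, 4));
    }
}
```

Each check touches a handful of floats instead of every float in the range, which is where the speed-up over a brute force scan comes from.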

EDIT: Oh and that code/output should be printing ‘abs error’ and not ‘rel error’.

Is it possible to make an atan2 approximation using the Sollya method you used to make this great sin approximation?

I see that atan2 is not continuous and has vertical asymptotes so would an approximation using this polynomial-fitting method be very good?