What are those missing optimizations, then? ???
And what do you mean by “conversion” or “set-up” time? ???
(dragging information out of your nose again )
The id paper is doing two forward trig…one is enough. It’ll be in my derivation when I complete it (the info is in recent posts). Conversion and/or set-up time is the application’s problem…you don’t have to worry about it.
(EDIT: The split the id version is using is odd as well. It’s so small it’d be better to remove it and add a small number to prevent the NaN)
I’m on vacation, and the laptop I was borrowing had its charger block cable bit off by a rabbit. If I find a new one tomorrow I’ll see what I can do. What exactly are you interested in? Timings of different slerp implementations?
This is true, and most of the time my animations would hit the fast nlerp path of LibGDX’s implementation. However, there is one big exception: animation transitions. WSW supports smoothly interpolating between two arbitrary animation frames. We use that to ensure smooth transitions between idle and walk animations, and during those smooth constant velocity motion is preferable.
I think Roquen was describing the same thing, but it should be possible to implement a binary search nlerp that calculates the midpoint between the two quaternions using nlerp repeatedly until the angle is small enough to produce accurate results with nlerp. Even a 179 degree slerp should be solvable with 5 passes, resulting in a ~12 degree lerp at the end, each iteration only requiring 4 additions and a normalization. I will try that tomorrow.
Yeah the angles, but evidence that small angles are naturally occurring to be more precise. I’ll note that significantly increasing the data set of animation data isn’t very important since the data’s going to move into cache at each processing step.
Also as mention earlier you could just integrate instead (if fixed time step & constant av per data point)
q = dq * q
but attempt to extend to second order isn’t worthwhile. Updating q as well dq when moving to the next data point is a good idea (compounding errors on q)
“However, there is one big exception:” This is what I was calling pose blending much earlier, and yes you can divide and conquer that problem.
Another option is to use a hybrid method (q0=start, q1=end):
lerp(q0, q1, f(t))
or:
result = s0(t)q0 + s1(t)q1.
The angular error between slerp and lerp for all angle rapid approaches a constant shaped function (just varies in scale).
I can get why people get annoyed when I’m terse…but in addition to the points I made above about the reader’s level of interest: There’s always too fucking much to be said.
The most important thing about all of this kind of nonsense boils down to how much you care about being able to match your desired angular velocity.
A slight cleaning pass on the raw data. Folded negative and positive (they represent the same bin) and reversed order to show from the smallest angle to largest.
the first three values are: 0.48974302, 0.77403533, 0.8223283. This means ~49% are in first bin, ~77.4 are in first two, ~82.2 in the first three, etc.
Huuuhhhhh… is that how it works??? Could you explain why? Not using proofs and weird mathematical annotation, but intuition?
The reason why it confuses me is that the dot product returns the cosine of the angle, between the quaternions, right? In that case, if dot=0, then the quaternions are “perpendicular” to each other, there’s a 90 degree angle between them. Going further to dot<0 would just mean that the angle increases even more, with dot=-1 the orientations pointing away from each other? How could those two be equal?
Googled a bit. So if I negate all quaternion components I end up with the same rotation as before. Does that not mean that the nlerp of JOML is flawed as it most certainly will take the “long” way around if one of the arguments is negated like that?
Reference implementation: recursive nlerp with fixed 128 iterations. Should be numerically correct. Way more precision than doubles can handle.
JOML: latest implementation from GitHub.
LibGDX: LibGDX (possibly outdated implementation) converted to use doubles.
nlerp: fixed(?) nlerp from JOML (note the “t1=” line).
Recursive nlerp: while(abs(dot) > 0.5) replace q1 or q2 with nlerp(q1, q2, 0.5) depending on alpha and update alpha. Finally nlerp(q1, q2, alpha).
Error is calculating as toDegrees(acos(abs(referenceSlerpResult.dot(algorithmSlerpResult)))), e.g. how many degrees the slerp result differs from the reference slerp implementation’s result.
As implemented (not the math):
slerp( a,b) = slerp( a, -b) = r
slerp(-a,b) = slerp(-a,-b) = -r
the results r and -r represent the same rotation.
dot(a,b) = dot(-a,-b) = d
dot(a,-b) = dot(-a,b) = -d
if the dot product is negative you’re really asking for the largest angle about the implied between the two input, but implementation swap it the to smallest automagically.
Note the error is 3D is going to be 2*acos(abs(dot(ref,approx))).
Let me try to answer your question in a different way. Most slerps do this (implicitly):
slerp(a,b,t) { d=dot(a, b); if (d<0) {d=-d; b=-b;} XXXX }
We have input a & b and assuming b != a && b != -a, then there is a plane that contains a,b and the origin. We define ‘a’ to be our reference direction so we turn the plane so it points straight to the right (becomes the X axis). We now define the Y axis to be the direction orthogonal to X that contains b. This means that ‘b’ is by definition above X (is positive in Y). If ‘b’ were in the second quadrant (say 3pi/4), we’d get a negative dot product which means it’s the longer path…-b is closer to X (-pi/4). So the entire domain is on [0,pi/2]. Negating ‘b’ will give us the opposite direction for Y in the example and in that plane it’s angle (with respect to X) is pi/4. Flipping the plane about it’s X axis and looking at the opposite side.
Or another way from the example. Negate a, now our new ‘a’ is point straight to the left. We flip about the Y axis and it’s point to the right again and 'b is in the first quadrant.
In this example in 3D the angle is pi/2 or 90 degrees, so if the dot is negative we’re asking the math for a 270 degree rotation.
btw I love Roquen’s detailing of stuff, and always stare intently at all of his cryptic messages… though I barely understand one symbol in 10. I stopped “doing” maths at around age 16 - it’s totally over my head. He is a treasure trove of computer science knowledge. We must pickle his brain before it is too late and extract what we can.
Awesome intuitive explanation, Roquen!
But there is one thing that I still do not quite understand, namely the fact that the maximum rotation angle is only PI/2. My current intuition tells me that it should be PI.
Because:…
If a quaternion were to actually represent a rotation about let’s say 110 degrees. First of all, this is possible, right? I mean, I can define a quaternion (possibly from a rotation matrix) that does represent a rotation about the Z axis of 110 degrees.
Now, my second quaternion is the identity and thus does not represent any rotation. How would it be then that I can just have 110 - 90 = 20 degrees of rotation in slerp? I mean, I would still have to interpolate a rotation about the whole 110 degrees arc. Or don’t I?
Yes, that’s what would be reasonable. I have however the impression that people just don’t read JavaDocs.
So, I would name the method accordingly long and print on sysout and syserr for the first time a warning, whether the user really wanted to call this method.
But yes, people should first express the need for approximation functions, then we can add them.
Since you provided some number of benchmarks with different threshold values of the dot product to compromise speed vs. accuracy, would it be good for you to have an additional overload (or differently named) slerp that provides that threshold as parameter? I think that was initially proposed by @Roquen earlier.
EDIT: Okay, I just implemented the overload of slerp() taking the “threshold” parameter of the dot product between ‘this’ and ‘target’ below which the method performs non-spherical linear interpolation as per nlerp():
The “old” existing method without that parameter uses the previous 1E-6 threshold, and makes it clear in the JavaDocs.
This now feels alot better, because it is made explicit that the method does this anyways and gives the user a lever to control accuracy vs. performance.
You are mistaken. I did not test the JOML slerp method with different thresholds. I tested my own recursive nlerp function which nlerps the source quaternions until they are above a certain dot treshold, at which point a simple nlerp is done between them. Each iteration halves the angle between the quaternions, so it can do it pretty quickly, averaging 1.5 to 4 nlerps per call for sub-degree precision.
Modifying the threshold of JOML’s slerp is not a good idea, as the accuracy test of LibGDX’s version shows. Not normalizing when falling back to lerp has a massive precision impact.
You should keep the original slerp with a fixed threshold, as it is a good reference implementation with excellent accuracy. Reducing the threshold without normalizing the result has a big precision impact, and normalizing the result costs some extra cycles.
For nlerp(), it’s good to have an overloaded version that accepts a precomputed dot-product, as that value is often available when working with nlerp().
Add rnlerp(), recursive nlerp(), which takes in threshold value and refines the result until the final nlerp is done between two quaternions over the given threshold.
Source code:
public Quaternion rlerp(Quaternion q, double alpha, double dotThreshold){
//Both these are eliminated by escape analysis
Quaternion q1 = new Quaternion().set(this); //This is faster than using <this> as temporary variable
Quaternion q2 = new Quaternion().set(q);
double dot = q1.dot(q2);
while(dot < dotThreshold && dot > -dotThreshold){
if(alpha < 0.5){
q2.nlerp(q1, 0.5, dot);
alpha = alpha * 2;
}else{
q1.nlerp(q2, 0.5, dot);
alpha = alpha * 2 - 1;
}
dot = q1.dot(q2);
}
return set(q1).nlerp(q2, alpha, dot);
}
You could replace the temporary quaternions with local variables. Using this as a temporary variable is slower as they’re writes to RAM and can’t be moved to registers, and we can’t (and shouldn’t) modify the input quaternion. This one generates no garbage as escape analysis replaces them with stack variables.
EDIT:
This recursive nlerp function is faster than slerp() up to threshold 0.99992, at which point it has almost exactly the same precision as slerp():
slerp(): average = 2.057235254503539E-7, max = 2.4148365394514667E-6
rlerp(): average = 3.688761597702983E-7, max = 2.4148365394514667E-6
I vote for completely replacing slerp() with rlerp(), as it is just a more tweakable version of slerp().
EDIT 2:
In addition, those performance values are when using completely random quaternions. As you can see from the data from WSW, almost all interpolation is done between dot>0.99, in which case rlerp() is just nlerp(). With less random quaternions rlerp() is 2.5x faster than slerp() while providing the same max error.