Optimized Interpolation Set?

Hia, for various purposes I’ve created an Interpolation Set class. A picture explains better than words, so let met start with that:

http://cal007300.student.utwente.nl/vincent/Interpolation.jpg

The Interpolation set contains a series of key/value floats. What I want to be able to do is to query the set for any key-value in the interpolation and get a corresponding result. So, in the example above, the data set contains the following key-value pairs:

[tr]
[td]Key[/td]
[td]Value[/td]
[/tr]
[tr]
[td]0[/td]
[td]160[/td]
[/tr]
[tr]
[td]0.4[/td]
[td]160[/td]
[/tr]
[tr]
[td]0.625[/td]
[td]98[/td]
[/tr]
[tr]
[td]1.0[/td]
[td]55[/td]
[/tr]

When I query this set for key-value 0.5, I want the set to return the interpolated value.

  • In this case, 0.5 is in between keys 0.4 and 0.625. To calculate the interpolated value
  • First I determine the distance between the keys. The distance between the keys in the set is (0.625-0.4) = 0.225.
  • The distance between the queried key and the lowest-nearest key (the first key to the left of the queried key…) is 0.5 - 0.4 = 0.1.
  • 0.1 is 44.44% (0.1 / 0.255) of the way of the distance between the two known keys.
  • So finally we have to calculate what 44.44% of the way of the distance between the values is. The distance between the values is 98-160 = -62
  • … 44% of -62 = -27.55.
  • The final value and the value I wanted to get from the set is the leftmost value (160) plus the interpolated value (-27.55) so finally : 160 + -27.55 = 132.45.

Now the question for this forum is: what would be an (the most?) optimized way of approaching this problem? attached to this message is one way of doing this. But I have the gut feeling that it can be done in a better way.

When quering for a key lower than the lowest key in the set, the set should just return the first value. In case you query for a key higher than the highest key in the set, just return the last value.

So, if the nearest key/value to the left is (x1,y1) and the nearest key/value to the right is (x2,y2) you want to calculate the weighted average, i.e.


y1 * (x2-x) + y2 * (x - x1)
--------------------------------
          x2 - x1

Something like that?

EDIT: where x is the query value

Yes, indeed.


160 * (0.625-0.5) + 98 * (0.5 - 0.4)
----------------------------------------------   = 132.45
0.625-0.4

Now the main problem is not actually this final step. The main problem is finding x1 and x2, when given x, in the set. It is not actually a problem per-se, but I need the most optimized way of setting up the data structure to contain the interpolation and to quickly find x1/x2.

Well you are already finding the right index by recursion. That method will be the bottleneck here. I have the feeling there are a too much checks. Then you enter the recursive method, your are checking if the values are between v[0] and v[v.length-1]. You do this check every recursion. That is not needed, as you’ve checked this before, and are only calling the method if you have valid input. So IMHO, without running the code, I’d say you can get rid of 2 if-statements.

And ehm… your normalize-methods don’t make much sense :slight_smile:

You have to find the min/max values, then


foreach(value)
   value = (value-min) / (max-min);

Your current code just scales them, not transforming them into the range 0…1 like the documentation says.

that’s on purpose. but perhaps I shouldn’t call it ‘normalize’… :wink:
basically I use it to scale things like color information. so, I might have an interpolation containing the values 124,2,54,230 for example. I scale them to 0.0f - 1.0f , knowing that the maximum value is 255.0f (which is not in the set). A true normalize method would pick 230 as the maximum value (and take into account the minumum value, 2), because that is the maximum value present in the set.

Ah, you’re working on that particle-engine.

You know, when you have less than like 8 keys/values (I don’t know the threshold), it’s actually much faster to do everything in a plain loop.


int index = -1;
for(...)
   if(searchVal >= v[i])
      index = i;
      break;

Yeah, I thought about that. :-\

However, I’ve been researching Binary Search Trees today as an alternative to the arrays of floats. And my new version has a 600% increase in speed. ;D

To be specific, I’ve implemented Splay Trees .

[quote]A splay tree is a self-balancing binary search tree with the additional unusual property that recently accessed elements are quick to access again. It performs basic operations such as insertion, look-up and removal in O(log(n)) amortized time. For many non-uniform sequences of operations, splay trees perform better than other search trees, even when the specific pattern of the sequence is unknown. The splay tree was invented by Daniel Sleator and Robert Tarjan.
[/quote]
http://en.wikipedia.org/wiki/Splay_tree

When I’ve finalized my testing and cleaned up & commented the code a bit, I will post the splaying version to this thread.