Fast/Efficient method to filter a numeric Value

Hi there.

I’ve come upon a situation where I need to filter a numeric value (of any type) so it falls into a predetermined set of output values.

For example, let’s say we have an integer whose range is 0-200:


final int MIN = 0;
final int MAX = 200;

And we want to, given an input in said range, filter it out so we get one of three values:


final int HIGH = 200;
final int MED = 100;
final int LOW = 0;

So, we would have a filter method doing something like this:


int filter(int input) { ... }

filter(10); 	// Output: 0
filter(30); 	// Output: 0
filter(110); // Output: 100
filter(180); // Output: 200

In other words:


int filter(int input)
{
	if(input < MAX/3) return LOW;

	if(input < 2*(MAX/3)) return MED;
   
	return HIGH;
}

So, my question would be… What is the most efficient way to do this type of operation? This code is meant to be executed many (as in once per pixel at least) times per frame. Is there some bit-shifting trickery that could be used? Some arithmetic magic?

Currently, my generalized (using doubles as the value type) implementation is as follows:


public static double filterValue(double value, int slices, double MIN, double MAX)
{
	if(value <= MIN) return MIN;
	if(value >= MAX) return MAX;
		
	double increment = MAX / slices;
		
	for(int i = 1; i < slices; i++)
	{
		if(value - (i*increment) < MIN)
		{
			return (i*increment);
		}
	}
	
	return MAX;
}

To me, it looks like there are way too many ifs, multiplications and divisions in there. :persecutioncomplex:

Comments like “don’t use doubles you dolt, do it with byte-shifted integers!” are the type of ideas I am looking for.

how about dividing “value” by “increment” and rounding it to the next integer?

e.g.

MAX = 200
value = 150
slices = 3
increment = MAX/slices = 200/3 = 66.67

result = value/increment = 150/66.67 = 2.25 (round to next integer = 3)

So you know that “value” of 150 belongs to the 3rd slice… something like that?

If your ‘slices’ are all the same size (steps of 100 in your example) then you could use simple integer maths to do a sort of stepped linear interpolation, something along these lines:


	static double filter( double value, int slices, double max, double min ) {
		final int range = (int) ( max - min );
		final int step = range / slices;
		final int scale = range / ( slices - 1 );
		return ( (int) value / step ) * scale;
	}

where ‘slices’ would be 3.

Note that the brackets and casts are important for this to work.

If your steps/slices are not the same size then you’ll have to use some sort of table approach.

Also, encapsulate the code into a class and pre-calculate the ‘fixed’ stuff as class members, i.e. work out the range, step and scale values once in the constructor to avoid having to calculate them on every call of the filter() method.

@StrideColossus: What is the purpose of the scale variable in your formula?

@Oskuro: In what format do you need your result to be?

static int filter( double value, int slices, double max, double min ) {
      final int range = (int) ( max - min );
      final int step = range / slices;
      return ( (int) value / step ) + 1;
   }

I understood your problem to be how to determine if a value should be a part of set #1 or set #2, #3… so on and so forth, so I felt like returning an int would suffice.

If you implement the filter method as you suggested (to return 1, 2, 3, etc) then ‘scale’ is redundant.

I thought the OP was asking for the results to be in the range 0/100/200. You could use your approach (without the +1 at the end) and multiply the result by 100, the results would be the same I think.

Yeah I think OP should clarify what output he’s looking for :slight_smile: But either way, I think what we both presented should cover it.

I’m looking for a generic approach, given that I might need different output types in different parts of the program. Just wondering if there was some smart way to do these calculations.

[quote=“StrideColossus,post:4,topic:41829”]
That is of course how it will end up in the final code. The generic approach I posted is there just to make it easier to understand what the general case of my question would look like (heck, could be generalized more by using a generic variable type).

As for the intervals (slices), no, they don’t necessarily have to be of equal size.

This is an (specific, for 4 equal intervals, and a range of 0.0 to 1.0) example of how I am actually using this in my code:


private static final double Filter_MIN 	= 0.0f;
private static final double Filter_MAX 	= 1.0f;
private static final double Filter_MEDLOW = 1.0f / 4;
private static final double Filter_MED 	= 1.0f / 2;
private static final double Filter_MEDHIGH = 3*(1.0f / 4);
public static double Filter_4(double value)
{
	if(value <= Filter_MIN) 		return Filter_MIN;
	if(value <= Filter_MEDLOW) 	return Filter_MEDLOW;
	if(value <= Filter_MED) 		return Filter_MED;
	if(value <= Filter_MEDHIGH) 	return Filter_MEDHIGH;
		
	return Filter_MAX;
}

Note that, in this case, the output values match the threshold values for each step, but this doesn’t need to be the case.

When you say generic do you mean Java generic? i.e. You’d want the same code but for integer, float, double, etc. as required?

If the intervals are not the same size then I think the answer is no, the approach of iterating through the intervals seems the simplest (and the most efficient).

I think the OP doesn’t mean Java generic, but ‘works for each primitive type’. In that case I’d just write the same implementation multiple times. (optimized for [icode]int, long, float, double, short[/icode] (but who needs short anyways ^^) )

Not to sidetrack the discussion, but I wouldn’t necessarily call this filtering. Whenever I have heard the word filter used in programming it has been to drop items from a collection according to some predicate. What you have is more of a map. You have an input set of values that maps to a smaller set of output values.

If you search java map range of values in Google, some interesting things come up. If the number of “buckets” is fairly small and there are no “holes” then the simplest would be to store two arrays. One which has the max range of each bucket, the other which has the output value.

Then you could do something like:


public float getValue(float inputValue) {

	// Return the value of the highest cell if above max range
	if (inputValue >= maxValue) return values[values.length-1];

	// Return the value of the lowest cell
	if (inputValue <= minValue) return values[0];

	for (int i=0; i < size; i++) {
		if (inputValue > limits[i]) return values[i];
	}

	throw Exception("Oops");
}

This doesn’t take advantage of equally spaced slices but it should be plenty fast. If you have a large number of buckets, it may be worth while doing some sort of binary search type set up to reduce the number of comparisons.

Generally speaking, I’m pretty sure that any conditional statement is going to take a lot longer to evaluate than a series of mathematical ops. So some version of @StrideColossus’s code but with the precalculated values seems the best to me.

The one thing I would say is that when dealing with problems of this type, it is best to take another look at whether you actually need it. Sometimes I get so worked up with solving a problem efficiently that I don’t realize that I caused it in the first place.


int filter(int input)
{
   int values[] = {0,100,200};
   return values[input/100];
}

Actually, I had to do something like this when I was coding a game. Though it is not really a filter, but it is more like deciding a certain output when you hit a range of values.

In my case, my range was from 0 - 1000 and my output values I wanted was 100, 200, 400, 800, etc. So, to generate a range, I actually used a one line for loop.


int i;//The index of the for loop
int j;//The values you want to get from a specific range
int rangeValue;//The value chosen for the range
for(i = 100, j = 0; i <= rangeValue; i*=2, j++);
System.out.println("For "+rangeValue+" : the output is "+j+".");

I did achieve similar results using the modulus operator “%” as well. But, let me take a whack at trying to get your code working efficiently as you would want in the original post.


public int filter( int input, int increment ){
       for(int i = 0, j = 0; i <= input; i+=increment, j++);
       return j*increment;
}

In this case, the j serves no purpose. But, with a bit of editing, you can have it work as an index number for an array, or anything else that you need to split the ranges up. It is a lot more processing work, but it is more flexible and gives you more control. (If not, there is always using modulus :stuck_out_tongue: ).

Premature optimization…? :persecutioncomplex:

@Everyone: Calling it a filter because it reminded me of an electrical input filter… I really suck at remembering the “proper” names of algorithms and data structures.

Indeed. Just specified that because maybe there are some tricks that only work with specific primitive types, and that’s something I want to know.

I’m really liking pitbuller’s suggestion, by the way, simple, rather elegant, and only performs a single arithmetic operation. It won’t work with differently sized intervals, though, but since the objective is to make an specific implementation for the task, rather than have a generic method, it can be tweaked.

Rewrote my current code like this:


private static final double filter_interval = 1.0f/4;
private static final double[] filter_values = { filter_interval*0, filter_interval*1, filter_interval*2, filter_interval*3, filter_interval*4};
	
public static double NotReallyAFilter(double value)
{
	return filter_values[(int)(value/filter_interval)];
}

Seems to work fine.

Sorry for being such an overoptimizer tonight but… I’m guessing the compiler is already doing this, but since the value array is static, I could simply inline the array indexing and get rid of the method calling overhead too. Hmmmm.

Oh, and thanks to everyone for the comments :slight_smile:

Or like this

public static final double MIN = 0
public static final double filter_interval = 1.0/4

public static double filterValue(double value) {
    return MIN + filter_interval * (int)((value - MIN)/filter_interval)
}

Yes, the JIT will do that.

If you really want to over-optimise the fully generic case, you should look at a custom class loader which generates a balanced decision tree… :wink:

[quote=“Oskuro,post:16,topic:41829”]
Seriously; this is the very definition of premature optimization. It leads to shitty code and poor programming practices. If you apply this kind of thought to all aspects of your game/software, it will be a bitch to debug later down the line, will take you several times longer to develop, and ultimately will perform no better.

“Method overhead” is extremely negligible. If you benchmark and find that the algorithm needs optimization, it will most likely not do anything to inline your methods except further obfuscate your program.

Are you using OpenGL? Why aren’t you using shaders? Are you using PBOs to allow for async transfer of pixel data? etc. These are the optimizations you should be considering. (If you’re using Java2D; then maybe you’re using the wrong library for the job.)

I know, hence why I mentioned I know it’s overkill.

Then again, my question is about how this operation, specifically, can be optimized, so I don’t think it is out of place, this being the performance sub-forum and all.