Floating Point Versus Fixed Point

Not so long ago I made a statement about how I found using fixed point arithmetic to be faster when doing linear interpolation of audio in Java. This has been bugging me ever since, as in the floating-point code I hadn’t really tried to do anything fast. So I’ve conducted a little test to see if floating point arithmetic really is slower.

Here’s the code I used. It’s a simple class that generates a sine wave at the specified frequency (specified as input samples per output sample) by linearly interpolating a 1024-entry wavetable. It’s not the most efficient algorithm, as there is an end-of-waveform check in the inner loop which could be hoisted with some clever arithmetic (see the sketch after the listing), but it’s relatively easy to understand. There are implementations of the algorithm for 15-bit fixed point, 32-bit float, and 64-bit double types.

I had stated that you need a modulo-divide to get the fractional part when using floating point arithmetic. Actually this is not true: you can just subtract the integer part instead.

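For example, both of these give the fractional part of phase; the subtraction (which is what the code below uses) avoids the division:

	double phase = 123.75;
	double frac_mod = phase % 1.0;            /* modulo-divide: 0.75 */
	double frac_sub = phase - ( int ) phase;  /* subtract the integer part: 0.75 */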

public class Interpolator {
	private static final int
		FP_SHIFT = 15,
		FP_ONE = 1 << FP_SHIFT,
		FP_MASK = FP_ONE - 1,
		BUF_LEN = 1024,
		FP_BUF_LEN = BUF_LEN << FP_SHIFT;

	private short[] wav_buf;

	public Interpolator() {
		int idx;
		/* Initialise a sine wave as the input.*/
		wav_buf = new short[ BUF_LEN + 1 ];
		for( idx = 0; idx < BUF_LEN; idx++ ) {
			wav_buf[ idx ] = ( short ) ( Math.sin( Math.PI * 2 * idx / BUF_LEN ) * 32767 );
		}
		wav_buf[ BUF_LEN ] = wav_buf[ 0 ]; /* Guard entry so wav_buf[ i + 1 ] wraps round to the start. */
	}

	public double interpolate_double( int[] mix_buf, int length, double step, double phase ) {
		int idx, i;
		double c, m, x;
		for( idx = 0; idx < length; idx++ ) {
			i = ( int ) phase;
			c = wav_buf[ i ];
			m = wav_buf[ i + 1 ] - c;
			x = phase - i;
			mix_buf[ idx ] += ( int ) ( m * x + c );
			phase += step;
			if( phase >= BUF_LEN ) {
				phase -= BUF_LEN;
			}
		}
		return phase; 
	}

	public float interpolate_float( int[] mix_buf, int length, float step, float phase ) {
		int idx, i;
		float c, m, x;
		for( idx = 0; idx < length; idx++ ) {
			i = ( int ) phase;
			c = wav_buf[ i ];
			m = wav_buf[ i + 1 ] - c;
			x = phase - i;
			mix_buf[ idx ] += ( int ) ( m * x + c );
			phase += step;
			if( phase >= BUF_LEN ) {
				phase -= BUF_LEN;
			}
		}
		return phase; 
	}

	public int interpolate_int( int[] mix_buf, int length, int step, int phase ) {
		int idx, i, c, m, x;
		for( idx = 0; idx < length; idx++ ) {
			i = phase >> FP_SHIFT;
			c = wav_buf[ i ];
			m = wav_buf[ i + 1 ] - c;
			x = phase & FP_MASK;
			mix_buf[ idx ] += ( m * x >> FP_SHIFT ) + c;
			phase += step;
			if( phase >= FP_BUF_LEN ) {
				phase -= FP_BUF_LEN;
			}
		}
		return phase;
	}

	public static void main( String[] args ) throws Exception {
		int idx, num_cycles;
		int phase_i, pi_i;
		float phase_f, pi_f;
		double phase_d;
		long time;
		Interpolator interpolator;
		int[] mix_buf;

		num_cycles = 100000;

		interpolator = new Interpolator();
		mix_buf = new int[ BUF_LEN ];
		phase_d = phase_f = phase_i = 0;
		pi_i = ( int ) ( Math.PI * FP_ONE );
		pi_f = ( float ) Math.PI;

		time = System.currentTimeMillis();
		for( idx = 0; idx < num_cycles; idx++ ) {
			phase_i = interpolator.interpolate_int( mix_buf, BUF_LEN, pi_i, phase_i );
		}
		System.out.println( "INT: " + ( System.currentTimeMillis() - time ) );

		time = System.currentTimeMillis();
		for( idx = 0; idx < num_cycles; idx++ ) {
			phase_f = interpolator.interpolate_float( mix_buf, BUF_LEN, pi_f, phase_f );
		}
		System.out.println( "FLOAT: " + ( System.currentTimeMillis() - time ) );

		time = System.currentTimeMillis();
		for( idx = 0; idx < num_cycles; idx++ ) {
			phase_d = interpolator.interpolate_double( mix_buf, BUF_LEN, Math.PI, phase_d );
		}
		System.out.println( "DOUBLE: " + ( System.currentTimeMillis() - time ) );
	}
}

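As an aside, here’s roughly what I meant by hoisting the end-of-waveform check out of the inner loop. This is a sketch only (the method name is just illustrative, it assumes step is at least 1 and less than FP_BUF_LEN like the version above, and it isn’t included in the timings below):

	public int interpolate_int_hoisted( int[] mix_buf, int length, int step, int phase ) {
		int idx = 0;
		while( idx < length ) {
			/* How many samples can be produced before phase reaches FP_BUF_LEN. */
			int run = ( FP_BUF_LEN - phase + step - 1 ) / step;
			if( run > length - idx ) {
				run = length - idx;
			}
			int end = idx + run;
			while( idx < end ) {
				/* No wrap check needed inside this run. */
				int i = phase >> FP_SHIFT;
				int c = wav_buf[ i ];
				int m = wav_buf[ i + 1 ] - c;
				int x = phase & FP_MASK;
				mix_buf[ idx++ ] += ( m * x >> FP_SHIFT ) + c;
				phase += step;
			}
			if( phase >= FP_BUF_LEN ) {
				phase -= FP_BUF_LEN;
			}
		}
		return phase;
	}
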
And here are the results from my 1.5GHz AMD Sempron (basically an Athlon XP with the same amount of cache as an Athlon Thunderbird) using Java 1.6.0. Values are in milliseconds:

INT: 1453
FLOAT: 2641
DOUBLE: 5922

So it appears that, in Java at least, integer arithmetic is faster than floating point arithmetic on my machine. I’m a bit confused as to why 32-bit floating point is so much faster than 64-bit. I only counted 6 FP values, which should all fit in the eight x87 registers. Perhaps there is still some unavoidable register swapping that could account for it. Or maybe there are still improvements to be made to the JVM’s floating point support :slight_smile:

Even so, the slowest result is still about 6 seconds for over 100 MILLION samples of output (100,000 cycles × 1024 samples). That’s enough for over 300 waveforms generated in real-time at 48kHz. There are roughly 10 operations in the inner loop, and at around 17 million samples per second that works out to be about 170 megaflops. Some way off the numbers given for my machine (SiSoft Sandra suggests 2700 megaflops), but for real-world performance I’ve got no complaints :slight_smile:

Any thoughts?

Cheers,
Martin

Because the waveform was a sine wave in this test I thought it might be useful to see whether Math.sin() was a practical alternative, so I added the following code, which ought to be equivalent. The phase check is not really necessary (certainly not inside the loop), but in the interest of fairness I left it in:


	private static final double
		RECIP_BUF_LEN = 1.0 / BUF_LEN;

	public double sine_wave( int[] mix_buf, int length, double step, double phase ) {
		int idx;
		double two_pi;
		two_pi = Math.PI * 2;
		step *= two_pi * RECIP_BUF_LEN;
		for( idx = 0; idx < length; idx++ ) {
			mix_buf[ idx ] += ( int ) ( Math.sin( phase ) * 32767 );
			phase += step;
			if( phase > two_pi ) {
				phase -= two_pi;
			}
		}
		return phase;
	}

And the results:


INT: 1485
FLOAT: 2594
DOUBLE: 5812
SINE: 20125

So use a sine table :slight_smile:

[quote]So use a sine table :slight_smile:
[/quote]
heh, yeah I could’ve told you that without the test :wink:
I’m actually surprised using fixed point math and a table is ‘only’ about 13 times faster ;D

A lot of it has to do with converting ints to floats and floats to ints. For example, if wav_buf and mix_buf were both float[] arrays, the interpolate_float() function would be faster.

Here are the results on my machine - 2.16 GHz Intel Core 2 Duo
Mac OS X, Java 5:
INT: 931
FLOAT: 2642
DOUBLE: 2649

Windows XP, Java 6 (under Parallels):
INT: 671
FLOAT: 771
DOUBLE: 831

I don’t know why the results are so different.

Edit: Results on Mac OS X + Java 5 with the -server switch:
INT: 678
FLOAT: 1081
DOUBLE: 1185

Results on Windows + Java 6 with the -server switch:
INT: 631
FLOAT: 1101
DOUBLE: 1212

I think that the OSX Java is built by Apple and not Sun, which is why there are speed differences. But when you’ve delved into hardware-limited platforms these common speed practices become second nature: look-up tables and integer maths (and not losing accuracy from premature shifting) soon come naturally; you almost have to unlearn and rethink maths.

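To illustrate what I mean by premature shifting, here’s a contrived example reusing the FP_SHIFT/FP_ONE constants from the code above (the long cast is only there to avoid overflow in this example):

	int a = ( 3 << FP_SHIFT ) + ( FP_ONE >> 1 );            /* 3.5 in fixed point */
	int b = ( 1 << FP_SHIFT ) + ( FP_ONE >> 1 );            /* 1.5 in fixed point */
	int good = ( int ) ( ( ( long ) a * b ) >> FP_SHIFT );  /* 5.25 - full precision kept */
	int bad  = ( a >> FP_SHIFT ) * b;                        /* 4.5 - fraction of a thrown away */
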
You may also find that it makes sense to have sin/cos look-up tables with power-of-two lengths; degrees and radians are of no use to you at that point, and when you think about it you’ll be much better off that way (see the sketch below). If you are really thinking about speed then you may also want to consider cache behaviour; with audio processing you’re doing the same operation over and over again on constantly changing data. You may find that 512KB sized buffers work best!

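Something like this is what I mean by a power-of-two table - a rough sketch only, reusing FP_SHIFT from the code above for the fractional phase bits (SINE_TABLE and sine_lookup are just illustrative names); the wrap becomes a mask and no radians or degrees are involved:

	private static final int SINE_BITS = 10, SINE_LEN = 1 << SINE_BITS; /* power-of-two length */
	private static final short[] SINE_TABLE = new short[ SINE_LEN ];
	static {
		for( int i = 0; i < SINE_LEN; i++ ) {
			SINE_TABLE[ i ] = ( short ) ( Math.sin( Math.PI * 2 * i / SINE_LEN ) * 32767 );
		}
	}

	/* phase is a fixed point table index with FP_SHIFT fractional bits. */
	public static int sine_lookup( int phase ) {
		return SINE_TABLE[ ( phase >> FP_SHIFT ) & ( SINE_LEN - 1 ) ]; /* wrap with a mask */
	}
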
On modern CPUs I think fixed point math doesn’t make that much sense anymore.

[quote]I think that the OSX Java is built by Apple and not Sun which is why there are speed differences;
[/quote]
Hm, why would ‘made by Apple instead of Sun’ imply lower performance?

I’ll look into the cost of float/int conversion. The Core 2 result is interesting. My guess is that Java is using SSE2 on that platform.

EDIT:

It’s just occurred to me that this could explain the difference in performance between float and double on my machine. Since my chip only has SSE1 (which only handles single-precision floating point), the JVM might be using that for the float method, but falling back to x87 for the double method. This would also affect the performance of Math.sin().

[quote]My guess is that Java is using SSE2 on that platform.
[/quote]
You’re probably right (or SSE3, or whatever else). Also, looking at the results from using the -server switch, I would guess that some optimizations previously found in Java 5’s HotSpot server VM got added to the HotSpot client VM in Java 6.

OK, so I’ve added the following method (and generated a float array called wav_buf_float[] ):


	public float interpolate_float2( float[] mix_buf, int length, float step, float phase ) {
		int idx, i;
		float c, m, x;
		for( idx = 0; idx < length; idx++ ) {
			i = ( int ) phase;
			c = wav_buf_float[ i ];
			m = wav_buf_float[ i + 1 ] - c;
			x = phase - i;
			mix_buf[ idx ] += m * x + c;
			phase += step;
			if( phase >= BUF_LEN ) {
				phase -= BUF_LEN;
			}
		}
		return phase; 
	}

The results are as follows (FLOAT2 is the result for the float-array method):


INT: 1516
FLOAT: 2547
FLOAT2: 2406
DOUBLE: 6078
SINE: 20703

So there is an extra cost to float/int conversion, but it only amounts to 5% or so. It’s nice to know that for the basic operations floating point is not far off the performance of integer code.

Yeah, although if you don’t mind the limited accuracy, fixed point can sometimes provide some extra performance.

I really should be working, but for whatever reason I can’t stop myself from being interested in this stuff :slight_smile:

So I tried a third float interpolate method, which uses fixed point for step & phase, but floats for the buffers. The only int-to-float conversion is setting x:

	public int interpolate_float3( float[] mix_buf_float, int length, int step, int phase ) {
		for( int idx = 0; idx < length; idx++ ) {
			int i = phase >> FP_SHIFT;
			float c = wav_buf_float[ i ];
			float m = wav_buf_float[ i + 1 ] - c;
			float x = phase & FP_MASK;
			mix_buf_float[ idx ] += ( m * x / FP_ONE + c );
			phase += step;
			if( phase >= FP_BUF_LEN ) {
				phase -= FP_BUF_LEN;
			}
		}
		return phase; 
	}

Mac Java 5 results:
INT: 953
FLOAT: 2667
FLOAT2: 1692
FLOAT3: 1108
DOUBLE: 2676

Windows Java 6 results:
INT: 660
FLOAT: 732
FLOAT2: 811
FLOAT3: 661
DOUBLE: 821

I wasn’t implying that Apple’s implementation must be slower, just stating that it is a different implementation altogether.

And although you find that additions/multiplications may be faster with integers, I find that floating point divides are much quicker on my system (Athlon 64).

 // multiply by reciprocal instead of dividing

public int interpolate_float4( float[] mix_buf_float, int length, int step, int phase ) {
	float fp_one_rec = 1f / FP_ONE;	// reciprocal of FP_ONE
	for( int idx = 0; idx < length; idx++ ) {
		int i = phase >> FP_SHIFT;
		float c = wav_buf_float[ i ];
		float m = wav_buf_float[ i + 1 ] - c;
		float x = phase & FP_MASK;
		mix_buf_float[ idx ] += ( m * x * fp_one_rec + c );
		phase += step;
		if( phase >= FP_BUF_LEN ) {
			phase -= FP_BUF_LEN;
		}
	}
	return phase; 
}

Windows Java 6 results (Athlon 64 3200+):
INT: 781
FLOAT 1: 2235
FLOAT 2: 1390
FLOAT 3: 1328
FLOAT 4: 1157
DOUBLE: 2000

I think the main bottleneck with interpolate_float2 is the float-to-int conversion. I’m not sure if it’s a HotSpot issue, an Intel issue, or just an issue in general.

I tried interpolate_float4 - didn’t see an improvement on my Intel machine, but it’s interesting to see the speed up on AMD.

Edit: Seriously, I’ve got typo-fever today.

What’s striking is the difference in the differences (if you get what I mean). Floats are much slower on my processor, whereas the integer results show little difference between the two machines. Having said that, are you running a dual-core processor?

Perhaps on Intel chips. On my machine the conversion has a cost, but it’s relatively small. I found I could actually get more performance by using a float mix_buf[], but keeping the short wav_buf[]. Perhaps the memory bandwidth reduction made up for the difference. Another possibility is that int->float is fast, but float->int is slower.

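For reference, the inner loop of that variant looks something like this - a sketch only, keeping the short wav_buf[] but accumulating into a float mix_buf[], using the fixed point phase from interpolate_float3 (the method name is just for illustration):

	public int interpolate_mixed( float[] mix_buf_float, int length, int step, int phase ) {
		for( int idx = 0; idx < length; idx++ ) {
			int i = phase >> FP_SHIFT;
			float c = wav_buf[ i ];            /* short -> float */
			float m = wav_buf[ i + 1 ] - c;    /* short -> float */
			float x = phase & FP_MASK;         /* int -> float */
			mix_buf_float[ idx ] += ( m * x / FP_ONE + c );
			phase += step;
			if( phase >= FP_BUF_LEN ) {
				phase -= FP_BUF_LEN;
			}
		}
		return phase;
	}
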
EDIT: Sorry, I wasn’t reading your post carefully enough. I think you made the same point :slight_smile:

Yeah, I’ve used that method in some previous code. Didn’t benchmark it, but I did assume it would be faster than indexing through a floating point value.

I can confirm though that truncating-and-subtracting to get the fractional part is a lot faster than modulo-division.

Well naturally, bearing in mind that modulo-division requires a division!