premature optimization, etc

EnderGT · December 5, 2008, 4:34pm

I know this is premature optimization, but the thought occurred to me…

The question is: is it more efficient to have 20 or so single-letter int variables (on the stack), or one int array? Does the array reference cost anything significant that would negate the benefit of removing the 20 or so variable names? In my game, the array(s) will be referenced quite heavily.

Riven · December 5, 2008, 4:44pm

It depends.

Local variables can be placed in registers, which are extremely fast.
An int[] lives on the heap, so even in optimal conditions it sits in the CPU cache, which is much slower than registers.

There is also the tradeoff of maintainability.
Make it work.
Make it stable.
Make it fast.
Make it ugly, yet 2% faster.

Last but not least ‘single-letter int variables’ are as fast as ‘three pages long int variables’.

EnderGT · December 5, 2008, 4:47pm

In the case of the game I’m working on right now, speed is unimportant, but the final size is going to be critical. I have so much going on, I’m going to need to squeeze every bit.

I’m just curious if using one int[] would be less bytecode than ~20 individual int variables.

Riven · December 5, 2008, 5:11pm

I didn’t realize this was the 4K forum… My bad!

Even in that case, 1-character int names are no smaller than fully named local variables.

A simple multiplication function shows what the generated bytecode is like.

When using an int[20], you’d have:
[aload_x + iconst_x + iaload] or
[aload_x + bipush + byte + iaload] or
[aload + byte + bipush + byte + iaload] to access an element

When using 20 ints, you’d have:
[iload_x] or
[iload + byte] to access a variable

My guess is that using ints is a bit smaller, until you need to do math on them, where you’d otherwise use a loop.
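For anyone who wants to see the difference directly, here is a minimal sketch of the kind of multiplication function mentioned above. The class and method names are made up for illustration. Compile it with javac and run `javap -c LocalsVsArray` to inspect the instructions your own compiler generates; the shapes noted in the comments are what javac typically emits, not a guarantee.

```java
public class LocalsVsArray {
    // Two plain int locals. In a static method they occupy slots 0 and 1,
    // so each read is typically a single 1-byte iload_0 / iload_1.
    static int multiplyLocals() {
        int a = 6;
        int b = 7;
        return a * b;
    }

    // The same values held in an int[]. Each read typically needs the array
    // reference (aload_0), the index (iconst_0 / iconst_1) and iaload.
    static int multiplyArray() {
        int[] v = new int[20];
        v[0] = 6;
        v[1] = 7;
        return v[0] * v[1];
    }

    public static void main(String[] args) {
        System.out.println(multiplyLocals());
        System.out.println(multiplyArray());
    }
}
```

Comparing the two `Code:` listings in the javap output shows how many bytes each access costs.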

EnderGT · December 5, 2008, 5:24pm

Bear with me, but the “common knowledge” is to use 1-character names rather than fully named variables, as it results in smaller, more compressible code. Is this untrue?

That being said, my question is really about using 20 int variables vs 1 int[20]. I’ve never dealt with the bytecode before (a deficiency I should rectify one of these days), so I’m not really sure what I’m seeing in your illustration.

Riven · December 5, 2008, 5:31pm

class names and field names are stored in bytecode, so making them shorter results in smaller bytecode

local variable names are not stored in bytecode, so you can make them as long as you want
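A small sketch of one way to check this, using hypothetical names: reflection can read a field's name back at runtime, which is only possible because the name is stored in the class file. There is no equivalent for locals; their names survive only as optional debug information (for example when compiling with `javac -g`).

```java
import java.lang.reflect.Field;

public class StoredNames {
    // This name is written into the class file, so its length costs bytes.
    static int aDeliberatelyLongFieldName = 1;

    static String storedFieldName() {
        // This name is not needed at runtime; the local is just a numbered slot.
        int aDeliberatelyLongLocalName = 2;
        for (Field f : StoredNames.class.getDeclaredFields()) {
            // Skip any compiler-generated fields and return the declared one.
            if (!f.isSynthetic()) {
                return f.getName();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(storedFieldName());
    }
}
```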

EnderGT · December 5, 2008, 5:37pm

And this just reinforces that I need to start learning about the bytecode.

Thanks for the info!

Abuse · December 5, 2008, 8:11pm

Though naming is irrelevant if you are passing your classes through an obfuscator & optimiser, as everyone in this competition should be!

The general purpose “iload x” instruction (rather than the special case “iload_[0-3]” shortcuts) does not take its parameter (the index of the local variable) from the operand stack; it takes it as an unsigned byte operand that follows the opcode.

Therefore (in the normal cases), getting a local variable onto the operand stack is either a 1-byte (e.g. iload_0) or a 2-byte (e.g. iload 4) instruction.

Accessing an array element is more complex.
At best it will take 3 bytes, as three 1-byte instructions (the array is stored in one of the first 4 local variables, and you are explicitly accessing an array element from 0 to 5, e.g. “aload_0; iconst_0; iaload”).
Typically however it will be much more than this.

Note, in the above I used the qualifier ‘in the normal cases’.
If you have more than 256 live local variables, the compiler will begin using the ‘wide’ instruction.
This precedes the iload instruction, making the iload instruction expect 2 bytes for the local variable index rather than just 1.
Therefore, accessing local variables beyond the 256th becomes a 4 byte instruction!
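The byte counts above can be sketched as a small calculator. This is only an illustration with made-up method names, assuming int locals (long and double take two slots each) and a constant array index no larger than 32767; it adds up instruction sizes and does not inspect real class files.

```java
public class LoadSizes {
    // Bytes needed to push local variable `slot` (iload/aload family).
    static int localLoadSize(int slot) {
        if (slot < 0 || slot > 65535) {
            throw new IllegalArgumentException("slot out of range");
        }
        if (slot <= 3) return 1;    // iload_0 .. iload_3
        if (slot <= 255) return 2;  // iload + 1-byte index
        return 4;                   // wide + iload + 2-byte index
    }

    // Bytes needed to push a constant array index.
    static int indexPushSize(int index) {
        if (index < 0 || index > 32767) {
            throw new IllegalArgumentException("index outside this sketch");
        }
        if (index <= 5) return 1;   // iconst_0 .. iconst_5
        if (index <= 127) return 2; // bipush + byte
        return 3;                   // sipush + 2 bytes
    }

    // Bytes needed to read array[index]: array reference, index, iaload.
    static int arrayElementLoadSize(int arraySlot, int index) {
        return localLoadSize(arraySlot) + indexPushSize(index) + 1;
    }

    public static void main(String[] args) {
        System.out.println(localLoadSize(19));           // local in slot 19
        System.out.println(arrayElementLoadSize(0, 19)); // element 19, array in slot 0
    }
}
```

For 20 values, the worst case for a plain local is 2 bytes per read, while reading the last array element costs at least 4.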

Typically the compiler will shuffle local variables around, so as to make best use of the smaller instructions.
(or is it proguard that does this? I can never remember exactly what optimisations javac performs)

EnderGT · December 5, 2008, 8:18pm

It’s not the actual names I was worried about, it was the “space” required for multiple variable names, versus the “space” required for a single variable name and then array accesses.

Thanks to you and Riven for the excellent answers. I won’t bother with the array(s) unless I really feel it necessary.

Riven · December 5, 2008, 9:19pm

You have to keep in mind that there are no named local variables in bytecode. Local variables can even ‘share’ the same position on the stack.

Even if you have 300 local variables, but you only declare and discard 4 variables at a time, the compiler can keep all variables within iload_[0…3] (one after another - in time)
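A minimal sketch of that slot sharing, with hypothetical names. The two inner blocks declare different variables, but since a and b are out of scope before c and d are declared, the compiler is free to put c and d in the same slots. Running `javap -c SlotReuse` should show what your compiler actually does.

```java
public class SlotReuse {
    static int sumInStages(int n) {
        int total = 0;          // n and total stay live for the whole method
        {
            int a = n + 1;      // a and b only live inside this block
            int b = a * 2;
            total += b;
        }
        {
            int c = n + 3;      // c and d can reuse the slots a and b occupied
            int d = c * 2;
            total += d;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumInStages(1));
    }
}
```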