Trade and Space Sim (aka Elite4k): Some questions

Hi,

I’ve been working on Elite4k, a trade and space sim, since December. The game is basically a clone of evergreens like Elite and Privateer: You fly your spaceship from planet to planet, trade goods, buy upgrades, hunt pirates, mine asteroids or turn pirate yourself … I think you all get the picture :slight_smile: I’ve almost finished the game now, with a size of around 4.8k and enough optimization potential remaining (hopefuly :slight_smile: to reach 4k. However, some issues turned up during development which keep bothering me. I would be grateful for any input on the following questions:

Text rendering I’m currently using drawBytes to draw text from a resource byte array to a graphics context without string overhead. Is there a smarter way to do this?

Number rendering I’m currently using a method which fills a byte array from a given number by setting the current modulo-10-value of the numeric value to the current byte, then decimal shifting the number and advancing the output byte. Is there a smarter way to do this? Perhaps some light-weight number format method in the API I don’t know about?

Matrix inversion For the 3D space sim, the inverse camera matrix is required. Two ways to get it: Matrix inversion (my unrolled version is 12 lines with lots of multiplies and adds) or premultiplication with the negated matrix components. Which way to go? Premultiplication eliminates the inversion code but requires some control flow in the geometry pipeline. Any experiences?

Loading and Saving For a game where you can easily spend dozens of hours, a load and save game function is a must. The amount of game data I have to load and save is around 20 bytes. One option is to write out a text representation of the values and let the user note it down and reenter it to load a game (old-skool console style). However, 20 bytes mapped to easily typable characters is a lot of text. So I consider using the clipboard: Copy to get the current game state as a string, paste the string (which you saved in some file) to resume. Does this make sense? I’m a bit in doubt

Array Access When accessing array cells in approximately linear (depending on control flow) sequences, is it better to have something like “data[basepointer+1] (…) data[basepointer+2] (…) data[basepointer+27] (…)” or is it better to go unary where possible “data[pointer++] (…) data[pointer++] (…)”?

Thanks for your input

Wolfgang

“I’m a ****ing starship, I’m allowed to cheat!”
GCU Arbitrary, Culture Craft

For numbers perhaps this might be smaller?

g.drawBytes(Integer.toString(val).getBytes())

no idea tho…

Loading and saving: using base 64, 20 bytes is 27 characters. That’s not much longer than the passwords in some games I’ve played, and it’s a bit more intuitive for a non-technical user than putting stuff in the system clipboard.

Array access: data[pointer++] repeated 27 times will compress a lot better than data[pointer + 1]…

Sounds very impressive!
Hope it lives up to expectations :smiley:

  1. I’m pretty sure drawBytes is the cheapest way of drawing text
  2. I think you’re right again, the overhead of the necessary api method would be prohibitively expensive
  3. sorry can’t help there.
  4. I’d bite the bullet & require the user to write down the code - I think you are going to be extremely strapped for bytes, making the usability hit an acceptable loss.
  5. It depends.

It boils down to :-


pointer++
iload x  // 1 or 2
iinc	x, 1 // 3 bytes 

Vs


pointer+1
iload x // 1 or 2 bytes
iconst_y / bipush y / sipush y // 1/2/3 bytes (or more if a full int is needed)
iadd // 1 byte

Given the iload instruction used will be the same in both instances, they can be compared solely on the cost of the other instructions, giving a comparison of 3 bytes Vs 2+ bytes.

Ofcourse, this is all irrelevant, as ‘pointer++’ will undoubtably compress far better, so should always be used.

:edit:

it’s worth noting that

a[counter++] = 1;

Requires exactly the same bytecode (though in a slightly different order) as :-

a[counter] = 1;
counter+=1;

Which in turn is the same size as :-

a[counter] = 1;
counter+=2;

This is worth knowing, as people might expect ‘pointer++’ to generates smaller bytecode than ‘pointer+=127’

This is what I found:
Clipboard: Toolkit.getDefaultToolkit().getSystemClipboard().setContents(new StringSelection(""), null); ==> adds 150 bytes
Dialog: JOptionPane.showInputDialog(this, null, “”); ==> adds 61 bytes

It’d be smaller still if you used the variant without the ‘parentComponent’ parameter.

Or even smaller if you just draw it to the screen, since OP is already using methods to draw text to the screen.

Invaluable! Most of the rendering and game logic codes does little more than add and mul based on pointers. This will help a lot!

Thanks

Wolfgang

Thanks for the input on the issue! Fortunately, neither the input nor the rendering of the save-game-string is the problem, I’ll probably end up well below the mentioned byte counts by integrating it into the existing event logic.

From the responses so far, I conclude that a save-game-string of, say, 30, alphanumeric chars wouldnt be a problem (correct if wrong).

Thanks

Wolfgang

I tested this but (a) it compresses a lot worse than the solution I use now and (b) it does not support some required features. I need to draw tabular displays, with right-aligned numbers and a unit sign at the end of each number. Here is what I use now, input welcome:



    /**
     * Formats a number into a byte array which can be rendered
     * as a string.
     *
     * @param numVal Numeric value to write
     * @param chrUni Sign to append to value
     * @returns Byte array of formatted numeric data
     */
    private byte[] f(int numVal, int chrUni)
    {
        // Define result
        byte[] resVal=new byte[8];
        // Write unit sign
        resVal[7]=(byte)chrUni;
        // Write values
        for(int digPos=6;digPos>=0;digPos--)
        {
            if(numVal>0)
                resVal[digPos]=(byte)(48+numVal%10);
            else
                resVal[digPos]=(byte)32;
            numVal/=10;
        }
        // Write zero
        if(resVal[6]==32) resVal[6]=48;
        // Return result
        return resVal;
    }

Thanks

Wolfgang

Without spending any time actually testing things, one probable-optimisation which jumps at me is moving the “Write zero” up before the “Write values” and skipping the test. I.e. replace

        // Write values
        for(int digPos=6;digPos>=0;digPos--)
        {
            if(numVal>0)
                resVal[digPos]=(byte)(48+numVal%10);
            else
                resVal[digPos]=(byte)32;
            numVal/=10;
        }
        // Write zero
        if(resVal[6]==32) resVal[6]=48;

with

        // Write zero
        resVal[6]=48;
        // Write values
        for(int digPos=6;digPos>=0;digPos--)
        {
            if(numVal>0)
                resVal[digPos]=(byte)(48+numVal%10);
            else
                resVal[digPos]=(byte)32;
            numVal/=10;
        }

That saves a few bytes in the uncompressed version.

You are completely right. Will work and save a byte or two.

Thanks

Wolfgang

note - pjt33 replied while I was typing, so there maybe some duplication :wink:

Only 2 optimisations I can see missing are to use the ternary operator & include the final if statement inside the loop.
Both of these eliminate the duplicate array access instructions, which are rather large.
Together. i’d guestimate around 10 byte saving?

Forgive the tampering with your literals - embedding non-trivial literals is one of my pet hates ;D


    /**
     * Formats a number into a byte array which can be rendered
     * as a string.
     *
     * @param numVal Numeric value to write
     * @param chrUni Sign to append to value
     * @returns Byte array of formatted numeric data
     */
    private static byte[] f(int numVal, int chrUni)
    {
    	// limit on the number of digits of numVal that will be displayed
    	final int MAX_DIGITS = 7;
        // Define result
        byte[] resVal=new byte[MAX_DIGITS+1];
        // Write unit sign
        resVal[MAX_DIGITS]=(byte)chrUni;
        // Write values
        for(int digPos=MAX_DIGITS-1;digPos>=0;digPos--)
        {
            resVal[digPos] = (byte)(numVal>0||digPos==MAX_DIGITS-1?'0'+numVal%10:' ');
            numVal/=10;
        }
    	// special case to handle numVal==0 moved into loop.
        
        return resVal;
    }

:edit:

Further optimised by combining the 2 ternary expressions inside the loop into a single or’ed one.

erm, are you sure?

if numVal is passed as zero, it’ll set resVal[6] to ‘0’, and then immediately set it to ’ '.
Breaking what you intended it to be doing.

Correct, and exiting early reintroduces the if. Stupid me. Thats what you get for debugging a geometry pipeline while playing solitaire with your wife and posting in boards simultaneously :slight_smile:

Anyhow, your optimizations have put an end to that discussion, very very nice!

Thanks

Wolfgang

PS: “Whoever finds embedded literals may keep them” ;D

I’m looking forward to seeing this game! ;D

Faster! Faster!

Doh! There’s an else clause.

Of course, if we can change the requirements we could have it pad with 0s rather than spaces for simpler, shorter code with a retro feel…

Yeah, i wrote that at 2am… my brain was not really firing well… it makes sense that it should compress worse!

Anyhow, your solution raises an issue I’ve been thinking about a lot: API calls versus do-it-yourself. As I see it, there are three basic cases. Two are no-brainers:

Never ever use the API, as in max/min or event handling, where the API calls compress much worse than your own constructs.
Always use the API, as in drawing ovals or filling polys, where your own constructs get much too long.

But the third is tricky: To use System.arrayCopy() or Arrays.fill(), or not? My guess is that for using such methods once, your have to pay the full overhead, but by using them frequently, the overhead should decrease … where is the breakeven, when your own constructs start to get more expensive than the call overhead? I’m aware that this is almost a philosophical question and depends on the type of method we are talking about.

I’m drawing my circles myself. Admittedly this is in large part because I’m doing complicated things with them (e.g. drawing an anti-aliased circle to the alpha channel).