Binary Data Files

In my application I have begun to use data files as resources to read in the game-specific data that the app needs. My question is whether my Java code is converting the text files into the correct format. I have ten different .txt files which I convert into binary format, but each of the ten converted data files comes out slightly bigger than the equivalent .txt file. I realise that reading data in this format is quicker for the J2ME app, and that it is better to keep all game data in data files like these for localisation, heap memory and so on, but I had also assumed that a converted data file would be smaller than the .txt file it was converted from.

Is this assumption correct, and am I converting the .txt files in the correct manner, i.e. as follows (the text files are required to be in UTF-8 format):

//Main idea of how the .txt files are being converted

ByteArrayOutputStream bout = new ByteArrayOutputStream();
DataOutputStream dout = new DataOutputStream(bout);

dout.writeUTF(buffer);                   // string form of the value
dout.writeInt(Integer.parseInt(buffer)); // the same value again as a 4-byte int

dout.flush();
bout.reset();                            // note: reset() discards everything buffered so far
dout.close();

Thanks for any valuable advice

This makes sense to me.

  1. Writing a UTF string may take a little more space because string length information is written along with it, which may not be the case in your text file. Also, Unicode may be larger than ASCII. I don’t know the exact format of writeUTF, but my bet is that there is a little bit of overhead.

  2. Writing an int may also take more space: representing the number 9 in ASCII takes only 1 byte, whereas an integer is 4 bytes. The same goes for 10, which is 2 bytes in ASCII and 4 bytes as an integer. Larger numbers will be larger in ASCII, but I find that most numbers in my programs have 1-4 digits, making the average smaller than 4 bytes.
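The two points above can be checked by measuring the sizes directly. This is only a rough sketch for desktop Java; the class name SizeComparison and its helper methods are made up for illustration and are not part of the original converter.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class SizeComparison {
    // Number of bytes DataOutputStream.writeUTF produces for the string.
    static int utfSize(String s) throws IOException {
        ByteArrayOutputStream bout = new ByteArrayOutputStream();
        DataOutputStream dout = new DataOutputStream(bout);
        dout.writeUTF(s);
        dout.flush();
        return bout.size();
    }

    // Number of bytes DataOutputStream.writeInt produces (always 4).
    static int intSize(int value) throws IOException {
        ByteArrayOutputStream bout = new ByteArrayOutputStream();
        DataOutputStream dout = new DataOutputStream(bout);
        dout.writeInt(value);
        dout.flush();
        return bout.size();
    }

    // Number of bytes the same digits occupy in a UTF-8 text file
    // (ignoring any line break or separator).
    static int textSize(String s) throws IOException {
        return s.getBytes("UTF-8").length;
    }

    public static void main(String[] args) throws IOException {
        String[] samples = {"9", "10", "1234", "1234567"};
        for (int i = 0; i < samples.length; i++) {
            String s = samples[i];
            System.out.println(s + ": text=" + textSize(s)
                    + " writeUTF=" + utfSize(s)
                    + " writeInt=" + intSize(Integer.parseInt(s)));
        }
    }
}
```

For short numbers such as "9", the text form is 1 byte, the writeUTF form is 3 bytes and the writeInt form is 4 bytes, so writing both forms costs 7 bytes where the text file needed 1 (plus whatever separator it used).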

What you’re writing will be in close to the same format as a text file. The only difference in the text is that writeUTF() stores a two-byte length in front of the string bytes; there is no null terminator. That adds only 2 bytes per string. Your “writeInt()” call adds another 4 bytes.
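To see exactly what ends up in the file, the bytes can be dumped in hex. The following is a small sketch, not the original converter: LayoutDump, encode and toHex are hypothetical names, and the bytes are copied out with toByteArray() before anything resets the buffer.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class LayoutDump {
    // Writes the text with writeUTF and then the parsed value with writeInt,
    // mirroring the two calls in the question.
    static byte[] encode(String text) throws IOException {
        ByteArrayOutputStream bout = new ByteArrayOutputStream();
        DataOutputStream dout = new DataOutputStream(bout);
        dout.writeUTF(text);
        dout.writeInt(Integer.parseInt(text));
        dout.flush();
        return bout.toByteArray(); // copy the bytes out before any reset()
    }

    // Formats the bytes as space-separated two-digit hex values.
    static String toHex(byte[] data) {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < data.length; i++) {
            int v = data[i] & 0xFF;
            if (v < 16) {
                sb.append('0');
            }
            sb.append(Integer.toHexString(v));
            if (i < data.length - 1) {
                sb.append(' ');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // prints: 00 02 31 30 00 00 00 0a
        System.out.println(toHex(encode("10")));
    }
}
```

Reading the output for "10": 00 02 is the length written by writeUTF, 31 30 are the characters '1' and '0', and 00 00 00 0a is the 4-byte int. That is 8 bytes for a value that took 2 bytes as text.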

Speaking of which, why are you storing both the text and the integer versions? Wouldn’t it be more efficient to store just one, then derive the other at runtime?
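As a rough sketch of that idea, assuming the values really are all numeric, the file could hold only the ints and the text could be rebuilt when it is read. StoreOnce, encode and decodeAsText are hypothetical names, and the leading count is an assumption about the file layout, not something from the original code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class StoreOnce {
    // Stores only the int form: a count followed by one 4-byte int per value.
    static byte[] encode(int[] values) throws IOException {
        ByteArrayOutputStream bout = new ByteArrayOutputStream();
        DataOutputStream dout = new DataOutputStream(bout);
        dout.writeInt(values.length); // assumed layout: number of entries first
        for (int i = 0; i < values.length; i++) {
            dout.writeInt(values[i]);
        }
        dout.flush();
        return bout.toByteArray();
    }

    // Reads the ints back and derives the text form at runtime.
    static String[] decodeAsText(byte[] data) throws IOException {
        DataInputStream din = new DataInputStream(new ByteArrayInputStream(data));
        int count = din.readInt();
        String[] result = new String[count];
        for (int i = 0; i < count; i++) {
            result[i] = Integer.toString(din.readInt());
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = encode(new int[] {9, 10, 1234});
        String[] text = decodeAsText(data);
        System.out.println(data.length + " bytes, last value as text: " + text[2]);
    }
}
```

The reverse also works: store only the writeUTF string and call Integer.parseInt on it after reading, which is smaller whenever the values have one digit.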