Packing bytes

Hi,

I was wondering how much a difference does packing data to take smaller amount of memory make in Java.
In C++, let’s say I have a struct like this:

struct MyTile
{
   int x,y;
   bool bTaken;
};

This struct will take 4+4+2 = 10 bytes. If I have an 1000x1000 map then it will take 10.000.000 bytes = 9.5 MB.
So now I take that neither x or y can contain a number larger than 1000 due to the map size. The number 1000 can be contained in 10 bits, so for both numbers I only need 20 bits. The boolean bTaken can only be of value 1 and 0, so that’s one bit. All in all I can place the entire struct in 21 bits, and because one int is 32 bits I can place the entire struct in one int. This way I’ll be using only 4x1000x1000 = 4.000.000 bytes = 3.8 MB, so I saved about 5 MB of memory with that compression.

Can this same thing be done in Java? I mean I know there are bit operators, but how much actual memory will I save by doing this. Instead of a struct I would probably have a class. How much memory does a class take?
I’m guessing it’s probably more efficient to create 1 million ints that to create 1 million classes which hold 1 int, am I right?

Yes, you are correct. And booleans take 4 bytes in Java, too.

Cas :slight_smile:

The boolean bTaken can only be of value 1 and 0, so that’s one bit.

It’s whatever the vm wants it to be. 1 byte, 2… 4… whatever aligns nicely.

How much memory does a class take?

Like 40 bytes overhead.

I’m guessing it’s probably more efficient to create 1 million
ints that to create 1 million classes which hold 1 int, am I right?

Given that overhead… yes it’s more efficient to put those values into one place.

class someClass{
int[] struct = new int[1000000 / (32) + 1]
boolean isHere(int positionA, int positionB){
int pos = positionA * 1000 + positionB
int c1 = struct[pos / 32];
return ((1 << (pos % 32)) == 0) ? false : true;
}

Was it this what did you have on mind? It’s even more memory efficient than 3.8 MB.

If you are not interested in that, but rather in some asymetric tiles values, you would need to play with OR and bitshifting.

Object overhead is 12 bytes of course.

Only if you’ve forced the packing of structs to 1 byte alignment. That will also have the effect of causing x & y to be misaligned for every other struct if you make an array of these structs.

[quote]> The boolean bTaken can only be of value 1 and 0, so that’s one bit.

It’s whatever the vm wants it to be. 1 byte, 2… 4… whatever aligns nicely.

again worth noting that this too is 100% VM dependant.

[quote]>

I’m guessing it’s probably more efficient to create 1 million
ints that to create 1 million classes which hold 1 int, am I right?

Given that overhead… yes it’s more efficient to put those values into one place.
[/quote]
Its a safe bet that there is some object over-head on any VM so
this is a safe statement. How significant that over-head is though will vary a lot from system to system.

The point is this can be done and it will save some memory. If Java guarantees that an int will be 4 bytes on every machine then that is all I need, the packing algorithm I’ll decide myself to work the best it can.

How significant that over-head is though will vary a lot from system to system.

True. But given 1 million tiny objects the overhead just will be unacceptable big (ratio wise).

That “like 40 bytes” was a figure, I pulled out of the air. The overhead was about that size iirc. Well, basically all I wanted to say is that the overhead doesn’t matter as long as you don’t have countless tiny objects… like one million objects which only hold an int and a byte. :slight_smile:

Structs! Structs! Structs!
er…

Mapped Objects! Mapped Objects! Mapped Objects!

Cas :slight_smile:

4 Bytes pointers to object
8 Bytes object id.

At least It was the result whe I last time encountered an OOP.

If it works for Steve Balmer…

You could try
http://javolution.org/api/javolution/io/Struct.html

intreting. I assume what it is really doing is shadowing…

The cosntructor uses reflection to find its feild and builds an internal 1:1 mapping of class field to position in byte buffer. His C-type classes then know how to read and write from the shadow.

A reasonably clever solution. Disadvantage ofcours being your sre doign data copying every time you access to write or change values sucha s shifting the struct position in the bufffer. Doubles memory usage and has some CPU overhead.

It’s the memory bandwidth overhead too that’s the problem. The solution is however pretty much what I was thinking some enterprising VM engineer might be able to “intrinsify” somehow…

Cas :slight_smile: