hypothetical memory usage question

Noobtastic · July 10, 2009, 5:22pm

Suppose I had something like 1 million instances of an object like this:

class foo {

byte a;
byte b;

}

And suppose that the first 500k have values assigned to ‘a’ and ‘b’, while the second 500k have ‘a’ and ‘b’ specifically set to ‘null’.

Do the first 500k and the second 500k use the same amount of memory?

Abuse · July 10, 2009, 5:41pm

bytes can’t store the value null.

On a more general note, an instance of an object uses the same amount of memory regardless of the values assigned to its member variables.
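To illustrate the point about null: a minimal sketch, assuming a hypothetical BoxedFoo class that uses the Byte wrapper type (not part of the original question). A primitive byte field always holds a number and defaults to 0; only a reference such as Byte can be null.

```java
// Hypothetical demo; class names are illustrative, not from the thread.
class NullByteDemo {
    static class Foo {
        byte a;   // primitive: defaults to 0 and can never be null
        byte b;
    }

    static class BoxedFoo {
        Byte a;   // wrapper reference: defaults to null
        Byte b;
    }

    public static void main(String[] args) {
        Foo foo = new Foo();
        BoxedFoo boxed = new BoxedFoo();
        System.out.println(foo.a);    // prints 0
        System.out.println(boxed.a);  // prints null
        // foo.a = null;              // would not compile: byte is a primitive
    }
}
```

Either way, the size of each Foo instance does not depend on which values the fields hold.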

Noobtastic · July 10, 2009, 6:15pm

So then, one must conclude that both the first 500k and second 500k in this scenario will require the same amount of memory.

DzzD · July 10, 2009, 6:26pm

[quote]Do the first 500k and the second 500k use the same amount of memory?
[/quote]
yes

I guess that 1 000 000 “foo” objects will use 1 000 000 * (4+1+1) bytes of memory, so 6 000 000 bytes. The objects themselves will only use two bytes each, but you will have at least one reference to each (in any kind of reference structure, such as an array or list, but at least one).

Not 100% sure, but I think a reference uses 4 bytes on 32-bit platforms and 8 bytes on 64-bit.

Edit: also, you can’t assign null to a and b; they are primitive bytes, not objects.
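The back-of-the-envelope estimate above can be written out as a small calculation. This is only a sketch of that arithmetic: the method name and parameters are made up here, and it deliberately ignores the per-object header and alignment padding that real JVMs add, so actual usage will be higher.

```java
// Rough estimate only: counts one reference plus the field bytes per object.
// Ignores the JVM's per-object header and padding (an assumption of this sketch).
class MemoryEstimate {
    static long estimateBytes(long count, int refBytes, int fieldBytes) {
        return count * (refBytes + fieldBytes);
    }

    public static void main(String[] args) {
        long count = 1000000L;
        // 4-byte references (32-bit), two 1-byte fields
        System.out.println(estimateBytes(count, 4, 2));  // prints 6000000
        // 8-byte references (64-bit), two 1-byte fields
        System.out.println(estimateBytes(count, 8, 2));  // prints 10000000
    }
}
```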

Noobtastic · July 10, 2009, 6:33pm

Your comment brings to mind another question, though it is no longer on the topic of the original post…

Vector vs. ArrayList: which is better in various situations? I have at times used them interchangeably.

For example, performance, memory use, code style, etc…

DzzD · July 10, 2009, 6:38pm

They are quite equivalent: http://java.sun.com/j2se/1.4.2/docs/api/java/util/ArrayList.html . If you know the size of your structure (and if it doesn’t change often), you will get the best performance and memory usage with a plain array: Foo[] foos;

ewjordan · July 10, 2009, 7:18pm

…and even if you don’t know the size in advance, if you’re using the things to store/retrieve primitives like bytes, ints, floats, etc., for such large collections you’ll really want to write your own dynamic arrays backed by primitive arrays, since you’re probably looping over these things quite a bit.

If you search around the forums a bit, people came up with a very fast and full featured dynamic float array class a long time ago, in case you’re not up to writing one yourself.
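For readers who want to roll their own, here is a minimal sketch of a dynamic array backed by a primitive array. It is not the forum class mentioned above; ByteList and its methods are hypothetical names, and it stores bytes rather than floats to match the original question.

```java
import java.util.Arrays;

// Hypothetical growable array of primitive bytes (no boxing, no per-element objects).
class ByteList {
    private byte[] data = new byte[16];
    private int size = 0;

    // Appends a value, doubling the backing array when it is full.
    void add(byte value) {
        if (size == data.length) {
            data = Arrays.copyOf(data, data.length * 2);
        }
        data[size++] = value;
    }

    byte get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return data[index];
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        ByteList list = new ByteList();
        for (int i = 0; i < 1000; i++) {
            list.add((byte) i);
        }
        System.out.println(list.size());  // prints 1000
    }
}
```

Doubling the capacity keeps appends cheap on average, at the cost of some unused slack at the end of the backing array.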

Ah, and re: Vector vs. ArrayList, Vectors are thread-safe, ArrayLists are not. That means there’s a bit of extra overhead when you use Vectors. I’d use ArrayLists by default, unless you might be accessing the thing from multiple threads at the same time.
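A small sketch of what the thread-safety difference means in practice, with a made-up helper name. Two threads append to the same list; with a Vector every add is synchronized, so no elements are lost. Passing an ArrayList instead may lose elements or throw, which is why it should not be shared between threads without external locking.

```java
import java.util.List;
import java.util.Vector;

// Hypothetical demo: two threads append to one shared list.
class SharedListDemo {
    static int addFromTwoThreads(final List<Integer> list, final int perThread) {
        Runnable task = () -> {
            for (int i = 0; i < perThread; i++) {
                list.add(i);
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return list.size();
    }

    public static void main(String[] args) {
        // Vector.add is synchronized, so every element arrives.
        System.out.println(addFromTwoThreads(new Vector<Integer>(), 100000));  // prints 200000
    }
}
```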

Orangy_Tang · July 10, 2009, 8:04pm

With an array of a million objects, the 4-byte per object overhead will kill you before you’ve even stored a single byte. Store two parallel byte[] arrays instead and you’ll reduce the overhead from 4 million bytes to 8 bytes.
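One way the parallel-array idea could be wrapped up, as a sketch only. FooTable and its accessors are hypothetical names; the point is that element i of the "a" array and element i of the "b" array together play the role of one foo, with no per-element object or reference.

```java
// Hypothetical wrapper around two parallel primitive arrays.
class FooTable {
    private final byte[] a;
    private final byte[] b;

    FooTable(int count) {
        a = new byte[count];  // all elements start at 0
        b = new byte[count];
    }

    void set(int index, byte aValue, byte bValue) {
        a[index] = aValue;
        b[index] = bValue;
    }

    byte getA(int index) {
        return a[index];
    }

    byte getB(int index) {
        return b[index];
    }

    int size() {
        return a.length;
    }

    public static void main(String[] args) {
        FooTable table = new FooTable(1000000);
        table.set(42, (byte) 7, (byte) -3);
        System.out.println(table.getA(42) + " " + table.getB(42));  // prints 7 -3
    }
}
```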

Noobtastic · July 10, 2009, 8:48pm

Holy heck you’re right! :o

Well, that will change some things.

Like this, right?

byte[] aFoo = new byte[1000000];
byte[] bFoo = new byte[1000000];

Much better, memory-wise, than:
ArrayList<foo> foos; // stuffed with 1 million foo instances

Mr_Light · July 15, 2009, 2:27am

And if you stick them in a Set and override equals() (and hashCode() along with it), you’ll save the most: 256 * 256 = 65 536 distinct values at most.
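A sketch of the Set idea, assuming duplicates and ordering do not matter for the use case (the thread does not say whether they do). The Foo class here is a hypothetical immutable variant of the original foo; with two byte fields there can never be more than 65 536 distinct instances in the set.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo: value-equal Foo instances collapse to one entry in a HashSet.
class DistinctFoos {
    static final class Foo {
        final byte a;
        final byte b;

        Foo(byte a, byte b) {
            this.a = a;
            this.b = b;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Foo)) {
                return false;
            }
            Foo other = (Foo) o;
            return a == other.a && b == other.b;
        }

        @Override
        public int hashCode() {
            // Packs both bytes into 16 bits, so each (a, b) pair gets a unique hash.
            return ((a & 0xFF) << 8) | (b & 0xFF);
        }
    }

    // Adds 'total' instances that cycle through the (a, b) combinations.
    static int countDistinct(int total) {
        Set<Foo> set = new HashSet<Foo>();
        for (int i = 0; i < total; i++) {
            set.add(new Foo((byte) (i >> 8), (byte) i));
        }
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(countDistinct(1000000));  // prints 65536
    }
}
```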

Meh, this all reeks of premature optimisation. If your collection doesn’t change much, just use copy-on-write variants.
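For reference, one copy-on-write variant in the standard library is java.util.concurrent.CopyOnWriteArrayList. A minimal sketch of its behaviour, with a made-up method name: every write copies the backing array, so iteration works on a stable snapshot and never throws ConcurrentModificationException. That makes reads cheap and writes expensive, which only pays off when the collection rarely changes.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical demo of snapshot iteration on a copy-on-write list.
class CopyOnWriteDemo {
    static int iterateWhileAdding() {
        List<String> list = new CopyOnWriteArrayList<String>();
        list.add("a");
        list.add("b");
        int seen = 0;
        for (String s : list) {  // iterates over the snapshot taken here
            list.add(s + "!");   // allowed: each write copies the backing array
            seen++;
        }
        return seen;             // only the two original elements were visited
    }

    public static void main(String[] args) {
        System.out.println(iterateWhileAdding());  // prints 2
    }
}
```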

ArrayList is not synchronised; Vector is. These days there is lock coarsening, so that might not matter much.

princec · July 15, 2009, 10:30am

Only in yer very latest VMs. Stick with ArrayList

Cas

Noobtastic · July 15, 2009, 10:13pm

I went with two arrays instead of the Vector / ArrayList and saw a massive improvement. Thanks very much, all!

(Btw, it was 3.6 million, not 1 million.)