Enum memory usage in Java

A quick memory-related question about enums in Java!

Supposing I had a 2-dimensional array of an enum type:


TileFlag[][] flags;

How much memory might it take to store an array of 10 x 10 (100) elements?

What about 1000? A million? More? Does it scale linearly?

So would I get away with doing something like this:


TileFlag[][] flags = new TileFlag[5000][5000];

Or would that be horrendously inefficient?

Thanks all!

Why don’t you try and see what happens?

Experience is the best teacher :slight_smile:

Enum references are 4 bytes each, like all other object references (i.e. the same size as an int). Sometimes 8 on 64-bit architectures.

Cas :slight_smile:
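To put rough numbers on that, here is a back-of-the-envelope sketch for the 5000 x 5000 case, assuming 4-byte references (typical on a 64-bit JVM with compressed oops) — the exact figures depend on the JVM and don't include per-row array headers:

```java
public class EnumArrayMemory {
    public static void main(String[] args) {
        long elements = 5000L * 5000L;            // 25 million cells
        long refBytes = 4;                         // assumed: compressed oops; 8 without
        long shortBytes = 2;

        // element storage only, ignoring array object headers
        System.out.println("enum refs: ~" + elements * refBytes / (1024 * 1024) + " MB");
        System.out.println("shorts:    ~" + elements * shortBytes / (1024 * 1024) + " MB");
    }
}
```

So the reference array costs roughly 95 MB for element storage alone (the enum constants themselves are shared singletons, so they add almost nothing), versus roughly 47 MB for shorts — and yes, it scales linearly with the element count.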

You may as well just store a short to represent the ID, an int if you need more values. Even for flags I would still use one of these 2.

So, essentially, the recommendation here is to use primitives instead of objects to store the flags?

I’m guessing that I could map one to the other fairly easily in any case: whenever I need to check which enum ‘type’ lives at a specific index of the array, I can just grab the number and look up the corresponding ordinal in the enum:


short[][] flags = new short[100][100];

// setting values
flags[0][0] = (short) TileFlag.SOLID.ordinal();

// comparing values
if (flags[0][99] == (short) TileFlag.SOLID.ordinal())
{
    // ...
}

// recovering the enum from a stored ordinal
TileFlag specificType = TileFlag.values()[flags[0][10]];


That should keep the memory footprint of the (large) 2D array to a minimum, while still giving me the flexibility of the enums to make more ‘sense’ of the actual values being used.

Genius answer, thank you very much!

To minimize memory usage and improve cache coherency, you could use a single short[width*height] instead of an array of arrays (short[height][width]).
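For what it's worth, a minimal sketch of that flattened layout might look like this — the `WIDTH`/`HEIGHT` names and the `get`/`set` helpers are just illustrative, not a prescribed API:

```java
public class FlatGrid {
    static final int WIDTH = 5000, HEIGHT = 5000;

    // one contiguous block instead of 5000 separate row arrays
    static final short[] flags = new short[WIDTH * HEIGHT];

    // (x, y) maps to a single index: row offset plus column
    static short get(int x, int y)          { return flags[y * WIDTH + x]; }
    static void  set(int x, int y, short v) { flags[y * WIDTH + x] = v; }

    public static void main(String[] args) {
        set(3, 2, (short) 7);
        System.out.println(get(3, 2)); // prints 7
    }
}
```

The row-major `y * WIDTH + x` convention keeps horizontally adjacent tiles adjacent in memory, which is what the cache-coherency argument above is about.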

Thanks for the input!

Assuming a 2D-array of short values with dimensions of 5000 apiece:


short[][] flags = new short[5000][5000];

Would changing this to a single dimension array offer significant performance benefits? Or would the change be effectively imperceptible?


short[] flags = new short[5000 * 5000];

Many thanks!

Like others have said: why don’t you just try this out and measure yourself?

Worrying about efficiency before you’ve even tried anything is itself inefficient. Premature optimization is the root of all evil, and all that.

No, just do what is easier. The performance gain is not worth breaking your neck over and depends greatly on implementation and platform.

Riven did some interesting experiments lately on the difference between array[][] and array[], and I seem to recall his unexpected conclusion was that array[][] was usually considerably faster. Can anyone find that benchmark? Anyway, I suspect that the cache implications were completely outweighed by some improved ability of HotSpot to hoist bounds checks when there’s no multiplication involved in calculating the index.

Cas :slight_smile:

I don’t know where that benchmark is, but I have found the same to be true in certain situations. It all depends on the bounds checks, although if the 2nd dimension is quite small you could see a larger improvement from flattening anyway.
I do know that simple constant-stride indices are handled by HotSpot’s range-check elimination; see: https://wikis.oracle.com/display/HotSpotInternals/RangeCheckElimination
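If you do want to poke at this yourself, a naive timing sketch is below — with the big caveat (as this thread keeps noting) that a single un-warmed-up run like this is not a trustworthy benchmark; a harness such as JMH is the proper tool. The `N` size and the summing workload are arbitrary choices:

```java
public class IndexingSketch {
    static final int N = 2000;

    public static void main(String[] args) {
        short[][] nested = new short[N][N];
        short[] flat = new short[N * N];

        long t0 = System.nanoTime();
        long sumNested = 0;
        for (int y = 0; y < N; y++)
            for (int x = 0; x < N; x++)
                sumNested += nested[y][x];        // no multiply in the index

        long t1 = System.nanoTime();
        long sumFlat = 0;
        for (int y = 0; y < N; y++)
            for (int x = 0; x < N; x++)
                sumFlat += flat[y * N + x];       // multiply-based index

        long t2 = System.nanoTime();
        System.out.println("nested: " + (t1 - t0) / 1_000_000 + " ms (sum " + sumNested + ")");
        System.out.println("flat:   " + (t2 - t1) / 1_000_000 + " ms (sum " + sumFlat + ")");
    }
}
```

Treat any numbers this prints as a rough signal only; JIT warmup, dead-code elimination, and run-to-run noise can easily swamp the difference being measured.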

Note that EnumType.values() creates a new array each time it is called, but I see you are indexing values as an array, so hopefully you are already caching the result.

Thanks everyone for the input.

In answer to the ‘try it yourself’ suggestions: whilst I appreciate the sentiment (there are plenty of people on these forums looking for quick answers with minimal research), I already have a large program that would take some re-engineering to apply the proposed principles. Knocking up a simple program to measure the efficiency gains of restructuring these arrays wouldn’t be accurate either, since it wouldn’t account for all of the other activity happening in the real program at the same time (i.e. it’d be difficult to measure performance with everything else going on).

I understand where you’re coming from, though, but on this occasion some peer-Q&A would be more beneficial to prevent me making time-costly mistakes in this program’s development than simply experimenting.

In relation to another comment above about using EnumType.values() to look up the required ordinal at runtime: I had no idea this returned a whole new array each call. I assumed the values were generated into an array-style accessor at compile time. I’m assuming the way around this, as you suggested, would be to set up something like this (a one-time deal):


TileFlag[] flagValues = TileFlag.values();
TileFlag[] flagDefs = new TileFlag[flagValues.length];

for (int i = 0; i < flagValues.length; i++)
{
    flagDefs[i] = flagValues[i];
}

So flagDefs is now my main ‘cached’ enum reference array, rather than calling TileFlag.values()[…] each time (if I make flagDefs static as well)?

Apologies if the code is sloppy, been a long day today!

Thanks!

Yes, that is what I meant. values() has to make a defensive copy each time, because there is no other way to stop callers from modifying the contents of the returned array.
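A common idiom is to keep that one-time cache inside the enum itself, next to a lookup helper. A sketch along those lines — the constants other than SOLID, and the fromOrdinal name, are just hypothetical:

```java
public enum TileFlag {
    EMPTY, SOLID, WATER;   // hypothetical constants for illustration

    // values() copies defensively on every call, so take one copy here, once
    private static final TileFlag[] VALUES = values();

    // maps a stored short/ordinal back to its enum constant
    public static TileFlag fromOrdinal(int ordinal) {
        return VALUES[ordinal];
    }
}
```

Then the lookup in the earlier snippet becomes `TileFlag.fromOrdinal(flags[0][10])` with no per-call allocation. Note the usual caveat that ordinals change if you reorder the constants, so this mapping is only safe for in-memory use, not for saved data.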

Any answers we post would also not take into account all of the other program activity that’s taking place in your environment…

If you’re just looking for which takes more memory: A or B, then writing a little test program and running it in your environment is the only way to get reliable numbers.
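As a sketch of what such a test program could look like, the rough probe below compares heap usage before and after the allocation. Runtime figures are approximate (GC timing makes them noisy); a tool like OpenJDK's JOL gives exact object layouts if you need precision:

```java
public class MemoryProbe {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();

        System.gc();  // best-effort; the JVM may ignore this
        long before = rt.totalMemory() - rt.freeMemory();

        // swap in whichever layout you want to measure
        short[] flags = new short[5000 * 5000];

        System.gc();
        long after = rt.totalMemory() - rt.freeMemory();

        System.out.println("approx bytes used: " + (after - before));
        flags[0] = 1;  // keep the array live so it isn't collected early
    }
}
```

Run it once per candidate layout (enum references, shorts, flattened) in your own environment and compare the printed figures.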

Riven’s benchmark was broken.