Best Practices for Creating New Objects

[quote]In C this matters, in Java it does not.
[/quote]
I’m curious – what do you mean by that?

I’m thinking it matters in both languages - either way it constrains the scope of i to the loop.

In C, every declaration of a variable causes memory to be allocated on the stack.

So in the following case, there will be 2 stack-allocations (pointer bumps) for the two ‘i’ variables:


for(int i=0; i<n; i++) {
}
for(int i=0; i<n; i++) {
}

whereas in Java the local-variable slot is reused: the compiler can prove that, in every possible case, ‘i’ does not ‘escape’ its scope, since Java has no pass-by-reference.
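For reference, the Java version of the two-loop pattern looks like this (a minimal sketch; the class and method names are made up). Since each ‘i’ is scoped to its own loop, javac is free to assign both the same local-variable slot in the method’s frame, which you can inspect with `javap -c -l`:

```java
// Both 'i' variables are scoped to their own loop, so javac can reuse
// one local-variable slot for both of them (visible via javap -c -l).
public class TwoLoops {
    static int test(int n) {
        int ret = 0;
        for (int i = 0; i < n; i++) {
            ret++;
        }
        for (int i = 0; i < n; i++) {
            ret++;
        }
        return ret;
    }

    public static void main(String[] args) {
        System.out.println(test(5)); // prints 10
    }
}
```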

In C you typically see this pattern:


int i;

for(i=0; i<n; i++) {
}
for(i=0; i<n; i++) {
}

I’m not sure what would make you think that.

For the code snippet in question, a C compiler is free to generate code as you describe, but it is also free to bump the stack pointer only enough for one i, since only one is live at a time.

In fact, a C compiler is free to make no room on the stack at all (keeping “both” i’s in a register the whole time), since at no point is i’s address taken.

True. If we look at what GCC 4.9.2 outputs even at -O0, we see it does not use the stack at all, but two different registers.
At -O1 it performs so many optimizations that the assembly it produces is insanely clever - it’s not even funny anymore, really. :slight_smile:
For an overly contrived example like this…


static int test(int n) {
  int ret = 0;
  for(int i = 0; i < n; i++) {
    ret++;
  }
  for(int i = 0; i < n; i++) {
    ret++;
  }
  return ret;
}
int main(int argc, char** argv) {
  return test(argc);
}

…I would never have dreamt of the clever tricks the compiler pulls at -O1: the generated code actually resembles this C code:


int main(int argc, char** argv) {
  return argc >= 0 ? argc + argc : 0;
}

and that even without an explicit ADD, using the processor’s LEA instruction to double argc!
This is amazing!

I still get the two loops in -O1 on https://gcc.godbolt.org/, but -O2 cleans it all up.

Since I like thinking about these things, here’s a possible transformation path (how ‘clever’ is it?):

Loop fusion:

static int test(int n) {
  int ret = 0;
-  for(int i = 0; i < n; i++) {
-    ret++;
-  }
  for(int i = 0; i < n; i++) {
    ret++;
+   ret++;
  }
  return ret;
}

This will get picked up by SCCP, if not by other passes:

static int test(int n) {
  int ret = 0;
  for(int i = 0; i < n; i++) {
-   ret++;
-   ret++;
+   ret += 2;
  }
  return ret;
}

Induction variable substitution:

static int test(int n) {
  int ret = 0;
  for(int i = 0; i < n; i++) {
-   ret += 2;
+   ret = 2 * (i + 1);
  }
  return ret;
}

Dead store elimination:

static int test(int n) {
-  int ret = 0;
-  for(int i = 0; i < n; i++) {
-   ret = 2 * (i + 1);
-  }
-  return ret;
+  return n >= 0 ? 2 * n : 0;
}

Strength reduction:

static int test(int n) {
-  return n >= 0 ? 2 * n : 0;
+  return n >= 0 ? n + n : 0;
}

Inlined into main (this would probably be the first thing that actually happens, but w/e):


int main(int argc, char** argv) {
-  return test(argc);
+  return argc >= 0 ? argc + argc : 0;
}

And then instruction selection comes up with this:

main:
    lea     eax, [rdi+rdi]
    test    edi, edi
    mov     edx, 0
    cmovle  eax, edx
    ret

LEA is chosen possibly because, unlike ADD, the destination register can be anything, not just the left operand.

Oh, yeah and don’t do object pools for small objects, unless testing shows real improvement, etc etc… :persecutioncomplex:

Yes, you are right. It was -O2 with me, too.

Hehe. Sneaky, yes. :slight_smile:

And yes, GCC has to put in the guard that n actually was non-negative when the original loop was entered; otherwise we would get a negative optimized result.

Derp, you’re very right. Edited.

Although this is very true, I think there’s some value in training yourself to use more optimal patterns from the get-go.

I don’t disagree with you, but as this thread shows, it’s not always obvious what “more optimal” means. Trying to outsmart the JIT compiler is often a bad idea.

My main point is that it’s a waste of time for novices to worry too much about “efficiency” before they really understand how the basics work. Writing better code comes with practice.

“Optimal” requires that you know what you’re optimising for of course. If you’re optimising for rapid coding time and brevity… you’d not be in the least concerned by pooling for example.

Cas :slight_smile:

Yep, exactly. Shaving a second or two off of an hour-long process isn’t really worth it, especially if it took you a week to develop and will take the next person to inherit the project a month to understand your code.

That’s an exaggeration, but I’ve seen enough over-engineered code to be pretty wary of anybody who wants to talk about “optimization” or “efficiency” before even identifying a problem or specific goal (“make it as efficient as possible” is not a specific goal).

You know what I think about overengineering: usually, when we are not working for a company or some other client, we are our own “customers.” This is true of any hobby project we do for ourselves, even a community project. We finally have the say on things. The requirements we set ourselves are mostly driven by curiosity (“can this be faster? and if so, how?”), and any effort we put in is not bound by critical resources such as money or time. Of course, we do not have an unlimited amount of time for ourselves to do stuff, but if we are our own customers, we can always reprioritize. In the real world, we cannot change the priorities of customers who want something that works, or of co-workers to whom we need to communicate aspects of some code/framework/whatever.
When we are our own customers, we are free to decide on any of those things, and I believe that is what is really hard to stay disciplined about when developing software in a professional manner. And by “professional” I do not mean that the software or one’s skills are any better than before (of course you never stop learning new things); in essence I just mean that “you are not your own customer anymore.” :slight_smile:

Object pooling has been inefficient most of the time since Java 1.4.

Would you like to elaborate more on this statement?

Since the collectors implemented in Java 1.4 and later scan the heap of live objects, the more live objects you have, the longer it takes. Worse, there are different, slower algorithms for stuff that’s been hanging around a long time (i.e. stuff in a pool). And even worse, there’s some extra complexity when old objects point back to new objects. All in all, pooling is generally slower than not pooling in the long term, depending on how expensive your objects are to construct in the first place.
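To make the pattern under discussion concrete, here’s a minimal sketch of a trivial free-list pool (all names are made up for illustration). Every pooled object stays strongly reachable, so it eventually ends up in the old generation - exactly the region the slower collection machinery described above has to keep scanning:

```java
import java.util.ArrayDeque;

// A tiny, hypothetical value object worth (maybe) pooling.
class Vec2 {
    float x, y;
}

// Trivial free-list pool: released objects are kept alive indefinitely,
// so they tenure into the old generation and must be scanned every cycle.
class Vec2Pool {
    private final ArrayDeque<Vec2> free = new ArrayDeque<>();

    Vec2 obtain() {
        Vec2 v = free.poll();
        return v != null ? v : new Vec2(); // reuse if possible, else allocate
    }

    void release(Vec2 v) {
        free.push(v); // keeps the object strongly reachable "forever"
    }
}
```

With modern generational collectors, the plain `new Vec2()` path is often cheaper than this bookkeeping, unless construction itself is expensive.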

Cas :slight_smile:

Well, no optimization is “minor” enough to ignore when you are on mobile, for instance. I remember when I refactored all my code to remove all the instantiations like the one in the first post: it made no difference on PC and on most of the Android phones I tested. But on my own phone, which had RAM issues, avoiding the constant GC activity increased performance dramatically.

It depends on which virtual machine you use and I still think that some optimizations have no measurable effect even on a mobile phone.

It just proves the weakness of the DVM (Dalvik Virtual Machine) and/or ART (Android RunTime). The DVM is inspired by Apache Harmony; it doesn’t behave like OpenJDK. I don’t encourage developers to write poor code, I encourage them to make an effort at the appropriate moment, when it is required. I suspect some Android games throw the whole geometry of a level at the GPU; that hurts performance, and it’s not necessarily a problem of memory footprint.

Yeah, the blame is absolutely on the side of the VM, but in the end you have to make it work. And I agree: if I had observed no lag, I’d probably have been happy without those optimizations. But who knows - some potential users with an even worse phone might then be unable to play. If you want a broad audience, you have to consider that.

The problem with mobiles, in particular, is that behavior is not uniform. Some phones, in particular those with an SGX video chip, use RAM for video and are always short on memory. On those, you should try to avoid instantiations at all costs. For instance, my game uses 24 MB of RAM on a modern device… and 120 MB on an older device! I believe the cause is the lack of a graphics chip that can work with compressed textures. Others have heating/battery issues if you do too much processing, so I’ve found myself making my code uglier and more complex in order to cache calculations. Changes and optimizations that would be laughable under normal conditions, yeah, but it is a tough platform.
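The “cache calculations” trade-off mentioned above can be sketched like this (a hypothetical example, all names invented): instead of recomputing a value every frame, store it and recompute only when its input changes. It’s uglier than a one-liner, but trades CPU (and battery) for a little memory:

```java
// Hypothetical cache of a derived value: recompute only when the
// input changes, instead of on every call/frame.
class BoundsCache {
    private float radius;        // input
    private float cachedArea;    // cached derived value
    private boolean dirty = true;

    void setRadius(float radius) {
        this.radius = radius;
        dirty = true;            // invalidate cache on change
    }

    float area() {
        if (dirty) {
            cachedArea = (float) (Math.PI * radius * radius);
            dirty = false;
        }
        return cachedArea;       // cheap on every later call
    }
}
```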

DavidBVal, I meant that object pooling can be useful on a regular basis on mobile phones not running OpenJDK; it isn’t relevant for Java game programming on desktop computers in general. I don’t advise Java developers to use object pooling everywhere: it can be efficient under Android and with J2ME, but not with Oracle Java SE Embedded.