I hope this is the right section, if not feel free to move it!
Does declaring a method as final allow java to treat it as a macro so that it is substituted instead of having the overhead of a method call? If not then how could one create a “macro” or “inlineable” method?
You don’t.
Inlining a method is completely up to the JVM implementation. Declaring something “final” doesn’t mean it will or won’t be inlined. Only use final for design reasons; it most probably won’t make any difference performance-wise, and even if it does, the effect is specific to the one JVM you’re running on.
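To illustrate, here’s a minimal sketch (the class and method names are just examples): whether or not the getter below is declared final, a modern HotSpot JIT will make its own decision about inlining it at hot call sites.

[code]
// Minimal sketch: the JIT inlines small, hot methods on its own.
// Adding or removing 'final' on getX() does not change that decision;
// 'final' only documents that subclasses may not override it.
public final class Point {
    private final int x;

    public Point(int x) {
        this.x = x;
    }

    public int getX() {   // typically inlined by HotSpot once the call site is hot
        return x;
    }
}
[/code]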
In my experience, it’s best to rely on HotSpot’s voodoo magic and not try to second-guess it. HotSpot will inline whatever it thinks will lead to a valuable optimization, and inlining doesn’t always lead to better performance.
There was a time when declaring methods as final did make a (performance) difference. Which is why you still sometimes see advice suggesting this. It might still make a difference in some environments — perhaps J2ME for phones and similar. However if your target is desktop or server environments and you are not running a museum, then as Erik suggested, don’t use final for this purpose.
Declaring a method as “final” has the advantage that the runtime is guaranteed that the same method entry point is always taken, since the method cannot be overridden.
Modern JVMs are able to optimize this pretty well; however, there’s still some overhead associated with it - but by far not enough to be worth mangling your design by declaring methods final that should not be final by design.
I would suggest using a bytecode optimizer such as ProGuard ( http://proguard.sf.net ), which is able to perform closed-world optimizations on your code (making everything possible final, some things static, inlining getters/setters in the bytecode), which reduces code size and makes your code even faster without touching your source.
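For anyone who hasn’t tried it, a hedged sketch of what a minimal ProGuard configuration for a desktop game might look like (the jar names and the my.game.Main entry point are made up; check the ProGuard manual for the options that match your setup):

[code]
# Hypothetical ProGuard config sketch - adjust names and paths to your project
-injars  game.jar
-outjars game-optimized.jar
-libraryjars <java.home>/lib/rt.jar

# Keep the entry point so shrinking/optimization can't remove or rename it
-keep public class my.game.Main {
    public static void main(java.lang.String[]);
}

# Let the optimizer make several passes over the bytecode
-optimizationpasses 3
[/code]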
Personally, I’ve never seen the slightest performance improvement after declaring stuff “final”.
As for ProGuard, I just gave it one more shot with JEmu2 (I’ve tried it before with JEmu2’s predecessors).
At first I got a StackOverflowException. After some twiddling, I got rid of the error by disabling optimizations, and the result was a .jar of significantly reduced size (enough to make it worthwhile). Unfortunately, I then got a security exception: something to do with the bytecode verifier.
After doing more playing with the various options, I finally got something that worked, but it was only <2kb smaller than the original (which weighs in at 987kb) and I had to disable basically everything. So for me it was not at all worth the effort, but of course YMMV…
Maybe I was just not meant to use obfuscators ;D
Sometimes if you inline too much you end up with, say, a loop that’s just a bit too bloated to fit nicely in the L1 CPU cache, which is a big performance hit as the cache will just thrash back and forth to the L2 cache as the loop executes each iteration. And other crazy cases.
Declaring something final has a very specific semantic meaning, which is that it cannot be overridden.
This is unnecessary for inlining. Modern VMs use much more sophisticated techniques to determine inlinability.
Does that answer the question?
Edit: Cas is correct too, if that’s what you were asking and I misunderstood. Too much inlining can screw up cache performance. One of the many advantages a run-time compiler has over an ahead-of-time compiler is that it can adjust the size of code blocks to fit your current cache size.
[quote] One of the many advantages a run-time compiler has over an ahead-of-time compiler is that it can adjust the size of code blocks to fit your current cache size.
[/quote]
Do you know if HotSpot actually does that already, or is this a possible future enhancement?
I’m not so sure Hotspot currently knows about the size of the L1 caches but the inlining amount is tuneable with a trio of commandline parameters. When you’re at the stage of fiddling with inlining tuning you’re probably deep into the realms of platform-specific optimisation though for released products.
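For the curious, the knobs usually meant here are along these lines (flag names, defaults, and availability vary between HotSpot versions, and game.jar is just a placeholder, so treat this as a sketch rather than gospel):

[code]
# Hedged example: typical HotSpot inlining knobs; defaults differ per JVM version
java -XX:MaxInlineSize=35 -XX:FreqInlineSize=325 -XX:InlineSmallCode=2000 -jar game.jar

# Diagnostic output showing what the JIT actually decides to inline
java -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining -jar game.jar
[/code]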
Exactly. You can’t expect your customers to go fiddle with such platform dependent, specialized JVM-specific options. The JVM should be doing that automatically at runtime.
I would strongly advise that you never ever touch that unless you have a hugely important reason to.
Proguard does not generally improve performance at all, it just decreases download sizes a little - and it has many small-but-irritating-and-time-wasting bugs like the one described above (I have to use it for stuff, sadly, because of a “has it been obfuscated?” checkbox I have to be able to tick. Even though obfuscation has no positive effect at all. Don’t ask :().
You will get large numbers of people unable to play your game because of bugs in the obfuscation parts (if you use them; most of the perf opt I’ve seen is special cases of obfuscation), and the stack traces they send you will be useless.
OTOH, if you’re writing a 4kb game, every byte counts :D…
Yes, there are some bugs in it - but hey, have a look at the Java bug database. Java is IMHO very buggy, and you still use it.
Well, testing it is in general a good idea.
The only problem (which is documented) I’ve seen with ProGuard’s optimizations is aggressive overwriting of methods, which does not work with early JRE 1.2 betas - if it works on one JVM it works on any other too.
Wrong - stack traces are still valid; there’s a tool which converts them (you can even force it to include line numbers).
Inlining could cause unnecessarily heavy flushing of the micro-instruction cache. It could also force the CPU to convert the native code into microcode again at runtime. Current CPUs have an instruction cache of approximately 12-16 k instructions, so it could happen quickly.
[quote]stack traces they send you will be useless.
[/quote]
Not strictly true; ProGuard can generate an obfuscation map to convert obfuscated stack traces into meaningful traces.
Proguard can break any code I write, so that I have to re-write my code to make PG happy. No library I can use will do that to me. The risks are greater, and I don’t have time to be debugging two platforms in parallel. Shrug. Maybe you don’t care, but IME most people do, as soon as they start running into artificial bugs that are being introduced by the “optimizer”.
That would make things a lot less bad. Have you got a link?
If some of the optimizations break code, I guess you should file some bug reports. I remember Eric (Proguard author and coworker) telling me that he was very reluctant to implement byte code optimization because he doubted the net effect of this and that it could break code. So my guess is that if you’ve found a bug he’ll be more than happy to fix it.
ReTrace is part of the ProGuard distribution and does exactly that.
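For reference, a hedged sketch of the usual workflow (the file names are placeholders; see the ProGuard/ReTrace manual for the exact options in your version):

[code]
# In the ProGuard config: keep line numbers and write the name mapping to a file
-printmapping mapping.txt
-keepattributes SourceFile,LineNumberTable

# Later, feed an obfuscated stack trace through ReTrace to get readable names back
java -jar retrace.jar mapping.txt obfuscated-trace.txt
[/code]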