need for speed, java vs c++ and the stack

Yes, like this?


//C++ code

struct Mystruct{
  //some members
};

int main(void) {
  Mystruct mystruct;

  //do some stuff, call some functions

  //mystruct has not been affected by stack manipulation

  return 0;
}

[quote]“I don’t know assembly”
[/quote]
You should learn, it’s really quite helpful.
:slight_smile:

[quote]7 Mil triangles? Buee my Riva TNT2 is too slow.
[/quote]
I once squeezed 1.4 million triangles/sec out of a TNT2. I think that was quite near the limit … :slight_smile:

I could agree that techniques like escape analysis would be very useful as a (possibly) compile-time process that flags to the GC the objects that can be collected straight away in lots of small GC passes. That is a completely separate argument from whether allocating things on the stack is a useful, or even valid, optimisation strategy.

It would be nice to see some of these improvements in the VM, but right now it’s of very little concern to us. Our performance is excellent using Java because we pay careful attention to how we work with the code and we run profilers over it fairly constantly. Memory allocation and the subsequent GC are all but eliminated in our realtime apps (particularly in Aviatrix3D, where we have zero allocation unless scene graph structures change dramatically). Right now, most of our performance issues are with libraries that we use, due to some dumb decisions made by those libraries’ implementors. For example, our biggest performance issue right now is the amount of garbage JOGL is generating internally every frame. It’s about 1MB of junk every 5 seconds or so, which is unacceptable.

The point here is that you have far more pressing performance issues to care about in terms of the other libraries that you rely on, that will gain you far greater benefits, than some potentially trivial gains from something like this proposal.

Is there any way to see/monitor garbage collection and stack size? It should be a performance parameter along with FPS.

Use -verbose:gc to watch for the heap size stats and GC runs. I’m not aware of any commandline options for viewing stack info.
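For watching the same numbers from inside the program rather than from the console, here is a minimal sketch. It assumes a VM that ships the java.lang.management API (Java 5 or later), so it may not apply to older VMs. The class name GcStats and its helper methods are made up for illustration; only the MXBean calls are real.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: polls the standard management beans for GC activity.
class GcStats {
    /** Sums collection counts over all collectors; a collector may report -1 if undefined. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0L, gc.getCollectionCount());
        }
        return total;
    }

    /** Sums approximate accumulated collection time in milliseconds. */
    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0L, gc.getCollectionTime());
        }
        return total;
    }

    /** Bytes of heap currently in use, as reported by the memory bean. */
    static long heapUsedBytes() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed();
    }

    public static void main(String[] args) {
        System.out.println("GC runs: " + totalCollections()
                + ", GC ms: " + totalCollectionMillis()
                + ", heap used: " + heapUsedBytes() + " bytes");
    }
}
```

Sampling these once a second next to the FPS counter would give a rough garbage-per-frame figure. It still says nothing about stack usage.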

If you wanted to write your own monitoring tool, then some C/C++ and the JVMPI are your friends. Otherwise, it’s a case of shelling out cash for a commercial product. We use Borland’s OptimizeIt, as we’ve found it has the best reporting and back-tracing capabilities of all the commercial tools.

I want System.gc(long nanos) which would force garbage collection to run for as close to the requested duration as possible. I recall hearing that this wouldn’t be all that hard.
The idea would be to use it to eat up the extra cycles during a frame by forcing a tiny GC every frame rather than letting the garbage build up enough that the GC pause might be too long.
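Since System.gc(long nanos) does not exist, the closest approximation today is probably something like the sketch below: only request a collection when the frame has time to spare. This is a rough illustration, not a real time-bounded GC. System.gc() is just a hint and the VM decides how long it runs. The class name, the 60 FPS budget and the 4 ms threshold are all assumptions picked for the example.

```java
// Hypothetical helper: asks for a GC only when the current frame finished early.
class FrameBudgetGc {
    static final long FRAME_NANOS = 16_666_667L;    // assumed budget, roughly 60 FPS
    static final long MIN_SPARE_NANOS = 4_000_000L; // assumed minimum slack worth using

    /** Returns true when enough of the frame budget is left to risk a collection. */
    static boolean hasSpareTime(long frameStartNanos, long nowNanos) {
        long spare = FRAME_NANOS - (nowNanos - frameStartNanos);
        return spare >= MIN_SPARE_NANOS;
    }

    /** Call once per frame after rendering; frameStartNanos comes from System.nanoTime(). */
    static void endOfFrame(long frameStartNanos) {
        if (hasSpareTime(frameStartNanos, System.nanoTime())) {
            System.gc(); // a request only; the pause length cannot be bounded from here
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        // ... update and render one frame here ...
        endOfFrame(start);
    }
}
```

Whether this helps or hurts depends heavily on the collector in use, so it would need profiling before anyone relied on it.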

[quote]I want System.gc(long nanos) which would force garbage collection to run for as close to the requested duration as possible. I recall hearing that this wouldn’t be all that hard.
The idea would be to use it to eat up the extra cycles during a frame by forcing a tiny GC every frame rather than letting the garbage build up enough that the GC pause might be too long.
[/quote]
That sounds like a very good idea. But you could do this in a slightly different way. You could create a global variable that holds how many objects in memory have no references. After each frame, test whether this variable is greater than or equal to another variable that holds the number of objects you allow, and if so, run the GC. You would want a predefined function that returns the number of objects with no references, so that you don’t have to increment the global variable inside every single function. Is there a function for this in the core APIs?


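public class Game {
   // Hypothetical counter: the game code would have to maintain this itself
   private static int numberOfAllocations;
   private static int limitOfAllocations = 1000; // illustrative threshold
   public static void main(String[] args) {
      initGame();
      while(true) {
         refreshGame();
         testGC();
      }
   }
   private static void initGame() { /* load resources */ }
   private static void refreshGame() { /* update and render one frame */ }
   // Request a collection once the counter reaches the limit, then reset it
   private static void testGC() {
      if (numberOfAllocations >= limitOfAllocations) {
         System.gc();
         numberOfAllocations = 0;
      }
   }
}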

There isn’t, and it’s a really bad design concept: it assumes a single application running in the JVM instance and breaks the security model badly.

Take a look at what Apple have with their JVM implementation and what Sun have coming down the pipe for JDK 1.5, where it will launch a single, shared JVM instance that can then have multiple apps loaded into it and run in parallel (i.e. the “java” command is now just a loader that grabs the existing JVM instance, starts up a new classloader, drops your new main into the classloader, and off it goes). A global var would be horrible for that, as now you have a global that contains stuff that may have nothing at all to do with your application… Worse than that, it now gives your application access to classes all over the place, including stuff it should never have access to, such as security managers and other applications’ protected memory spaces. And with reflection the security nightmare gets worse, because not only do you have access to other applications’ objects waiting to be GC’d, but now you can find out about them and execute methods, read variables, etc. That’s a far worse situation than in a standard native app, where you have to guess at the structure of the piece of memory that you’re randomly reading and hope that you’ve targeted the right place.

What about a personalised default package for each application? That is, everything public in the default package would be public only for that application. And what about a personalised protected space, where only the calling package would be allowed to do the nasty work?
BSS segments could be shared too, but nobody thinks that’s a big problem. My biggest problem with a multi-launcher was how to shut down that nasty JFrame and prevent System.exit at the same time.
A nice solution would be Runtime.addSharedProces(new SharedProcess(class or better file), this); //token for JVM saying this application would like to spawn another application and have control over its behaviour (forced termination, window closing and so on). When the spawned application calls System.exit(), it would just release its own resources, nothing else.

BTW 1MB / 5s ? What did you do?

To do that will require some rather large changes to the Java security model as well as the classloader system (the two are heavily intertwined).

As for the garbage, it’s entirely internal to JOGL and the swapBuffers() call. It’s creating complete copies of a number of internal structures, including some ByteBuffers somewhere. We’ve been watching it using OptimizeIt, but haven’t had the cycles to track down what the code is doing. Can’t remember the exact class names, but there’s a SampleBufferConstantValuePair or something like that that is the root of the problem.

[quote]Take a look at what Apple have with their JVM implementation and what Sun have coming down the pipe for JDK 1.5 where it will launch a single, shared JVM instance that can then have multiple apps loaded into it and run in parallel (ie the “java” command is now just a loader that grabs the existing JVM instance and starts up a new classloader and drops in your new main into the classloader and off it goes).
[/quote]
Apple has no such thing.

I think you are confusing this with the role of the “shared archive”, which allows many of the JRE classes to be placed in a “pre-loaded” format on disk and then perhaps mapped to a shared memory region. As I understand it, there is still a new VM for every program, and the java command does the same basic stuff that it does everywhere else in terms of starting a new VM.

Ok, that’s interesting because it’s not the understanding I’ve gotten from chatting with Apple people, as well as some threads over on the old advanced-java lists. Not that I’ve done any serious Mac-based development to confirm or deny though :slight_smile:

What they’re on about here is that “application separation” JSR, I forget which one it is; but it’s a proper JSR to get the JVM to run partitioned applications with varying degrees of application isolation. I don’t expect to see it any time soon… but the shared archive stuff is a step in the right direction. It’d be nice if they had a few extra packages in some other shared archives like AWT and Swing…

Cas :slight_smile:

Isolates are JSR 121 http://www.jcp.org/en/jsr/detail?id=121

The ts3207.pdf from JavaOne 2004 describes the Barcelona project (a multitasking JVM) and the Isolate API, unfortunately without a delivery date.

Too bad this is research-only; the charts show an impressive startup time reduction.

I’m wondering what the benefit of escape analysis would be over the eden space of a generational GC. As far as I can tell, they both pursue the same goal.
The only difference between them is that escape analysis has to trace data paths through the running code, while a generational GC has to find unreferenced instances in eden. Apart from that, their work is the same.

I think cheap allocations are actually possible - even when generating a decent amount of garbage - if data dies fast, just as stack allocation implies. I’m really not sure that adding a second mechanism would help enough to be worth the time spent implementing it.

Or maybe they’ve got confused about the “Shared VM” technology?

It sounds like a VM that is, err, “shared”, but in actual fact it’s about storing a single copy of unchangeable core classes in RAM, and allowing all JVMs to access it. Saves a bit of RAM and a bit of startup time (assuming you’ve already gone through one slow startup), but nothing that spectacular.

(amused) I suggested to my professor in my first year at uni that it would be a good and worthy project to write an NT service (yeah, that long ago ;)) that ran an “always-on” JVM, and could easily be restarted (kill JVM, load a new one) if it crashed (as was likely back then in 1.1.x days). He agreed, and said some others had proposed it too, but no-one ever got around to trying it.

Here we are, years and years later, and java devs still want it and STILL haven’t got it :(. And nowadays we have ClassLoader stuff that would facilitate new possibilities on the implementation side.

Now, if Sun took a leaf out of MS’s book, we’d have had it ages ago - because it’s a great way of making java “seem” faster (IIRC MS did this with MSIE and with Office - preloaded them during boot to make them seem to load much faster than competitors’ products (although maybe it’s just the fact that parts of MSIE and Office have been migrated into the OS, achieving the same effect, but even more cunningly). Smart move…albeit annoying if you don’t use them all the time!)

Blah, it’s a nice idea and one that we’ve probably all had at some point. But the language just wasn’t designed with something like that in mind, and it’s not so trivial to implement.

Just off the top of my head: There needs to be some kind of “application context” in which statics can exist, and a new “trans-application” static concept. Threads will need to be re-implemented as an IO blocking/timeslicing hybrid. Package-level visibility needs to consider applications in the same package that attempt to interact.

However, it is a fascinating problem, and I’m surprised that no university has generated a tech concept as part of someone’s PhD thesis. It’s exactly the kind of problem that academic types like to see. ;D

[quote]I’m wondering what would be the benefit of escape analysis over eden of generational GC.
[/quote]
Sorry, missed this comment before.

Yes, quite right - the Eden-space concept looks rather like a stopgap in between old-style memory management and proper escape analysis. And from what we see, it works very well.

The only real difference is that escape analysis knows ahead of time that an object is essentially divorced from the heap, while the Eden space can only find out at runtime. I’m sure that escape analysis will be slightly more performant, but suspect that the difference will only really be noticed in a very limited number of situations.
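To make the distinction concrete, here is a small sketch of what “escaping” means in practice. The class and method names are invented for the example. Whether a given VM actually avoids the heap allocation in the first method is an assumption that depends on the JIT; the code only shows which objects are candidates.

```java
// Hypothetical example: one allocation that never escapes, one that does.
class EscapeDemo {
    static final class Vec {
        final double x;
        final double y;
        Vec(double x, double y) { this.x = x; this.y = y; }
    }

    static Vec last; // anything stored here is reachable from outside the method

    /** tmp is never stored or returned, so escape analysis could prove it method-local. */
    static double lengthSquared(double x, double y) {
        Vec tmp = new Vec(x, y);
        return tmp.x * tmp.x + tmp.y * tmp.y;
    }

    /** The new Vec is published through a static field and the return value, so it escapes. */
    static Vec remember(double x, double y) {
        last = new Vec(x, y);
        return last;
    }

    public static void main(String[] args) {
        System.out.println(lengthSquared(3.0, 4.0));
        System.out.println(remember(1.0, 2.0) == last);
    }
}
```

Without escape analysis both allocations land in eden, and the first one simply dies there at the next young collection, which is the runtime discovery described above.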

Me, I’m very happy with the GCs that exist today. And I’m sure things will get even better in the future!

Yup :). A friend and I came up with something like 2 or 3 new keywords you’d ideally want to introduce. IIRC, the idea was to say that encapsulation in java goes:

  • method-local

  • object-local

  • class-local

  • thread-local (NB: since then, Sun did actually add this to java, but not as a keyword - it IS treated specially though, in ways that you cannot simulate with pure java code (because it is hacked/hooked into the Thread source))

  • jvm-local

  • host-local (any JVM on this machine; not to be transferred to other machines if using a distributed system, e.g. a transparent DOA (like RMI, but transparent))
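As an illustration of the thread-local level in that list, here is a minimal sketch using java.lang.ThreadLocal. The class and method names are hypothetical, and ThreadLocal.withInitial is a later convenience (Java 8) than what existed when thread-locals were first added.

```java
// Hypothetical example: a counter that is "static" per thread rather than per class.
class ThreadLocalDemo {
    // Each thread lazily gets its own one-element array as its private counter.
    private static final ThreadLocal<int[]> COUNTER =
            ThreadLocal.withInitial(() -> new int[1]);

    /** Increments and returns the calling thread's own counter. */
    static int increment() {
        return ++COUNTER.get()[0];
    }

    /** Runs increment() once on a fresh thread and returns what that thread saw. */
    static int countSeenByNewThread() {
        int[] seen = new int[1];
        Thread t = new Thread(() -> seen[0] = increment());
        t.start();
        try {
            t.join(); // join makes the worker's write to seen[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        increment();
        increment();
        System.out.println("new thread sees " + countSeenByNewThread());
        System.out.println("main thread sees " + increment());
    }
}
```

The fresh thread always starts from its own initial value, no matter how often other threads have incremented theirs.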

In the absence of the last two as actual compiler-recognized options, the plan was to assume that every class variable was jvm and host-local by default and implement that way. Nowadays, with a separate ClassLoader per invocation, can’t you forcibly separate static calls to the same class? I believe you can… (which is why ClassLoader is one of the worst named java classes; it doesn’t so much “load” classes as “namespaces” them).
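A rough sketch of that ClassLoader trick, for anyone who wants to try it. It assumes the compiled class file is reachable as a resource on the classpath and uses readAllBytes (Java 9+). All class names here are invented. It only demonstrates separate statics per loader, not any kind of security isolation.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo: the same class defined by two loaders gets two sets of statics.
class StaticsPerLoaderDemo {
    /** One static field; every loader that defines this class gets its own copy. */
    public static class Counter {
        public static int value = 0;
        public static int increment() { return ++value; }
    }

    /** Defines the named class itself instead of delegating it to the parent. */
    static class IsolatingLoader extends ClassLoader {
        private final String isolatedName;

        IsolatingLoader(ClassLoader parent, String isolatedName) {
            super(parent);
            this.isolatedName = isolatedName;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve)
                throws ClassNotFoundException {
            if (!name.equals(isolatedName)) {
                return super.loadClass(name, resolve); // normal parent-first delegation
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    String path = name.replace('.', '/') + ".class";
                    try (InputStream in = getResourceAsStream(path)) {
                        if (in == null) throw new ClassNotFoundException(name);
                        byte[] bytes = in.readAllBytes();
                        c = defineClass(name, bytes, 0, bytes.length);
                    } catch (IOException e) {
                        throw new ClassNotFoundException(name, e);
                    }
                }
                if (resolve) resolveClass(c);
                return c;
            }
        }
    }

    /** Calls Counter.increment() on the copy of Counter owned by the given loader. */
    static int incrementIn(ClassLoader loader, String name)
            throws ReflectiveOperationException {
        Class<?> c = Class.forName(name, true, loader);
        return (Integer) c.getMethod("increment").invoke(null);
    }

    /** Increments twice in "application" A and once in B; returns both results. */
    static int[] demo() {
        String name = Counter.class.getName();
        ClassLoader parent = StaticsPerLoaderDemo.class.getClassLoader();
        ClassLoader appA = new IsolatingLoader(parent, name);
        ClassLoader appB = new IsolatingLoader(parent, name);
        try {
            incrementIn(appA, name);
            int a = incrementIn(appA, name);
            int b = incrementIn(appB, name);
            return new int[] {a, b};
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int[] r = demo();
        System.out.println("A=" + r[0] + ", B=" + r[1]);
    }
}
```

Each loader ends up with its own Counter class object, so B’s counter is unaffected by A’s increments. Classes that stay with the shared parent loader still share their statics across both.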