Physics performance challenge to J.Kesselman...

[quote]First off, I hope my rather childish turn of phrase didn’t drive you away in any part. As swpalmer indicated above, maybe I picked up on a tone that was intended to be there in your posts. If anything, it’d be pretty useful to have an objective C++ games developer around.
[/quote]
+1 :-/
Not referring to you Kev, but now the sky seems to have cleared, I have to admit I had my share here too for a bit I’m afraid. I was basically trying to make (some of) the points princec and swpalmer have made in a much better way, but I just ended up annoyed and sounding like a dork instead. :-[
Sorry about that :slight_smile:

Although I still think the physics benchmark can be useful, maybe we could also gather as much profile data as possible (starting with our own games perhaps) so we can see in numbers exactly how important pure number crunching really is across different games, and so how much impact a, say, 20% performance loss really has in the whole picture.

EDIT: In my game, >70% of the time is spent in org.lwjgl.opengl.Window.swapBuffers, >15% in various other OpenGL functions, and the remaining <15% in everything else. This is on the client VM.
Now I have to admit there’s not much else going on than fancy graphics zipping by that you have to destroy, but hey, it’s an example :slight_smile:
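
For anyone wanting to collect similar numbers from their own game, here is a minimal sketch of a per-section timer. FrameProfiler and the section names are hypothetical (they are not part of LWJGL or any other library), and a real profiler will give a far more detailed picture; this only shows how a percentage split like the one above could be gathered by hand.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of LWJGL or any library: accumulates
// nanoseconds per named section of the game loop.
class FrameProfiler {
    private final Map<String, Long> totals = new LinkedHashMap<>();
    private long grandTotal;

    // Add an already-measured duration to a section.
    void record(String section, long nanos) {
        totals.merge(section, nanos, Long::sum);
        grandTotal += nanos;
    }

    // Time a piece of work with System.nanoTime() and record it.
    void time(String section, Runnable work) {
        long start = System.nanoTime();
        work.run();
        record(section, System.nanoTime() - start);
    }

    // Share of all recorded time spent in the section, in percent.
    double percent(String section) {
        if (grandTotal == 0) return 0.0;
        return 100.0 * totals.getOrDefault(section, 0L) / grandTotal;
    }
}
```

Wrapping the buffer swap, the other GL calls and the game logic each in their own time(...) call for a few thousand frames should be enough to see which share dominates.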

EDIT: Maybe I should start another thread for this…

[quote]But just imagine what would happen if there were benchmarks that showed 3D computation in Java were indisputably on a par with the very best C compiled with the very best C compilers, especially if the benchmark code were open so that all could see the comparison was fair?
[/quote]
Well that would be great for Java of course. But I don’t believe that will happen in the near future.
The key word is “indisputably”. There are simply fundamental differences in how things are done in a managed environment like Java versus a systems language like C/C++. Should the comparison be made with C/C++ using a garbage collector? Should the comparison be made using C/C++ with bounds checks on all relevant data access? The styles of coding must be different to code to the strengths of each language.

Good games that make money are what really matters, no matter how obsessed coders are with speed. In that sense the speed only has to be “fast enough”. There will always be a trade-off between safety and getting that extra 0.5%: Java chooses to go with the safety, C/C++ chooses to go with the extra bit of speed. The main argument you will find in these forums is that the extra bit of speed doesn’t give you a better game. Though it may enable a few cool tricks that impress people into giving up some cash - I think that may be valid, but how much money does pushing the envelope that way really make? I don’t know.

[quote][…]
Not being from the games community as it were, I’m really interested from a hobbyist point of view about this. Is the whole games community really obsessed with speed? I’d always assumed that first person shooters were oriented that way, but other games didn’t seem so bothered.
[/quote]
True. For first person shooters it’s at least 60fps. Jump’n’runs usually 30, racing games 30-60, beat’em ups 30 or 60, shoot’em ups [takumi style] 60fps/[konami style]30fps and puzzle games [with dirty rectangles] 0(zero)-30fps.

With hand drawn or pre-rendered animations you have to cap the framerate somewhere. If you reach that framerate on the targeted hardware, it’s fast enough. And it doesn’t matter if it needs 90% or 80% of the CPU power.

E.g. if you want to write a 3D jump’n’run with 2D mechanics which runs well (60fps) on 5-year-old mid-range hardware (P2 400, 128MB and some accelerator with 16MB), you could easily do that with Java+Xith3D. By the time you’re done, the recommended hardware will be 5.5 years old (creating the media will take more than half of the time).

Keep in mind that hardware usually doesn’t last longer than 6 years (the resistors on the mainboard will just die one day). Also, supporting hardware that old doesn’t make much sense: why should someone with that kind of hardware spend 20$ (shareware heh) on your game? He/she hasn’t even spent 20$ on hardware which is 2-3 times faster. (In the year 2000 I got: a big tower + temp controlled PSU, a Matrox Mystique, P200 + Asus mainboard, 64MB RAM, SB16 and a gameport card for 20$ - built a nice router out of it ;))

However, I think Java is mature enough for AAA titles. There are already a handful of games which use Java in some way. One important thing which was likely overlooked… it takes time to write a game (wow :)). 2-5 years down the road: machines with a handful of GHz, 1-4GB RAM, graphics cards with up to 2GB RAM. Using Java or C/C++… it just won’t matter. It’s the speed of the graphics card which will matter (it’s already like that in most games). Therefore Jeff’s statement was imo absolutely ok (it was meant to convince devs to start developing in Java).

Fine for benchmarking linear algebra etc, but I’ve found these are not always representative of 3D code, simply because large Vector arrays invariably throttle on the L2 cache-memory bottleneck. Of course the native code does this as well, but there are twice as many bytes to move in Java!

Andy.

[quote]… the native code does this as well, but there are twice as many bytes to move in Java!
[/quote]
huh? I’m not sure I follow where you are getting that factor of two from?

Simple:

class Vector3
{
    float x, y, z;
}

Vector3[] array = new Vector3[1000];
for (int i = 0; i < array.length; i++) array[i] = new Vector3();

The actual data per Vector3 (x, y, z) = 3 x 4 bytes = 12 bytes
VM object header per Vector3 = 2 machine words = 8 bytes
Object reference held in array = 4 bytes

That is 24 bytes per element, so total storage is 24000 bytes; it would have been 12000 in C++.
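
Spelled out as code, the arithmetic looks like this. It is only a sketch: the class name is made up, and the header and reference sizes are the 32-bit figures quoted in this post, so treat them as assumptions rather than measurements (they differ between VMs).

```java
// Figures are the ones quoted in the post (32-bit VM); they are
// assumptions, not measurements, and differ between VMs.
class FootprintEstimate {
    static final int FLOAT_BYTES = 4;
    static final int FIELDS = 3;          // x, y, z
    static final int HEADER_BYTES = 8;    // two 4-byte machine words
    static final int REFERENCE_BYTES = 4; // slot in the Vector3[] array

    // Bytes for n Vector3 objects held in a Vector3[] (array header ignored).
    static int javaObjectArrayBytes(int n) {
        return n * (FIELDS * FLOAT_BYTES + HEADER_BYTES + REFERENCE_BYTES);
    }

    // Bytes for n packed x,y,z triples, as in a C/C++ struct array.
    static int packedBytes(int n) {
        return n * FIELDS * FLOAT_BYTES;
    }
}
```

For n = 1000 the two methods give the 24000 and 12000 bytes mentioned above.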

But Java programmers know not to do that for large data sets.

Something like this

class VectorArray
{
    private FloatBuffer floats;
    public Vector get(int i)...
    public void set(int i, Vector v)...
    public float getX(int i)...
    public float getY(int i)...
    public float getZ(int i)...
}

could be done, which basically has the exact same memory requirements as your C/C++ array. Yes this is sometimes inconvenient, but a lot of the time you might be sending vertex arrays to OpenGL or some such thing, so data needs to be formatted appropriately.
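
A minimal runnable sketch of that idea, under a few assumptions: the class is called PackedVectorArray here to make clear it is not the exact class outlined above, it takes the three floats directly instead of a Vector object, and it uses a heap FloatBuffer for simplicity (a direct buffer would be the more likely choice when the data goes to OpenGL).

```java
import java.nio.FloatBuffer;

// Hypothetical sketch of the VectorArray idea: x,y,z triples packed into
// one FloatBuffer, with no per-vector object header or reference.
class PackedVectorArray {
    private final FloatBuffer floats;

    PackedVectorArray(int count) {
        // Heap buffer for simplicity; a direct buffer would be the usual
        // choice when the data is handed to OpenGL.
        floats = FloatBuffer.allocate(count * 3);
    }

    int size() { return floats.capacity() / 3; }

    // Absolute get/put calls leave the buffer's position untouched.
    float getX(int i) { return floats.get(i * 3); }
    float getY(int i) { return floats.get(i * 3 + 1); }
    float getZ(int i) { return floats.get(i * 3 + 2); }

    void set(int i, float x, float y, float z) {
        floats.put(i * 3, x);
        floats.put(i * 3 + 1, y);
        floats.put(i * 3 + 2, z);
    }
}
```

The storage is 12 bytes per vector plus one buffer object for the whole array, which is where the "same memory requirements as C/C++" claim comes from.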

There are requests for enhancement (Cas’ “Structs” RFE) designed to address this issue specifically.

(edit)… Oh and I forgot some of the features you are missing to get the equivalent of what Java would have. You need Run Time Type Information for your Vector3 struct… that means bloating the struct by an extra 4 bytes on 32-bit systems.

Calculating object size is not so simple in Java.
For an in-depth explanation, see
http://java.sun.com/docs/books/performance/1st_edition/html/JPRAMFootprint.fm.html

However, your example indeed would be larger than a C struct, but NOT necessarily a C++ class of the same fields.

Of course this is dealt with the same way as it would be in C++: make float arrays as stated above.

I never have large lists or arrays of vector or matrix classes in my run-time. There are instances here and there but they are used as temporary holders when it’s convenient, such as in 3D loaders.
The run-time data is nearly all ByteBuffers since most is going over the pipe to graphics cards.
The exception is animation and particle data, which is float arrays, vectors/matrices, or other objects.

Make no mistake, Java takes up more memory than typical C++ classes. But then that’s more than C structs, and even more than pure, crisp, clean Assembly. And really Assembly is just the poor man’s hex.
And anyone writing hex really should be entering binary with a binary keyboard - http://www.worth1000.com/view.asp?entry=84104&display=photoshop

In reply to the last 2 posts:

I also tried unformatted float arrays/native buffers. It improved things marginally when I hit the memory wall, but killed things once the buffer was in the cache. This means coding differently based upon whether I expect the buffer to stay in the L2 cache between game loops or not :-/

Unfortunately the stuff I am working on cannot just fill a native buffer once at the start of the program - my geometry is dynamic.

And as I don’t use my Vector3 class polymorphically and I turn RTTI off, the C++ data structure is exactly the same as the C struct. The Java object costs I mentioned came from the latest HotSpot VM docs.

EDIT - I just compared some C# code that uses structs to see what this out-of-cache performance is like with value-type arrays, and it turns out it takes only 91% of the time of a really good C++ program (without assembly) to process one of my Vector3 arrays! :o (Falls over in shock…) The guys at Sun really need to get cracking on ‘structs’ or whatever they are going to call them.

Shawn - keyboards are for wimps - what’s wrong with plugging bits of wire into a breadboard? ;D

So is mine. Why would you need to fill a native buffer once at the start of the program?
You can change native ByteBuffer data after creation.
What’s the problem?
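
To illustrate, here is a rough sketch of refilling one native buffer every frame. DynamicVertexBuffer and refill are hypothetical names, and it assumes the frame’s geometry has already been computed into a float[]; only standard java.nio calls are used.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

// Hypothetical sketch: one direct buffer allocated up front and refilled
// every frame with freshly computed vertex data.
class DynamicVertexBuffer {
    private final FloatBuffer vertices;

    DynamicVertexBuffer(int maxFloats) {
        vertices = ByteBuffer.allocateDirect(maxFloats * 4)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
    }

    // Overwrite the buffer contents; no new allocation takes place.
    // Throws BufferOverflowException if frameData exceeds maxFloats.
    FloatBuffer refill(float[] frameData) {
        vertices.clear();          // position = 0, limit = capacity
        vertices.put(frameData);   // bulk copy of this frame's geometry
        vertices.flip();           // limit = floats written, position = 0
        return vertices;
    }
}
```

The same FloatBuffer instance comes back from every refill call, so dynamic geometry costs a copy per frame but no per-frame allocation. Whether that copy is cheap enough is exactly the cache question raised above.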

So does that mean the developers should also be trying every C++ compiler out there for every platform they target to make SURE they are getting the absolute performance out of their code? :wink:

[quote]The fact is that the games community is obsessed with speed, which probably comes from too many 90 hour weeks trying to get their frame rates from 15fps to 60fps whilst not being allowed to simplify the game.
[/quote]
Amazing then that most RTSs clock in around 30fps or lower. Even the mighty (and quite awesome) Dungeon Siege clocked in between 18-26FPS on high end systems. Ah! You must be thinking about the FPS crowd :slight_smile: While we like to believe that the industry is completely speed obsessed, the fact remains that it is not. Look at the BIGGEST selling games of all time and you will see more games that are not speed intensive than speed hooked games.

In fact, just look at Renderware. GTA and Tony Hawk? Not the most complex environments, high poly models, high frame rates, etc. compared to games like GT4 for example. It all boils down to what you are trying to build.

-ChrisM

[quote]Amazing then that most RTSs clock in around 30fps or lower. Even the mighty (and quite awesome) Dungeon Siege clocked in between 18-26FPS on high end systems. Ah! You must be thinking about the FPS crowd :slight_smile: While we like to believe that the industry is completely speed obsessed, the fact remains that it is not. Look at the BIGGEST selling games of all time and you will see more games that are not speed intensive than speed hooked games.
[/quote]
Must be that we Brits are far too obsessed with racing games :wink:

Of course it depends on what you are doing. And for my part, I’m not obsessed with the frames-per-sec (we seem to be quite happy with 24 at the cinema, but then they have motion-blurring!), but with trying to do a lot of complex maths in the tiny amount of time I have per-frame.

Anyhow Chris, a much more pertinent question: why don’t Apple have a Server VM on the Mac? Seems like a daft thing for Sun and Apple to allow to happen (unless Sun is viewing the Apple X-Serve as a threat ;))

A.

[quote]Anyhow Chris, a much more pertinent question: why don’t Apple have a Server VM on the Mac? Seems like a daft thing for Sun and Apple to allow to happen (unless Sun is viewing the Apple X-Serve as a threat ;))
[/quote]
Shouldn’t that be a question for Apple? In any case be sure to file a bug report with Apple about this. That’s apparently something they use to prioritize their work. I know that they are getting the message with regard to the poor graphics performance in their JRE… but I’m not so sure that they have got the message with respect to needing a server VM.

Anyway… above you quoted some 91% figure with C# and I didn’t quite get what you were trying to say? Are you saying that C# was faster than C++ for that case? (and Java is slower, thus making it look that much worse?)

In any case… enough of this talk… why not come up with a benchmark program and we can see how it fares? The guys here can help to make sure it is coded properly and we can try our best to make it a fair test. Realistically we should track the development time and all that crap to get the whole picture… but since right now you are only interested in execution speed, it will make things simpler if we concentrate on that.

BTW,

If anyone wants to see the “fabled” Java FPS for themselves, check out this video clip from the 2001 GDC: http://www.imilabs.com/media_Jamid.htm

As well, you can see pieces of the fighting game and F! racer (which was demoed AGAIN at this year’s GDC). Lastly, you can read this interviewer’s impressions of said games.

http://archive.gamespy.com/gdc2002/jgp/

-ChrisM

[quote]BTW,

If anyone wants to see the “fabled” Java FPS for themselves, check out this video clip from the 2001 GDC: http://www.imilabs.com/media_Jamid.htm

As well, you can see pieces of the fighting game and F! racer (which was demoed AGAIN at this year’s GDC). Lastly, you can read this interviewer’s impressions of said games.

http://archive.gamespy.com/gdc2002/jgp/

-ChrisM
[/quote]
Looks ok, but until I can try it on a selection of hardware I can’t draw any conclusions. I presume that, as you posted a video, said racer is not available for download?

A.

Sure it’s a question for Apple, but I’m sure Sun also know the answer (strange silence every time I ask the question though). I can’t imagine the fact that Apple were going to run a server product (WebObjects) on a client VM didn’t get picked up by the guys at Sun (you want to do what!!!). And as it’s Sun that’s pushing the cross-platform abilities of Java gaming, I’m asking them.

Yes, C# was fastest. Run time relative to C++ (lower is better):
C# 91%
C++ 100%
Java ~200%

When I’ve got some time I’ll think of a benchmark to post, I’m sure you guys will rip it to shreds :wink: Can’t send any of the stuff I’m working on though.

A.

Let’s just say that Sun and Apple are working together, and we have a good relationship with them. However, these are 2 big companies setting their agendas moving forward. Not with specific regard to Apple and Sun (because I have not sat in the Apple/Sun discussions), but sometimes the reason technical things don’t get done has nothing to do with technology at all and let’s leave it at that. :slight_smile:

With regard to the release of the benchmarks, benchmarks are just like salesmen: they are what you want them to be. Not slamming what you are proposing, but it is what it is. From Futuremark to DS Benchmarks, they have all been proven to have been skewed in one way or another.

Lastly, with regard to not being able to see the work you are doing that you would pull the benchmark from, how would we know it’s valid? This is the same argument you gave here with regard to the FPS demo. At least we could show the video clip.

But that’s ok and cool with me. Why? Because I know WHY you can’t show the code. It’s internal work and there are some things you just can’t show to the general public. I fight that all the time. It would be great if everyone’s motives were pure so we could share some things we would like to, but that’s not the case.

At the very least, this has been a very interesting thread :slight_smile: Glad to have you here in the community and hope you have some fun here as well :slight_smile:

-ChrisM

Chris, I don’t suppose you could make some noises and throw up a tantrum about our need for Structs could you? I don’t care how exactly they’re implemented, just so long as they are implemented, in one form or another and work as fast and memory efficiently as their C++ or even C# counterparts.

Cas :slight_smile:

[quote]…but sometimes the reason technical things don’t get done has nothing to do with technology at all and let’s leave it at that. :slight_smile:
[/quote]
That’s what worries me…

[quote]With regard to the release of the benchmarks, benchmarks are just like salesmen: they are what you want them to be. Not slamming what you are proposing, but it is what it is. From Futuremark to DS Benchmarks, they have all been proven to have been skewed in one way or another.
[/quote]
Well my salesman would be demonstrating the need for structs! ;D

[quote]At the very least, this has been a very interesting thread :slight_smile: Glad to have you here in the community and hope you have some fun here as well :slight_smile:
[/quote]
Well at least a lively debate keeps us all from becoming complacent :wink:

A.

15 FPS is enough in most cases; 30-60 is nice and would be the common case. One of my test programs runs at 30 FPS in 1024x768 with 90 UPS.

I played Morrowind at a freaking 5-15 FPS. However, some people like to show off their computer to others and say “I ran xxx at 150 FPS”, and so on.

If you are using a flight stick with force feedback you should run at over 1 kHz UPS to create the right feeling.
Some things could theoretically make a difference between frames under 160 FPS.
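
For those wondering how the update rate (UPS) can differ from the frame rate, a common approach is a fixed-timestep accumulator. The sketch below is only an illustration of that general technique, not the code from my test program; FixedStepClock is a made-up name.

```java
// Hypothetical sketch of decoupling updates per second (UPS) from frames
// per second (FPS) with a fixed-timestep accumulator.
class FixedStepClock {
    private final long stepNanos;
    private long accumulator;

    FixedStepClock(int updatesPerSecond) {
        stepNanos = 1_000_000_000L / updatesPerSecond;
    }

    // Feed the time elapsed since the last rendered frame; returns how many
    // fixed-size updates to run before drawing the next frame.
    int advance(long elapsedNanos) {
        accumulator += elapsedNanos;
        int updates = (int) (accumulator / stepNanos);
        accumulator -= updates * stepNanos;  // keep the leftover for next frame
        return updates;
    }
}
```

At 90 UPS and 30 FPS this works out to three updates per rendered frame; at 1 kHz the same loop simply runs many more small updates between frames.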