Coordinates representation: vectors vs int,int

DooM77 · March 3, 2009, 10:01pm

The need of processing some form of coordinates (tile coordinates, pixel coordinates and so on) is very common in game developing. My post is about comparing the papproach of using vector object (for example 2-dimensional vectors) with approach of using pairs of int (for example x,y) in every place we need to process coordinates.

//Vector version
public Tile getTile(Vector2 coords){
		return tiles[coords.getX()][coords.getY()];
}

//Pair of ints version
public Tile getTile(int tx,int ty){
	return tiles[coords.getX()][coords.getY()];
}

Here are my recent experiences with vectors. At some point I realized that most of my methods is having two overweighted versions: one taking vector parameters and second taking int,int parameters to express coordinates (like in example above).The rest of them are having only vector or ints version what is even worse because I have to remember all the time whether method 'fun` that I want to use is accepting vector or pair of ints as parameters. It was too big mess for me so I decided that I will unify the whole system and all my methods operating on coordinates will be using only vectors. After doing so, I realized that now a big number of Vector objects is being created so I added a lot of vector-specific methods in Vector class and implemented some decent vector pooling system - now the vectors are being reused so much less Vector objects has to be created.

The drawback is that I had to spend a lot of time for doing this and now I have to remember about returning unused vectors to pool if I want everything to work well. Moreover when I incidentally return to pool the object that is still being used, a lot of hard for debugging problems can take place.

So summing everything up, I’ve put a lot of time and effort and made my life harder and whats the profit? Maybe using vectors is more object-oriented and code that is using vectors is shorter and less cumbersome but still if I had stayed by int pairs, I haven’t need to implement pool system and vector algebraic functions and everything would work faster (while using pair of ints there would be no need to create special custom object like Vector at all and no need to invoke vector’s getters and setters all the time) and were much simplier.

I’m wondering what is your opinion on that? Wchich approach do you personally prefer and how could you ground it?

Riven · March 3, 2009, 10:15pm

Did you actually measure whether the creation of Vector2’s was slowing down your game?

Most likely there are other bottlenecks in your code, so that even if pooling your Vector2’s would be twice as fast, you still wouldn’t notice in your game, as, for example, drawing all the images takes 99% of your CPU cycles anyway.

DooM77 · March 3, 2009, 10:27pm

Nope, I haven’t done any profiler measures so far. Anyways, from what you’ve said I can understand that it’s worth to use vectors instead of int pairs when posibble because it’s more elegant and object oriented and gives no real performance penalties, because it will never be the bottleneck if drawing on screen and other expensive activities are performed. Am I right?

Riven · March 3, 2009, 10:42pm

I’m just saying the bottleneck is probably somewhere else.

If you’re running a physics engine with lots of bodies however, the bottleneck is probably in object allocation.

Measure it.

And remember: premature optimisation is the root of all evil.

ewjordan · March 4, 2009, 2:08am

+1 on that. #1 best way to optimize physics is to get rid of every damn vector creation, but that’s just about the only place I’ve ever seen it matter.

I agree completely. To give you a real world example of when you might have to worry, in physics applications I tend to find that vector allocations start to be a problem when you are doing over 30,000 or so per frame. Less that that, it’s usually not an issue. If you’ve got less than 10,000 per frame, any time you spend reducing those allocations is absolutely 100% wasted, don’t even bother thinking about it. Check out http://jbox2d.org/v2demos/ - JBox2d hasn’t completely eliminated allocations, so I put a count of Vec2 creations per frame, and different demos have various average numbers of allocs (hit right arrow to move between examples), so you can get a sense how it relates to performance. A lot of the performance drain in that is due to graphics, but if you run a demo with 50k allocs per frame and it’s ticking at 60 hz, you know that you don’t have to worry about 50k allocs in your application.

Again, though, I’ve never seen a non-physics application that needs 50k vector allocations per frame, so that’s probably not your bottleneck.

Demonpants · March 4, 2009, 2:52am

The main reason I use vectors:

Vector functions.

I have written my own vector class that has all the notable and useful calculations you might want to perform with a vector, like dot product, cross product scale, divide, etc. etc. If you’re doing any sort of vector math, having this is a must.

If you’re simply doing Array access, the only real reason to use a vector (although I would call what you have a Pair, not a Vector) is to have logically related information contained in one place. So your entity can have a Vector position variable instead of a positionX and positionY float.

So in conclusion if you’re just accessing X/Y pairs in an array, you don’t really have vectors and should be calling it something else. A vector is this --> or this <---- . It’s a directional line segment, in a way. They are very very useful in almost any movement/graphical calculation, but not especially important for array access or anything like that.

jezek2 · March 4, 2009, 5:30am

Good way to find out if allocations are affecting you too much is to enable GC logging. Use these arguments for JVM:

-XX:+PrintGC -XX:+PrintGCTimeStamps

This will show you when GC is occurring including timestamps. You can then count how many of them are per second, if just few, you’re perfectly ok, hotspot can handle many many short-lived allocations.

When working on JBullet physics engine, there were about 10s (maybe even 100s, didn’t count exactly) of minor GCs per second, and hotspot actually did well even at that rate. For final application you could leave that probably, but as JBullet is a library, you must need to count for other libraries that would be used in final app, and then it could be really over limit.

So I did a library for easy to use stack allocation (well really object pooling in heap, but behaving like stack), it has really simple and easy to use API, the only downside is that it needs bytecode instrumentation step in your build script (ant task provided). It’s available here.

Riven · March 4, 2009, 5:50am

There is a different between allocation time and GC time.

GC time can be minimal (if most of your objects are in Eden) even when the allocation time overhead is significant.

pjt33 · March 4, 2009, 9:41am

A lot of the time I’ll use a single int and pack x and y into it as val = x + y * WIDTH. I’ll also use a single-dimensional array. You need to know the width of the grid you’re representing, but it’s easy to extract stuff from the array, easy to pass around, easy to pass back from a method (which is the major drawback of the pair-of-ints approach), and doesn’t require any allocation.

DooM77 · March 4, 2009, 9:56am

First of all, thank you guys for a great feedback, it was actually very helpful for me.

It’s an interesting approach. Actually easy returning restult from methods was important reason for me to use vectors. However operations on packed coordinates may be quite cumbersome (you have to do unpack-process-pack cycle each time) and hard to read. But I will remember about such a possibility.

[quote]Measure it.
[/quote]
For what I was able to do with a simple profiler I could only find out that there was almost no difference between taken heap memory size with vector pooling system enabled and disabled. Moreover it seems that at this point application with vector pooling enabled seems to need a little more memory (cached vectors are locked for GC) though. I need to learn how to use more advanced profilers to check how it influences CPU business.

For clearing it up for me: heap memory is where objects (instances of classes) are residing and non-heap memory is where classes are loaded to, tu put it simply. Am I right?

[quote]I have written my own vector class that has all the notable and useful calculations you might want to perform with a vector, like dot product, cross product scale, divide, etc. etc. If you’re doing any sort of vector math, having this is a must.

If you’re simply doing Array access, the only real reason to use a vector (although I would call what you have a Pair, not a Vector) is to have logically related information contained in one place. So your entity can have a Vector position variable instead of a positionX and positionY float.
[/quote]
Yes, I know about it, and I’ve implemented some of vector operations in my Vector class. I put such examples where it seems that I’m treating vectors just as a pair of int for indexing arrays only to make example possibly the simplest. However I/you could with equal result implement these vector operations not in Vector class but in some static methods/singleton utility class taking (int,int) as agruments ant performing equvalent operations and all these operations would be kept in one place. That wouldn’t enforce us to creation of Vector objects unlike keeping these operations in Vector class. So as I can quess, it’s all about the clarity ant good style of programming.

jezek2:

Good way to find out if allocations are affecting you too much is to enable GC logging. Use these arguments for JVM:
-XX:+PrintGC -XX:+PrintGCTimeStamps
This will show you when GC is occurring including timestamps. You can then count how many of them are per second, if just few, you’re perfectly ok, hotspot can handle many many short-lived allocations.

Since my game is in early development stage, garbage collections in generated log are really rare. There are some at game initialisation stage and later only when I enforce the game to perform some more activity through input (regardless of pooling system being enabled/disabled). I think that there has to bo some smaller GC’s that aren’t logged because it’s impossible for me that GC can be done so rarely. Forgive me my ignorance, but could you explain me this log syntax? When we take such a line:

[java] 1.580: [GC 1522K->707K(5056K), 0.0025902 secs]

My guessing is that it means: minor (as opposite to “full”) garbage collection has taken place after 1.580s of application start, it decreased a heap memory from 1522K down to 707K (but what is that 5056K?), this garbage collection lasted and stopped program execution for 0.0025902 seconds. Is my guessing OK? Also thanks for the ling to library, maybe I’ll find it useful in later development.

Just another ignorant question… What does it mean that objects are in Eden? I am really interested.

[quote]I agree completely. To give you a real world example of when you might have to worry, in physics applications I tend to find that vector allocations start to be a problem when you are doing over 30,000 or so per frame. Less that that, it’s usually not an issue. If you’ve got less than 10,000 per frame, any time you spend reducing those allocations is absolutely 100% wasted, don’t even bother thinking about it. Check out http://jbox2d.org/v2demos/ - JBox2d hasn’t completely eliminated allocations, so I put a count of Vec2 creations per frame, and different demos have various average numbers of allocs (hit right arrow to move between examples), so you can get a sense how it relates to performance. A lot of the performance drain in that is due to graphics, but if you run a demo with 50k allocs per frame and it’s ticking at 60 hz, you know that you don’t have to worry about 50k allocs in your application.

Again, though, I’ve never seen a non-physics application that needs 50k vector allocations per frame, so that’s probably not your bottleneck.
[/quote]
Thanks a lot! Finally I’m getting some picture in my mind and intuition about cost of creation of small objects. And you’re right, I kind of wasted my time trying to optimise it but it was a chance to learn a lot about performance issues.

Again, sorry for my noobish questions and thanks for a great help.

Orangy_Tang · March 4, 2009, 10:39am

Heap memory is divided up into different “generation”, and objects are placed within the generations based on how long they’ve existed for. New allocations are made out of the Eden generation, which is for short-lived objects. Periodically Eden will be garbage collected and any objects still live will be promoted to the next generation.

The idea with this is that it’s quicker to garbage collect a smaller area of memory, especially when most of the contents will be dead and can be removed. So your temporary vector objects will get wiped before they ever get out of the Eden generation, making the whole process very fast. The above is massively simplified, google for more.

IMHO, if you’re expecting to run on 1.5 or later then the garbage collector is freaking fast. While I wouldn’t encorage throwing objects around willy and, indeed, nilly, unless you know you’re creating an object in a really tight inner loop then go with whatever is easiest to write now and come back and change it later.

Edit: Oh, and I wouldn’t ever try and pool vector objects myself, just because the bugs it can throw up are hideous and hard to track down. If I really need a vector or temporary object but don’t want the overhead I usually just make a static object and allocate it once and reuse it. While that still can go wrong it tends to produce much more localised bugs and be much easier to track down.

Riven · March 4, 2009, 3:54pm

The Garbage Collectors in just didn’t scale very well, if your heap got larger than a a few GB.

Hurray for the new G1 (GarbageFirst) collector, it almost scales infinitely, but is a tad slower on average, yet reduces the full GC pauses, in ‘normal’ sized heaps.