Avoiding looping through everything

Well… I think, entity-wise, Starcraft and Revenge of the Titans are roughly similar - and I had to pull a few “tricks” like this to get Revenge running at 60fps. Even on my beast of an i7 in survival mode (which is when I first discovered the performance limitations of quadtrees)

Cas :slight_smile:

Riiight. So let’s say your extremely simplified code can be used as a baseline… (it can’t be, but let’s just assume it is)

If you scale it back to 200 units, it takes 7.24ms per frame. At 60Hz that’d be 434ms per second.

So to wrap up, even with only 200 units, you’d be wasting enormous amounts of CPU cycles. When you’d have 2 players with each 200 units, you’ll spend 868ms per second purely on basic visibility checking, while it could easily be 5ms instead.

My point was to Cero. Its same if you loop units and check if they “collide” with towers or loop towers and check if they “collide” with units. Same amount of method calls. Grid aproach is really nice but I would want see the benchmark where there is 200units with bruteforce vs grid aproach. Grid algorithm scale much better but what is break even point?

Brute force is n2.

Still brute force is pretty fast with(only) 200 units and 200towers would be: 200*200 = 40000. Its hardly even costly. Was bored at work and I clocked that 10million range checks cost only 900ms.

It all depends on just how crappy a machine you want it to run on and what else you need to accomplish inside a frame. But still, it’s fun doing things the fastest way if you can :slight_smile: Revenge ended up having upwards of 1000 entities in it which is half a million collision checks with brute force, definitely way too slow, versus maybe only about 200 or 300 (yes, literally) when using grid cells. In other words I turned the absolute bottleneck which dragged my game down to 10fps into something that was so fast I couldn’t even measure it any more.

Cas :slight_smile:

Out by a factor of 10 :wink:

10 million checks would only be ~3K objects. The approaching 1 sec time you got would most likely explode in a real test due to cache trashing.

At our not yet published tower defence game units don’t collide with each others so there is no problem. Map size is upper limit for towers anyway so there is only 300towers and maybe 500units max. So 150 000 collision check maximum. This is still so deadly fast and use only 15micro seconds with my benchmark. Still might need to do it with grid algorithm when we are doing android version.

There’s not just towers but also projectiles to worry about too don’t forget. But yes, 150,000 collision checks isn’t really a lot, unless you really need those cycles for something else. Anything under a couple of milliseconds is going to leave you plenty of CPU to do graphics and other game logic. Or of course you can cheat at run at 30fps :wink:

Cas :slight_smile:

I silently assumed that this is dependent on cell transitions per frame, but thinking about it now, it seems unlikely that my approach is faster in a real world scenario. Maybe it could be made faster in certain cases when using a list implementation specially designed for fast add/removal with the help of storing the entities’ index in the entity itself (for faster lookup on removal), but it probably might not be worth it…

Looking at the source code of ArrayList

There appears to be quite different methods between remove object and remove index.
Would there be a way to perhaps grab the particular objects location and is the remove index number any faster than remove object function?


public E remove(int index) {
    rangeCheck(index);
 
    modCount++;
    E oldValue = elementData(index);
 
    int numMoved = size - index - 1;
    if (numMoved > 0)
    System.arraycopy(elementData, index+1, elementData, index, numMoved);
    elementData[--size] = null; // Let gc do its work
    return oldValue;
}

public boolean remove(Object o) {
    if (o == null) {
        for (int index = 0; index < size; index++)
            if (elementData[index] == null) {
                fastRemove(index);
                return true;
            }
    } else {
        for (int index = 0; index < size; index++)
            if (o.equals(elementData[index])) {
                fastRemove(index);
                return true;
            }
    }
    return false;
}
 
private void fastRemove(int index) {
    modCount++;
    int numMoved = size - index - 1;
    if (numMoved > 0)
        System.arraycopy(elementData, index+1, elementData, index, numMoved);
    elementData[--size] = null; // Let gc do its work
}

Initial tests of temporarily saving the index location during the iteration process appears to be about 5-10% faster.
myArrayList.remove(tempIndexInt) is 5-10% faster //remove int index
myArrayList.remove(this) //remove object

adding items to a removeme Arraylist to later remove that collection from the original list is slow.
adding items to a new list, to simply exclude “to be removed” items is even slower?

However, this isn’t a real world application, just a little test I setup. So I am not really sure how accurate setup is.

well remove(Object o) calls it by itself you edited your post =P

well just clearing works fine aswell

We still cheat with homing missiles or instant beams. If we change this then it might be worth change lot other things too. But have to make gameplay ready and optimize after that. Still running 900fps at desktop so no problems yet(libgdx is that effective.)

Unfortunately the remove index is just one part of the equation. To make this fast, it needs to get rid of data moving to fill the gap of an removal without slowing down addition.

I was more thinking along the lines of an unsorted Bag having a stack of empty slot indices, so you can have something like


add(Collectable obj)
{
  emptyStackPointer--   
  storage[ emptySlots[ emptyStackPointer ] ] = obj
  obj.setSlot( emptySlots[ emptyStackPointer ] ) 
}

remove(Collectable obj)
{
   slot = obj.getSlot()
   storage[ slot ] = null
   obj.setSlot( -1 )
   emptySlots[ emptyStackPointer++ ] = slot
}

This should be very fast, but clearly has problems:

  • it limits a collection to only store specific kind of objects
  • you can only store a Collectable in one collection
  • you need an emptySlots stack the size of the collections’ capacity
  • iterating the contained objects is more difficult

Now this really is premature optimisation! We’re talking about 300 removes and 300 adds per frame - and that’s only for entities that have actually moved! So no turrets, just mobs and bullets. It’s noise.

Cas :slight_smile:

The OP doesn’t mention a certain amount of entities.

Hence it’s perfectly valid to assume it’s massive and everything has to be done to queeze out the last drop of performance :point:

You could just use an array.

Otherwise here is some thought craft. Try storing the index within the object that you are storing. If you disallow random insertions and appending to the beginning (which is quite common for a lot of ArrayList usage), then you only have to update the objects indexes when an element is removed. You can skip that by storing null in an empty position, rather then compacting the array down, and store the location of the null element in an int array (a stack). When you next insert, read from the stack, decrement the size and then place the new object at that location.

If the arraycopy is slower then the stack read + size decrement, then this optimization should end up improving the speed. It does mean you are essentially removing based on reference (by telling giving each instance to hold a unique array index), rather then on equality. Presuming it totally works, I’d put money that any speed improvement is exceptionally trivial. You’d need millions (10s of millions?) of elements before you’d see any real gain.

Any real optimizations also depend heavily on what you are trying to achieve. For example are you iterating over this array to handle collisions? If so then you might consider a Grid or OctTree instead, as you can then avoid large sections of your objects.

Actually the emptySlot stack now seems like a bad idea after thinking about it - the limitations in iterating the contents might just be to costly. Better just use kappas Bag implementation http://www.java-gaming.org/index.php/topic,24203.0.html and store/update the index in the Entity if removal has to be as fast as possible…

Hey, thanks for all the comments on this! I’m not asking this in a particular context, but more as to how the problem should be solved the most efficiently, and keeps that when scaling. It’s not a matter if it makes the time, but what method is fastest, while offering the features needed in a given situation.

I very much like the idea of the grid. With that in mind, it seems wierd that quadtrees should be remarkably slower (at least to me). Can someone explain the reason for that? Also, would octrees be any good compared to quadtrees, in 2D space?

Finding the hypotenuse of a triangle gives pretty accurate results, right? But that expensive Math doesn’t need to take place when comparing the entities left, after we find out what’s in a given square in the grid, does it?


yDis = pos.y - targ.y
xDix = pos.x - targ.x
distance = yDis + xDis

I don’t see immidiate problems with this, and it is (…it should be) faster than doing the actual geometry math? :slight_smile:

EDIT:
Actually, I do! If xDis is negative, it might give unwanted behavior. A Math.abs(xDis), would fix this. Anyone know of the expensiveness (because that’s totally a word), of Math.abs?