Optimizing a QuadTree for enormous maps

Hello everyone, I’ve been lurking on this board for a bit and now I’d like to post a discussion topic, I hope I’m not making any faux-pas. I don’t think I’m stepping on any toes.

There have been discussions elsewhere about Quadtrees vs. Grids for optimized collision detection and the opinions of people who seem to know what they are talking about is:

  1. Grids save coder effort vs. Quadtrees
  2. Quadtrees are only useful for enormous maps with tons and tons of entities.
  3. The basic Quadtree implementation that you see on Wikipedia, forums, and elsewhere is unoptimized and modern quadtrees in real games are tuned to reduce the search size when looking for potential collisions.

I’ve got a few ideas for games that involve massive generated worlds, 2d spaces that will have coordinates into the millions, hundreds of millions, even potentially “infinite” if I can work out a good algorithm for an infinite map. Imagine something like a 2d EVE or Skyrim, so I’m working on an engine that will let me experiment and I’m definitely going to need collision detection across an enormous area. I think I need a quadtree.

So my question is what resources exist for optimizing giant quadtrees (specifically a region quadtree that looks for collisions between rectangular entities and map squares. I’ve heard it helps to keep everything in leaf nodes, but I don’t see how that’s possible. There seems to be an opinion that if there are really large numbers of entities moving around, rebalancing is needed, and in general handling entities moving from quadtree node to quadtree node dynamically isn’t discussed in the public presentation of trees that you see.

Second question, lets say my map consists of a regular arrangement of equally sized squares, perhaps broken down into a grid of regions that are dynamically generated as the player explores. What’s the best way to check for collisions between entities in the quadtree and grid squares. Should the map grid just be in the tree instead of in a separate grid? That seems like it could get out of hand for a huge map, but maybe I’m not thinking about the equal space and search requirements for a big enough grid and I should just keep all collision / proximity testing in the same data structure. Eventually it seems like there would be a need to load only parts of the map and corresponding areas of the quadtree at any one time.

As a reference point I’m looking at this quadtree implementation as an example of a “typical” unoptimized tree:
http://code.google.com/p/slick2d-tests/source/browse/SlickTutorials/trunk/src/de/myreality/dev/slick/collisions/Quadtree.java?r=3

Though there are others out there that are mostly similar. It does no rebalancing and can require a very deep search comparing many entities of many different levels. Also if the bounds on the outermost tree node are very large (say 2 ^ some significant power) and there are many entities clustered in a small local area, the tree will have to be divided into many many sublayers for a small number of entities and many nodes will be used to hold large empty areas.

Ideally, there’s a discussion in a journal, magazine, or textbook about this topic. I have access to a university library.

Thanks!

If your game level data is made of just axis aligned tiles, you don’t need quadtrees or grids to selectively render viewable tiles.

For dynamic entities, grids should be better. Quadtrees are fast to traverse but slow to make.

I’ve never used quadtrees for entity to entity collidions. Quadtrees shine when you want to collide a moving entity against a static level.

Everything useful in a quadtree are in its leaves. Non-leaf nodes are only there for traversal.

I’ve come across an article at gamedev.net about infinite terrains that is quite good it was used on a ps2 flight game. Forgot the link though.

Quadtree demo:

rel.phatcode.net/junk.php?id=89

Octree renderer:

rel.phatcode.net/junk.php?id=88

A visual tutorial of how these things are made:

rel.phatcode.net/index.php?action=contents&item=Tutorials

Scroll down. Sorry I’m using a phone to post this. Hard to get links.

Traversing (quad) trees is incredibly inefficient due to cache misses.

Anyway, I’m looking at it from a totally different angle: which space partitioning datastructure you use does not depend on scale, it depends on the number of entities. (not even their distribution, in this case)

If your entire universe has 10,000 entities, and you can make a reasonable assumption that they are not all clustered together, you’re best off with a simple grid of, say, 100x100 cells, so that in the average case, there is 1 entity per cell, and in worst case, maybe 100.

Just ask yourself: can my engine handle more than 10,000 entities with moderately complex AI and pathfinding at 60fps anyway? Most likely not, so that is your bottleneck, not your space partitioning and collision detection.

If you want to finish a game, pick your battles, don’t fight them all.

I will have:

A very large game world
Many entities
Dynamic entities moving between regions
Large discrepancies in how densely populated the world is, there will be regions thousands of viewports wide with no entities at all, and there will be regions a few viewports large with many hundreds of entities. I can’t assume an ideal or normalized distribution.

The objection to a quadtree that people offer in every thread that it comes up in is that it’s too hard to code and a grid works just fine 95% of the time. I am starting with an engine that is designed from the ground up to work with tens or even hundreds of thousands of entities distributed unevenly across a map that is hundreds of thousands of viewports large, or infinite if I can figure out how to do that.

I really don’t think a grid is appropriate, and the argument for a grid tends to run “it’s easier to code.” I don’t mind working with a somewhat complicated data structure. I’m a Computer Science student, I did well in data structures, and spending some time optimizing a quadtree is not something I’m afraid of. A little bit of logical puzzling appeals to me.

So I’ll reiterate my question: How best to optimize a quadtree for large numbers of dynamic rectangular entities arranged in an uneven distribution across a vast space, beyond the simple textbook implementation presented around the web?

Spatial partition is a daunting topic. The most important question really isn’t the size of the world or how many entities there are in it, but how are the entities expected to be distributed in the world. The more uniform the distribution will point toward some uniform grid based, the more clumped up points to . In the case of “big” worlds, where big isn’t really very big, you really don’t want to use a single “logical” data-structure but some forest of them. And if really big, then you want to do that anyway for precision reasons. It’s always possible to use multiple techniques to match the local distribution.

Somewhere in the recent(ish) past we talked about a huge world where virtually all the entities are going to be clumped near planets, etc. There’s zero reason to have any structure to represent empty space…just use some other technique (like the equivalent of an animation hierarchy) to represent how these local spaces are related to one another.

On everything in leaf nodes: This is so you virtually never need to walk the tree (instead of always). The problems this creates is identical to what you have when using a uniform grid. This in itself is a pretty big topic.

On matching a uniform grid to a quadtree. When they have an common edge, then a uniform grid is implicitly a region quad-tree of a given height.

No need. The only entities that need to regularly run AI are near the player. The key word for things far away (that actually “need” to think occasionally) is “cheat”.

Yes: assuming the implementation ignores modern memory architectures. Working around this is a major PITA…esp in java without structs & arrays-of-structs. Devil’s advoate mode: I’d guess that most uniform grid implementations don’t bake entity coordinates and extents directly in the node to prevent the need to (randomly) access the entity itself during broad-phase…nor lay-out memory is a friendly manner, say Z-curve.

[quote]So I’ll reiterate my question: How best to
optimize a quadtree for large numbers of
dynamic rectangular entities arranged in
an uneven distribution across a vast
space, beyond the simple textbook
implementation presented around the
web?[quote]

You shoul not use quadtrees or grids for this. A better solution is a kdtree.

Quadtrees are almost useless in this case since they’re unbalanced. It’s not much harder to code a kdtree (2 child nodes) if you already can code a quadtree.

OK, I’m looking at K-D Trees and this looks interesting, but I can’t find any reference that talks about how to use a K-D Tree to store rectangular regions rather than points. Can that be done?

various flavors of k-d trees can be nice. But you really can’t make sweeping statements like this about any data-structure without a ton of details. Why a kd-tree instead of say a vantage-point tree? Or a non-region quad-tree?

I did not make a sweeping statement. I just gave him what I thought is a better data structure. I’ve never heard of a vantage point tree.

@cartel: For simplicity, I just put a any region intersected by the node. You will have duplicates this way but it’s not much of a bottleneck. Just don’t make your regions overly big as to intersect more than 2 nodes on non-corner intersections.

OK, Roquen, toward the one optimization that has been actually stated in this thread, pushing all rectangles to leaf nodes, how does this work in relation to the tree implementation linked above?

What if an entity overlaps multiple leaf nodes? If the leafs are particularly small, a single entity could overlap even more than 4 nodes. How does the algorithm move entities between different nodes? What happens when nodes become empty? Is there any way to rebalance the tree?

@Cartel: That’s why I posted a kdtree. Kdtrees are “balanced” quadtrees.

Please read carefully what I said: I didn’t suggest your entities were uniformly spread, I made the assumption that they were not all clustered together. That means that some grid cells can be 100x as occupied as others, but given that 1 cell with 500 entities is still plenty fast to compute quickly, there is no need to subdivide the world even further. Therefore grid cells are adequate for your use case.

As far as I can see, your proposed game falls right in that ‘95%’ category.

You can’t do that, besides there is no joy in games of infinite size. Nobody appreciates it that your map is huge, if it takes three reallife weeks to travel to the nearest neighbour. Better set yourself some realistic targets, instead of wanting it all, for no gain.

Sure, it is interesting to experiment, and figure out performance characteristics of different data structures and algorithms. It’s just that your specific use case does not need anything more fancy than a grid, and you could (should?) spend your valuable time designing the game, making it work and finally making it fun, instead of making it 1% faster by using the cleverest algorithm imaginable.

Relminator, OK, is there example code or pseudocode out there for using a K-D Tree to store regions? It seems like a pretty large divergence from the way K-D Trees are presented around the web. Handling dynamic entities that are moving in and out of tree nodes frequently and overlapping multiple nodes is a problem that I don’t immediately see the solution to.

Riven, if my question annoys you, you can go somewhere else. Trees are used for collision detection in professional games, this is not a purely academic question, and optimization of such trees is a problem that has been well studied, I just can’t find the publications that discuss that optimization. Asking how best to optimize a spatial tree is a perfectly on-topic question for this forum. You don’t know the answer and rather than just not comment, you insist that I shouldn’t want to know the answer myself.

The quadtree and octree example I posted above has a basic source on how to store data at the nodes. That’s also the way how to store data on a kd tree.

Riven’s posts are actually good ideas. Your entities are dynamic(moving) every frame. While quadtrees are fast at querying where an entity is, building it in realtime will take a veeeery long time.

His suggestion of using grids for moving entities us the best posted so far.

Btw, building a kd-tree in realtime is also very slow.

just giving you some more ideas:

  • use a quadtree with a fixed depth, which enables you to get insertion time of O(1)
  • take a look at loose quadtrees(you will probably only find something for loose octree), which gives you a much better balanced tree

also abstract this implementation details away, so you can program in a way dependent from using either just a list a grid or quadtree. So you can just swap out different things to compare them

I have a 90% finished loos quadtree lying around if you want to take a look.

You are new here and Riven runs this place. He probably knows, what you wants to know, but he surely is more experienced than you - so maybe you should be listening and thinking about what he says.

But maybe you are not interested in finishing a game and more into “technology-wanking” - that’s OK, but you should be aware of that…

Despite people defending me (or the points I make), I think I prefer the role of ‘one of em’ rather than ‘the cute fluffy overlord’. The fact that I run this forum doesn’t really mean that I’m right, or need to be taken more seriously than ‘the new guy’ :slight_smile:

Having said that, I’ve written many quadtree and octtree implementations, that worked well, scaled way beyond my needs, but nothing was as performant and easy to write as grids.

IMHO spatial data structures like trees only really make sense in 3D games these days, as that additional dimension is a complete game changer, distribution wise.

I just came up with this approach for ‘keeping all entities in the leafs’, regardless of cell boundaries, provided that the entity bounding box is at most half the size of a grid cell:

Imagine this uniform grid, with empty cells:

Now imagine this uniform grid, with a few entities:

You’ll see that their bounding boxes sometimes overlap the boundaries of the grid cells.

How can we solve this?

We can move the grid, half a cell-size along an axis, to account for the overlap:

Now every possible entity, can fit in a cell of one of these four grids with the offsets:

  1. offset: (0,0)
  2. offset: (d/2,0)
  3. offset: (d/2,d/2)
  4. offset: (0, d/2)

These are the four grids rendered (roughly) on top of eachother:

http://indiespot.net/files/published/entity-grid-four.png

Inserting an entity in the datastructure, is trivial:

  1. put entity in cell in the black grid, if it fully fits in a cell
  2. if not, put entity in cell in the red grid, if it fully fits in a cell
  3. if not, put entity in cell in the green grid, if it fully fits in a cell
  4. if not, put entity in cell in the blue grid - it will fit!

For posterity, I call this mindblowing trivial approach:
the axis aligned offseted grid set (AAOGS for short) :persecutioncomplex:

(it even works in 3D)

To stay on topic: you can use this approach with (quad) trees too, although it’s a tad harder to generate the (four) sets of sub-nodes for each node, as they may define the same area as the offset sub-nodes sets of a neighbouring node. Knock yourself out, I’d say.

Just know that entities high in the hierarchy of the tree will drag down the performance of the whole datastructure. You should make very sure that every entity is as deep as possible in the tree. (with a grid, this is trivial, for obvious reasons)

Btw are there any nice open source examples for something like this ?
Because when writing grids I’m always worried I may not do it as fast as possible.