Optimizing a QuadTree for enormous maps

Riven · April 30, 2013, 8:18pm

Cas won’t let me write optimized code anymore :emo: :point:

Only worry about noticeably slow code, not inefficient code in general.

princec · April 30, 2013, 10:25pm

Thought I’d chip in… quadtrees are rather a lot slower than grid cells. They might be more useful when you’ve got entities which vary in size from “huge” to “tiny” but if they’re all within a certain size range, grids are the way to go.

Cas

Danny02 · May 1, 2013, 8:43am

@riven
that what you describe is excatly what these loose quad/oct -trees are all about^^

update:

here my little experimental loose quadtree implementation.
beware^^:
-it’s in scala
-the project is just a mix of multiple experiments, you only need to look at the one file
-knearest search isn’t implemented only collision check(circle or AABB)

Riven · May 1, 2013, 9:57am

your point being?

cartel · May 1, 2013, 4:22pm

Hey, sorry, I had to run off yesterday. That time of day where they pay me was calling.

Riven, sorry if I have come off as being a bit petulant. But I feel like I’ve been basically told to shut up, when I have a question that is reasonable. Yes, as people have suggested, my interests are more in the theory behind game development than in an actual, practical game. I’ve got a summer between semesters, no relationship with any artist, and little likelihood of finishing a game that can be released in any meaningful sense. But I am still interested in this world of coding, and I am curious about how space is managed in a theoretical game where the maps are large and the entities are arranged in a possibly degenerate way. Yeah, that’s more theoretical than practical.

And maybe that makes my interests off-topic for this board. Fair enough. But my question remains, quadtrees are used in real games by real industry developers, they are used in modified and optimized ways, and discussion about how they are optimized is not lost in some bizarre realm of fictitious coding. If anyone has access to articles or documentation about this particular usage of this particular data structure (which I KNOW has to be out there somewhere) I’d appreciate it, but if not I feel no deep regret letting this thread die a natural death.

I’d be interested to see any journal articles, textbook examples, or other published discussion of how the standard textbook quadtree is optimized for performance in scenarios that actually arise in real games. I’m sure there are industry writers who have covered this, but industry doesn’t often let its knowledge out to the public. There’s probably a good academic discussion as well, but academia is barely any better about being publicly accessible. So I was wondering if anyone around here had had personal experience with the ways this particular data structure is optimized for real-world use.

Anyway, I don’t want to come off as a jerk. I’ve got some other avenues I can follow for information about how academics and professionals have optimized a particular data structure, and I’ll let this thread drop. I code toy games compulsively and I’ve got a little platformer / adventure project going in addition to my more theoretical Massive 2d game, so I’ll probably have something less abstract to contribute to the forum in the future.

Thanks to everyone. Sorry if I’ve stepped on toes.

princec · May 1, 2013, 5:20pm

69WbIEEs288

Heheh

Back to theory:

in this theoretical system with infinite size you have a requirement to test for collisions between a theoretically infinite number of entities in some bounded time. A little thinking here would tell us that you cannot practically have infinite anything in a game, so you have to nail down both infinity requirements to something more realistic and put some actual numbers on it. Then we’re on to the questions of how much time?, how many entities, and how large is the space in which they live?, with the optional question of what is the size range that these entities occupy, relative to the universe? So let’s have these figures thrashed out in concrete first.

Cas

Roquen · May 1, 2013, 6:17pm

Region quad-tree’s are greedy…there is no re-balancing. As I mentioned earlier, uniform grids and region quad-trees are related…so it’s common that a technique useful for a UG to be equal useful to a RQT. Now it’s highly likely that by a wide margin the most important spatial query to optimize is collision detection. So how to store an entity in a single cell in a UG and perform collisions? The easiest IMHO is to consider the upper-left extent (in a AABB manner) and store the entity there. Then perform a grid-based sweep and prune:

Quoting myself from this thread:

This applies to RQT in the same manner, you just walk all of the leaves (actually the spatial ordering on the walk doesn’t really matter…to just more clear in the UG case as presented). Of course “the devil’s in the details” and this can be coded up in various ways and there are variations on the same theme.

This scheme does not work for other spatial queries however. And some auxiliary data needs to be stored to handle them. How? Depends on a number of factors…and of course this is just a single possible direction.

Roquen · May 2, 2013, 9:21am

Let me semi-address your question of references. The trick is that I’m not aware of much “new” that pertains to the problem. The speed issue of UG and RQT is solely an implementation detail related to being cache obvious. Back when I had anything to do with academia (and dinosaurs walked the earth) the references were by Hanan Samet. You’re probably better off just spending some time on citeseer, arxiv and/or mendeley. You kids today have no idea how lucky you are to have free access to tons of research papers. A quick search yielded this http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.148.4652 which appears (from a 2 minute skim of 160 page paper) to be a reasonable to get an overview of some various methods.

princec · May 2, 2013, 9:51am

(Remember to be generous with that “appreciate” button, everyone!)

Cas

Riven · April 30, 2013, 2:58pm

Traversing (quad) trees is incredibly inefficient due to cache misses.

Anyway, I’m looking at it from a totally different angle: which space partitioning datastructure you use does not depend on scale, it depends on the number of entities. (not even their distribution, in this case)

If your entire universe has 10,000 entities, and you can make a reasonable assumption that they are not all clustered together, you’re best off with a simple grid of, say, 100x100 cells, so that in the average case, there is 1 entity per cell, and in worst case, maybe 100.

Just ask yourself: can my engine handle more than 10,000 entities with moderately complex AI and pathfinding at 60fps anyway? Most likely not, so that is your bottleneck, not your space partitioning and collision detection.

If you want to finish a game, pick your battles, don’t fight them all.

cartel · April 30, 2013, 3:11pm

I will have:

A very large game world
Many entities
Dynamic entities moving between regions
Large discrepancies in how densely populated the world is, there will be regions thousands of viewports wide with no entities at all, and there will be regions a few viewports large with many hundreds of entities. I can’t assume an ideal or normalized distribution.

The objection to a quadtree that people offer in every thread that it comes up in is that it’s too hard to code and a grid works just fine 95% of the time. I am starting with an engine that is designed from the ground up to work with tens or even hundreds of thousands of entities distributed unevenly across a map that is hundreds of thousands of viewports large, or infinite if I can figure out how to do that.

I really don’t think a grid is appropriate, and the argument for a grid tends to run “it’s easier to code.” I don’t mind working with a somewhat complicated data structure. I’m a Computer Science student, I did well in data structures, and spending some time optimizing a quadtree is not something I’m afraid of. A little bit of logical puzzling appeals to me.

So I’ll reiterate my question: How best to optimize a quadtree for large numbers of dynamic rectangular entities arranged in an uneven distribution across a vast space, beyond the simple textbook implementation presented around the web?

Roquen · April 30, 2013, 3:15pm

Spatial partition is a daunting topic. The most important question really isn’t the size of the world or how many entities there are in it, but how are the entities expected to be distributed in the world. The more uniform the distribution will point toward some uniform grid based, the more clumped up points to . In the case of “big” worlds, where big isn’t really very big, you really don’t want to use a single “logical” data-structure but some forest of them. And if really big, then you want to do that anyway for precision reasons. It’s always possible to use multiple techniques to match the local distribution.

Somewhere in the recent(ish) past we talked about a huge world where virtually all the entities are going to be clumped near planets, etc. There’s zero reason to have any structure to represent empty space…just use some other technique (like the equivalent of an animation hierarchy) to represent how these local spaces are related to one another.

On everything in leaf nodes: This is so you virtually never need to walk the tree (instead of always). The problems this creates is identical to what you have when using a uniform grid. This in itself is a pretty big topic.

On matching a uniform grid to a quadtree. When they have an common edge, then a uniform grid is implicitly a region quad-tree of a given height.

No need. The only entities that need to regularly run AI are near the player. The key word for things far away (that actually “need” to think occasionally) is “cheat”.

Yes: assuming the implementation ignores modern memory architectures. Working around this is a major PITA…esp in java without structs & arrays-of-structs. Devil’s advoate mode: I’d guess that most uniform grid implementations don’t bake entity coordinates and extents directly in the node to prevent the need to (randomly) access the entity itself during broad-phase…nor lay-out memory is a friendly manner, say Z-curve.

relminator · April 30, 2013, 3:25pm

[quote]So I’ll reiterate my question: How best to
optimize a quadtree for large numbers of
dynamic rectangular entities arranged in
an uneven distribution across a vast
space, beyond the simple textbook
implementation presented around the
web?[quote]

You shoul not use quadtrees or grids for this. A better solution is a kdtree.

Quadtrees are almost useless in this case since they’re unbalanced. It’s not much harder to code a kdtree (2 child nodes) if you already can code a quadtree.

cartel · April 30, 2013, 3:38pm

OK, I’m looking at K-D Trees and this looks interesting, but I can’t find any reference that talks about how to use a K-D Tree to store rectangular regions rather than points. Can that be done?

Roquen · April 30, 2013, 3:38pm

various flavors of k-d trees can be nice. But you really can’t make sweeping statements like this about any data-structure without a ton of details. Why a kd-tree instead of say a vantage-point tree? Or a non-region quad-tree?

relminator · April 30, 2013, 3:51pm

I did not make a sweeping statement. I just gave him what I thought is a better data structure. I’ve never heard of a vantage point tree.

@cartel: For simplicity, I just put a any region intersected by the node. You will have duplicates this way but it’s not much of a bottleneck. Just don’t make your regions overly big as to intersect more than 2 nodes on non-corner intersections.

cartel · April 30, 2013, 3:54pm

OK, Roquen, toward the one optimization that has been actually stated in this thread, pushing all rectangles to leaf nodes, how does this work in relation to the tree implementation linked above?

What if an entity overlaps multiple leaf nodes? If the leafs are particularly small, a single entity could overlap even more than 4 nodes. How does the algorithm move entities between different nodes? What happens when nodes become empty? Is there any way to rebalance the tree?

relminator · April 30, 2013, 3:59pm

@Cartel: That’s why I posted a kdtree. Kdtrees are “balanced” quadtrees.

Riven · April 30, 2013, 4:00pm

Please read carefully what I said: I didn’t suggest your entities were uniformly spread, I made the assumption that they were not all clustered together. That means that some grid cells can be 100x as occupied as others, but given that 1 cell with 500 entities is still plenty fast to compute quickly, there is no need to subdivide the world even further. Therefore grid cells are adequate for your use case.

As far as I can see, your proposed game falls right in that ‘95%’ category.

You can’t do that, besides there is no joy in games of infinite size. Nobody appreciates it that your map is huge, if it takes three reallife weeks to travel to the nearest neighbour. Better set yourself some realistic targets, instead of wanting it all, for no gain.

Sure, it is interesting to experiment, and figure out performance characteristics of different data structures and algorithms. It’s just that your specific use case does not need anything more fancy than a grid, and you could (should?) spend your valuable time designing the game, making it work and finally making it fun, instead of making it 1% faster by using the cleverest algorithm imaginable.

Roquen · May 3, 2013, 12:54pm

I just ran across this: http://www.dtecta.com/files/GDC13_vandenBergen_Gino_Physics_Tut.pdf