Pathfinding over a very large grid


I’m taking a look at the pathfinder we have for RF, and to my horror I realized that it had a really bad worst case for memory usage. At worst, memory usage could skyrocket to over a gigabyte! >__<

The problem is that our worlds are really big, up to 7680x5120 or so. The game world is essentially a grid, and the data is split up into chunks of 32x32 tiles that can be paged to disk when they aren’t in use. The pathfinder I’m using is a bi-directional breadth-first search that searches from both the start and the goal and finds where the two searches meet, which reduces the number of tiles that get checked. Each tested tile is checked for accessibility, which may require loading data from disk. The last N chunks are cached in RAM until a new chunk is needed. The pathfinder may potentially need to explore the entire map to find a path (or admit failure); this is rare but still possible.
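For reference, the core of such a bi-directional search can be sketched like this. This is a toy version on a plain boolean walkability grid that ignores chunking and disk paging; all names are my own, not the actual RF code.

```java
import java.util.ArrayDeque;
import java.util.Arrays;

public class BidiBfs {
    static final int[] DX = {1, -1, 0, 0};
    static final int[] DY = {0, 0, 1, -1};

    /** Returns the length of a connecting path, or -1 if none exists. */
    public static int search(boolean[][] walkable, int sx, int sy, int gx, int gy) {
        if (sx == gx && sy == gy) return 0;
        int w = walkable.length, h = walkable[0].length;
        int[][] distStart = new int[w][h], distGoal = new int[w][h];
        for (int[] row : distStart) Arrays.fill(row, -1);
        for (int[] row : distGoal) Arrays.fill(row, -1);
        ArrayDeque<int[]> qs = new ArrayDeque<>(), qg = new ArrayDeque<>();
        distStart[sx][sy] = 0; qs.add(new int[]{sx, sy});
        distGoal[gx][gy] = 0;  qg.add(new int[]{gx, gy});
        int best = Integer.MAX_VALUE;
        // Expand the smaller frontier one full BFS level at a time until the
        // two searches meet or one side runs out of reachable tiles.
        while (!qs.isEmpty() && !qg.isEmpty() && best == Integer.MAX_VALUE) {
            best = (qs.size() <= qg.size())
                 ? expandLevel(walkable, qs, distStart, distGoal)
                 : expandLevel(walkable, qg, distGoal, distStart);
        }
        return best == Integer.MAX_VALUE ? -1 : best;
    }

    // Expands one whole BFS level; returns the best meeting distance found.
    private static int expandLevel(boolean[][] grid, ArrayDeque<int[]> q,
                                   int[][] mine, int[][] theirs) {
        int meet = Integer.MAX_VALUE;
        for (int n = q.size(); n > 0; n--) {
            int[] c = q.poll();
            for (int d = 0; d < 4; d++) {
                int nx = c[0] + DX[d], ny = c[1] + DY[d];
                if (nx < 0 || ny < 0 || nx >= grid.length || ny >= grid[0].length) continue;
                if (!grid[nx][ny] || mine[nx][ny] != -1) continue;
                mine[nx][ny] = mine[c[0]][c[1]] + 1;
                if (theirs[nx][ny] != -1)  // tile reached by the other search: frontiers met
                    meet = Math.min(meet, mine[nx][ny] + theirs[nx][ny]);
                q.add(new int[]{nx, ny});
            }
        }
        return meet;
    }
}
```

(A fully rigorous shortest-path stop condition would keep expanding until the frontier depths exceed the best meeting found; that refinement is omitted here for brevity.)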

The problem I’m having is that the pathfinding requires a huge amount of data per tile. For each visited tile, I need to know whether it’s been opened by the start search and by the goal search, which I store using a job-ID system to avoid having to clear nodes. Secondly, to find neighbors and to backtrack, I need to track the XY position (stored as shorts right now) and two parent references. This all sums up to 28 bytes per tile. Hell, just the array for holding the nodes of the entire world is huge: 7680x5120x4 bytes ≈ 150MB. The nodes then add another 7680x5120x28 bytes ≈ 1050MB, for a total of 1200MB of raw pathfinding data. It’s insane! The entire point of the chunk system goes out the window. In addition, I can’t track pathfinding data for loaded chunks only: if the search is wide, we’d start unloading the chunks examined at the start of the search (around the start and the goal), which would leave us unable to backtrack. There’s no way of telling which nodes we’ll need for the final path.

I tried limiting the search area to reduce memory usage. The idea was to preallocate a square area of nodes. This square grid of nodes would then be centered on the average of the start and goal coordinates so that the search could run with “local” nodes. This obviously limited the maximum distance between the start and the goal, and also made it impossible to find paths that need to venture outside the square area of preallocated nodes. In the end, I realized that less than 1000 tiles of search distance in all directions (2048x2048 square search area) was the minimum I could drop it to, and that’s still hundreds of megabytes of data.

The thing is that the number of traversable tiles is pretty low. Most of the world is ocean, mountains or forest, so if I could eliminate those tiles I could cut memory usage by a lot. However, the 150MB needed just for the array is a lot, and it’s wasted when a majority of the elements will be null anyway. If I could find a better way to store the pathfinding objects than in an array, I could eliminate a lot of that data. Similarly, if I could avoid allocating the ~70% of nodes that aren’t traversable, that would cut memory usage from 1050MB to ~350MB. Finally, I could reduce the node class from 28 bytes to 20 bytes, another 100MB saving, for a final total of 250MB. That would be… acceptable.

So I guess my real question is: What’s a good data structure for storing a very sparse, lazily allocated grid? It needs to be able to quickly retrieve a node given its coordinates (or null if none exists). Maybe a custom hashmap that doesn’t use linked lists to resolve collisions? Any other ideas?
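To make the question concrete, the simplest version of what I’m after would look something like this hypothetical sketch: a map keyed on the packed (x, y) coordinates, so nodes only exist for tiles the search actually touches. (A plain `HashMap<Long, N>` boxes every key; a custom open-addressing long-to-object map would avoid that, which is what I meant by avoiding linked-list collision resolution.)

```java
import java.util.HashMap;

// Hypothetical sketch of a sparse node grid: nodes are allocated lazily
// and looked up via a single long key packing both coordinates.
public class SparseNodeGrid<N> {
    private final HashMap<Long, N> nodes = new HashMap<>();

    // Pack (x, y) into one long: x in the high 32 bits, y in the low 32.
    private static long key(int x, int y) {
        return ((long) x << 32) | (y & 0xFFFFFFFFL);
    }

    /** Returns the node at (x, y), or null if none has been allocated. */
    public N get(int x, int y) { return nodes.get(key(x, y)); }

    public void put(int x, int y, N node) { nodes.put(key(x, y), node); }

    public int size() { return nodes.size(); }
}
```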

So what’s wrong with a sparse grid? It’ll only require a tiny fraction of the memory most of the time.

Cas :slight_smile:

Also, why not make life easier for yourself on the job-ID idea and instead just use a packed bit array for the closed list? 4MB. It clears so fast that you won’t even notice it (at the expense of instantly and temporarily trashing all your caches).

Cas :slight_smile:
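The packed bit array Cas suggests could be sketched like this (names are mine): one bit per tile, so a 7680x5120 world needs 7680x5120/8 ≈ 4.7MB, and resetting it between jobs is a single linear fill.

```java
// Sketch of a packed-bit closed list: one bit per tile, stored in a
// long[] so each word covers 64 tiles.
public class ClosedList {
    private final int width;
    private final long[] bits;

    public ClosedList(int width, int height) {
        this.width = width;
        this.bits = new long[(width * height + 63) / 64];
    }

    public void close(int x, int y) {
        int i = y * width + x;
        bits[i >>> 6] |= 1L << (i & 63);
    }

    public boolean isClosed(int x, int y) {
        int i = y * width + x;
        return (bits[i >>> 6] & (1L << (i & 63))) != 0;
    }

    /** Reset between jobs: a fast linear fill (~5MB for the full world). */
    public void clear() { java.util.Arrays.fill(bits, 0L); }
}
```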

You need to search per object, not per tile – per octree-node object – that eliminates the empty iterations.
But then you have a new problem – you have to recreate nodes on every static tile change
– though because it’s an octree, that’s fast :)

I heard IDA* pathfinding uses a very small amount of memory

A coarse-grained graph of waypoints?

Maybe grid-based pathfinding is the wrong algorithm to use in this case.

For large worlds it is better to go for a NavMesh-based pathfinding solution (like StarCraft 2); memory usage will be massively lower.

I investigated a lot of different solutions, from hierarchical systems to waypoint systems, but they all had different problems, ranging from expensive precomputation to being unable to handle dynamic worlds (units can block a tile). In the end I decided to try optimizing the existing implementation and see how far that got me.

The first issue was the memory usage of the array that holds all the nodes. As the worlds we generate are mostly inaccessible terrain like mountains or ocean, a majority of the elements in the array are going to be null. However, a 40-million-element array of 32-bit references still uses 150MB, so the single monolithic array had to go. I split the world up into chunks of 16x16 tiles, leaving me with a 480x320 chunk array that uses only around half a megabyte of memory. This cut the constant cost by around 150MB. Chunk objects are then allocated (and reused) on demand when the pathfinder first enters a chunk during a job. Chunks have all their nodes preallocated, so there’s a bit of waste along the edge of an inaccessible area: if just one tile in a chunk is accessible and explored by the pathfinder, the entire chunk has to be allocated/reserved.
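The chunked storage described above could be sketched roughly like this, with a pool so chunk objects are recycled between jobs instead of reallocated (class and method names are my own guesses, not the actual code):

```java
import java.util.ArrayDeque;
import java.util.Arrays;

// Sketch of chunked, lazily allocated node storage: a small array of
// chunk references, with 16x16-node chunks taken from a reuse pool
// only when the search first enters them.
public class ChunkedNodes {
    static final int CHUNK = 16;
    private final Chunk[][] chunks;
    private final ArrayDeque<Chunk> pool = new ArrayDeque<>();

    public ChunkedNodes(int worldW, int worldH) {
        chunks = new Chunk[worldW / CHUNK][worldH / CHUNK];
    }

    /** Fetch the chunk covering tile (x, y), allocating or reusing on demand. */
    public Chunk chunkAt(int x, int y) {
        int cx = x / CHUNK, cy = y / CHUNK;
        Chunk c = chunks[cx][cy];
        if (c == null) {
            c = pool.isEmpty() ? new Chunk() : pool.poll();
            chunks[cx][cy] = c;
        }
        return c;
    }

    /** Return all used chunks to the pool between jobs, resetting them. */
    public void release() {
        for (Chunk[] row : chunks)
            for (int i = 0; i < row.length; i++)
                if (row[i] != null) { row[i].reset(); pool.add(row[i]); row[i] = null; }
    }

    public static class Chunk {
        final int[] nodes = new int[CHUNK * CHUNK]; // packed per-node data
        void reset() { Arrays.fill(nodes, 0); }
    }
}
```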

The second problem was the size of each node. The Node object I used to store the pathfinding data of each tile had the following fields:

		int x, y;                             // tile coordinates
		int lastOpenedStart, lastOpenedGoal;  // job-ID stamps for the two searches
		Node parentStart, parentGoal;         // backtracking links for the two searches

For some crazy reason, this ended up taking 40 bytes per node, plus 4 bytes per node in the arrays used to hold the Nodes in each chunk!!! I count 24 bytes of actual data in there. 24 bytes of actual data taking up 44 bytes! I decided to look for ways of shrinking that information.

As our max world size is 7680x5120, it’s clear that x and y don’t need the full range of a 32-bit integer. With 13 bits each for X and Y, we can accommodate worlds up to 8192x8192, which is just enough for us.

The reason the lastOpened* values are ints instead of booleans is to avoid having to reset the entire world after each pathfinding job. If they were booleans, I’d need to go through all opened nodes and reset them to false after each job. By making them ints, I can implicitly “reset” the entire world simply by doing [icode]jobID++;[/icode] once in the pathfinder. Then, to mark a node as opened, I do [icode]lastOpenedStart = jobID;[/icode], and to check whether a node is opened, I just do [icode]if(lastOpenedStart == jobID) …[/icode]. However, with the chunk system it became trivial (and necessary anyway) to reset chunks when they’re reused, so this scheme is no longer needed. Hence, the two opened fields compress down to just 2 bits in total.
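Isolated from the rest of the pathfinder, the job-ID trick looks like this (a minimal sketch with names of my own choosing):

```java
// The job-ID stamping trick: a node counts as "opened" only if its
// stamp equals the current job ID, so bumping the ID once acts as an
// O(1) reset of every node in the world.
public class OpenedStamps {
    private final int[] lastOpened; // one stamp per tile, initially 0
    private int jobID = 1;

    public OpenedStamps(int tileCount) { lastOpened = new int[tileCount]; }

    /** Implicitly clears all tiles; run once at the start of each job. */
    public void newJob() { jobID++; }

    public void markOpened(int tile) { lastOpened[tile] = jobID; }

    public boolean isOpened(int tile) { return lastOpened[tile] == jobID; }
}
```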

Lastly, the two parent references are only used for backtracking once a path has been found. In our simple tile world there are only 4 possible directions of travel, so each parent can be stored as just 2 bits encoding which of the 4 neighboring directions it lies in.

This all sums up to a grand total of… exactly 32 bits per node. In other words, my “node array” is now just an int[] into which I pack all the data: 4 bytes per node instead of the 44 used before. In addition, thanks to more efficient reuse of chunks between pathfinding jobs, the worst-case memory usage is in most cases never reached. The new system averages under 50MB, with an absolute worst case below 75MB. The improved memory locality also almost doubled the performance of the system.
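One possible layout for those 32 bits is sketched below: 13 bits of x, 13 bits of y, one opened flag per search, and 2 bits per parent direction, for 13+13+1+1+2+2 = 32. The exact field order and accessor names here are my own guesses, not the shipped code.

```java
// Sketch of packing a whole pathfinding node into one int:
// [ x:13 | y:13 | openedStart:1 | openedGoal:1 | parentStart:2 | parentGoal:2 ]
public final class PackedNode {
    public static int pack(int x, int y, boolean openedStart, boolean openedGoal,
                           int parentStartDir, int parentGoalDir) {
        return (x << 19) | (y << 6)
             | ((openedStart ? 1 : 0) << 5) | ((openedGoal ? 1 : 0) << 4)
             | (parentStartDir << 2) | parentGoalDir;
    }

    public static int x(int n)               { return n >>> 19; }
    public static int y(int n)               { return (n >>> 6) & 0x1FFF; }
    public static boolean openedStart(int n) { return ((n >>> 5) & 1) != 0; }
    public static boolean openedGoal(int n)  { return ((n >>> 4) & 1) != 0; }
    public static int parentStartDir(int n)  { return (n >>> 2) & 3; }
    public static int parentGoalDir(int n)   { return n & 3; }
}
```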

Here’s a comparison of normal breadth-first search and bi-directional breadth-first search:



There’s a huge difference in the number of tiles explored by the two (the red area). The normal search explores almost every single reachable tile in the world, while the bidirectional one generally avoids exploring half the world in worst case scenarios. This reduces peak memory usage by quite a bit.

Dynamic terrain and unit blocking is tough; maintaining a navigation mesh under those conditions is hard.
Will you also incorporate fog of war or line of sight?
If so, you can make units follow dumber paths that explore as they go, which has the benefit of cheap incremental computation. Riven recommended this to me when I investigated the pathfinding problem.

Cool pictures by the way 8)