Networking a Dungeon Crawl

So, I’m working on another game (surprise). The prototype single-player version can be found here: http://www.cokeandcode.com/asd. It’s essentially a simple dungeon crawl using SNES RPG-style graphics and randomly generated dungeons. The gameplay is intended to be like nethack/rogue, apart from having free-form movement and not being turn-based (although at this rate that might change soon).

I’m having some problems coming up with a networking model to support this, I’ve considered:

  1. Position Based

Just try and sync the players’ positions and state using game updates. Interpolate positions to smooth out missing bits of positional info and speed up movement where actors/entities/things might be lagging behind the expected position. This works pretty well for FPS-type games, where you can’t see a lot of entities all the time, everyone is moving all the time and you can’t judge the positions of things too well. However, for my 2D top-down (sort of) game it’s not working out too well, since you see a lot of the corrections going on due to the viewpoint.

  2. Deterministic Simulation

Having just had a discussion on #lwjgl, I realise I could attempt to make everything deterministic at all ends: make the player interaction command-based, introducing enough delay between player action and avatar reaction that everyone runs the same simulation on all machines. This seems like it’d work pretty well, but I’d end up having to change the style of the game from a keyboard dungeon crawl into more of a point-and-click affair. Not too keen, but this will be the last hope :slight_smile:
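
For reference, the position-syncing in option 1 usually boils down to lerping a displayed position towards the last networked position. A toy sketch (the `SmoothedEntity` class and its fields are made up for illustration, not from any real engine):

```java
// Sketch of option 1: draw a smoothed position that chases the last
// position received over the network, rather than snapping to it.
public class SmoothedEntity {
    private double shownX, shownY;   // what we actually draw
    private double targetX, targetY; // last position received from the network

    public SmoothedEntity(double x, double y) {
        shownX = targetX = x;
        shownY = targetY = y;
    }

    /** Called whenever a network update arrives. */
    public void onNetworkUpdate(double x, double y) {
        targetX = x;
        targetY = y;
    }

    /** Called every frame: close a fraction of the remaining error each tick. */
    public void interpolate(double factor) {
        shownX += (targetX - shownX) * factor;
        shownY += (targetY - shownY) * factor;
    }

    public double shownX() { return shownX; }
    public double shownY() { return shownY; }
}
```

The visible “corrections” Kev mentions are exactly these chase steps; in a top-down view with a static camera they’re much harder to hide than in an FPS.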

Any suggestions or points to articles on other possible strategies?

Kev

I’m toying with this idea myself for my current game. Making a game fully deterministic is usually something of a pain, but with Java handling most (all?) of the fp math issues and other obscure ways code becomes non-deterministic, I think it wouldn’t be too tricky.

Essential reading: http://zoo.cs.yale.edu/classes/cs538/readings/papers/terrano_1500arch.pdf which covers almost exactly what you’re describing (and I’m planning a variation on).

Problems with this (compared to client-server):

  • Not secure (this i can live with)
  • No late joining (may or may not be important)
  • One person’s bad connection messes with the entire game for everyone else (so if over the internet this limits the number of players to ~4, maybe 8 if everyone’s got a good connection).

What kind of artifacts are you seeing? Is it your speed-up that’s noticeable? If so you may have to accept the idea that you’re going to have to carry a higher error term along and not cheat the speed, or at least not by much.

The problem with counting on determinism is that you will have to either:

(a) Go to at least a semi-lock-step solution where the state is not advanced until player input has been received from all players,

OR

(b) Deal with trying to regularly “resynch” the games and ensure that any synch drift cannot have an impact on game logic (not real easy).

The issue with lockstep is that your execution loop would look something like this:

(i) Point and click
(ii) Wait until everyone else’s packet for this time step is received
(iii) Then calculate and update
(iv) Rinse and repeat
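
The loop in (i)-(iv) can be sketched like this (the network layer is stubbed out as an interface; in a real game `inputsForStep` would block on the socket until every player’s packet has arrived, which is exactly where the visible pause comes from):

```java
import java.util.List;

// Minimal sketch of a pure lockstep loop. The Net interface is a stand-in
// for the real network layer and is assumed to block until all players'
// inputs for the given step have been received.
public class LockstepLoop {
    public interface Net {
        List<String> inputsForStep(int step); // blocks until complete
    }

    /** Runs `steps` lockstep iterations; returns how many commands were applied. */
    public static int run(Net net, int steps) {
        int applied = 0;
        for (int step = 0; step < steps; step++) {
            // (i)/(ii): gather this step's input from every player
            List<String> inputs = net.inputsForStep(step);
            // (iii): advance the deterministic simulation with those inputs
            applied += inputs.size();
            // (iv): rinse and repeat
        }
        return applied;
    }
}
```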

It will make any latency issues visible as a pause between input and response.

I think you’re already on the right path, you just need to work a bit on your latency hiding…

Kev

I’ve got the same issue with SharpShooter Arena, although being an FPS, it isn’t as noticeable. Currently I’m sending position, velocity and rotation updates at 5 Hz. However, that means if you take your finger off the forward key just after Tx’ing an update, the remote client doesn’t find out you’ve stopped moving for 200 ms and overshoots. I thought of three ‘solutions’:

i) The local player continues to move until the next update is sent, even though you’ve let go of the key. This is only really noticeable when you try to stop at a precise location.
ii) The local player stops immediately, the remote player jumps to the correct location on the next update. This was noticeable even in my FPS. Players tended to lurch.
iii) The local player stops immediately, the remote player has its velocity corrected to close on the correct location over a short period of time. This helps smooth out the lurches. However, on stopping, players tend to halt, then take a few steps backwards.

Currently I’m using a combination of i) and iii). I found I still needed iii) due to variable network latency causing errors resulting in stuttering movement. Sometimes you get that stop-and-step-back effect though. I’m also synchronising the client clocks and introducing 60 ms of forced lag, which keeps everything very close to in-sync without having to go for a lock-step approach. Incidentally, I’m doing all my collision detection 60 ms in the past as well and then feeding the correction back to the separately modelled current position.
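
Option iii) amounts to folding the positional error into the remote player’s velocity so the gap closes over a chosen convergence time. A rough one-dimensional sketch (names are made up; a real version would do this per axis and clamp the correction):

```java
// Sketch of option iii): instead of snapping the remote player to the
// authoritative position, blend the error into its velocity so the gap
// closes over roughly `convergeTime` seconds.
public class VelocityCorrection {
    /**
     * Returns the velocity to use locally so the shown position reaches
     * the authoritative position in about convergeTime seconds.
     */
    public static double corrected(double shownPos, double authPos,
                                   double authVel, double convergeTime) {
        double error = authPos - shownPos;
        return authVel + error / convergeTime;
    }
}
```

The step-backwards artifact falls out of this naturally: if the shown position has overshot the authoritative one, the error term is negative and briefly drives the player backwards.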

However, I guess that i) doesn’t really hack it for 2-player as you need to change direction really fast to go down those narrow alleyways. Perhaps you could additionally send key-up/key-down events across the network as they arise. If this was implemented in conjunction with around 60 ms of forced lag, it might be acceptable. You still need to send regular position updates to cater for when the player movement is affected by collisions.

Maybe first try implementing 60ms lag in your single player game and see if you can live with it.

Alan

Simplest solution I see would be just to send an “off” packet.

When the user releases the key, send one more packet with the final position and velocity 0.

Should solve your problem, yes?

That’s what I’ve basically got working at the moment, i) and iii), and I see the same things: the remote player gets to their location then rolls back a bit. Less noticeable I think in an FPS, but for a game of my sort it just looks terrible.

I’m going to move to a lock-step, 1500 archers type system.

Kev

If your gameplay suits it (or, in my case, you’re willing to make a few gameplay compromises) lockstep can work rather well, particularly for hard-to-predict cases (typically tight interaction between multiple people). FPSes get away with a lot because the physical movement of the bullets gives you a perfect opportunity to hide/fake (such as displaying blood splats from ‘hits’ but not actually taking any health off).

What I’m aiming for is a full Gigawing/Ikaruga bullet-hell shooter, but networked. Given that this requires 100+ moving bullets and pixel-perfect collision, there’s no way a normal client-server setup would ever work, even with full prediction. Instead I’m loosening up the lockstep system (much like the AoE method) and having the clients control each ship directly (which is more akin to a distributed simulation).

As with most distributed simulation this comes with one big drawback: cheating is now trivially easy as the client is completely trusted for reporting their own collisions. However I’m willing to accept this as:

  1. Since the game will be co-op, cheating against an ‘opponent’ is silly and infantile
  2. For high scores etc. I’ll be recording input and playing it back to give a (largely) foolproof method of validation. Given that I’ll already need a deterministic simulation, this is practically for free.

Hmm. I suppose another solution might be to lag your motion slightly in time… enough to cover that end shift? Another solution often used to cover this is to have a small amount of acceleration/deceleration in motion, such that you don’t have those hard edge cases at start and stop. If you look at GetAMped they do this, I think. They also have a cute “skid” that happens occasionally. My guess is that they are doing some kind of send/ack thing for motion, and when the ack doesn’t come back fast enough, they skid to hide it.

Lockstep has its own set of problems. Chief among them are that (a) everyone’s performance is only as good as the worst-performing player, (b) varying latency can make responsiveness uneven and (c) any bad latency spikes stall the game.

You can cover (b) at the cost of a constant control lag by latency-buffering. The others are pretty much hard artifacts of the technique.

Anyway do what works :slight_smile:

Hybrids always interest me :slight_smile:

When you get it working I’d love to see a detailed explanation!

I’m starting to consider use cases for this. I’m assuming that I’m only setting commands for some step in the future.

Do these sound right?

  • I’m going to support late game entry by pausing the game at a specified frame, transferring state from one of the clients (it shouldn’t matter which) to the new client, then letting the game continue to run.

  • I was considering a standard evil-type monster. If I send the command [zombie onto player X] and make that mean the monster moves towards the player until in combat range and, once in range, attacks, then since my simulations are synchronised I can assume that the zombie’s reaction to players will be the same on every machine. Hence the simulation stays synchronous even though some processing is being done on all the clients? (Assuming of course I have a synchronised source of random numbers.)
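
That reasoning can be checked with a toy version of the idea: two clients with identically seeded generators, fed the same world state, make identical decisions. The `ZombieBrain` class below is hypothetical, just to illustrate the invariant:

```java
import java.util.Random;

// Toy illustration: if every client seeds its RNG identically and feeds
// the zombie the same world state, the zombie's decisions match on all
// machines, so no per-monster network traffic is needed.
public class ZombieBrain {
    private final Random rng;

    public ZombieBrain(long sharedSeed) {
        rng = new Random(sharedSeed); // same seed on every client
    }

    /** Move toward the player until in range, then pick an attack. */
    public strictfp String decide(double distToPlayer, double combatRange) {
        if (distToPlayer > combatRange) return "MOVE_TOWARD_PLAYER";
        // Draws from the shared deterministic stream: every client gets
        // the same sequence, so every client picks the same attack.
        return rng.nextInt(2) == 0 ? "BITE" : "CLAW";
    }
}
```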

Kev

Do it in parallel, split between the clients. Remember that upload b/w these days is 4-16 times less than d/l - so the transfer from client to client will go at a snail’s pace. Just carve up the data into e.g. “4 clients, each gets 1/4 of the map to send to the new player”. It will go much quicker.

Yes. With caveats…

For instance, you are OK not to have all random data identical - what you want is that code which needs to share the same seeded deterministic stream does share it, and code that doesn’t does not. E.g. the “random bird sounds” mentioned in 1500 Archers - no reason for that to be taken out of the main random stream. I.e. you want to do it OOP :wink:

The real problem is when you have code that executes a random() call or not depending upon the outcome of a random event, and that random() call is a critical one. A “random event” can be as simple as a GFX card driver bug causing a section of code in your game to execute slower than it should, inducing randomness.

So, for instance, multiple threads == death by a thousand painful cuts. You have to ensure there is no such thing as a random event in your engine except for the actual output from calls to the random() (or, in Java, nextInt() etc.) method. This requires care: even something as innocuous as using a GUI rendering thread to service mouse clicks in parallel with running paint logic violates that. The pain is tracking down all these sources and finding out how they managed to (eventually) influence a call to random() so it happens later, or not at all, on one client compared to the others.
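
One concrete way to apply the stream-separation point: keep a gameplay stream that must be identically seeded everywhere and a cosmetic stream that is free to diverge, so extra cosmetic draws can never shift the simulation’s sequence. A sketch (class and field names are my own invention):

```java
import java.util.Random;

// Separate RNG streams per subsystem: the gameplay stream is part of the
// shared deterministic state; the cosmetic stream (bird sounds, particle
// jitter) can be called any number of times without desyncing anything.
public class RngStreams {
    public final Random gameplay; // must be identically seeded on every client
    public final Random cosmetic; // free to diverge between clients

    public RngStreams(long sharedSeed) {
        gameplay = new Random(sharedSeed);
        cosmetic = new Random(); // time-seeded; must never touch game logic
    }
}
```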

So what exactly is the caveat there? Assuming there is a common source of random data, then there isn’t a problem?

There is also no way I’d ever have driven the simulation loop based on an external influence.

Maybe I’m missing something in what you said, but it seems like you’re stating the blatantly obvious?

Kev

Why not synch randomizer seeds around the clients?
I agree with Adam; I think it’s quite hard to achieve that with multi-threading.

That should be possible, but sounds quite tricky. I suppose it depends how much you’ve got going on in your game world. One possible snag - random numbers (again). Obviously you can’t just share the original seeds with the new client. Easiest, I suspect, would be to re-seed everyone’s random number generators with some newly agreed values.

(Unless the Random class is serialisable or similar?)

Yes, that was the original assumption - there would be some common source of random numbers.

ooooh… good point. Thanks.

Kev

Hmm. Maybe I’m missing a few things here…

If you aren’t tightly synchronizing the positions of all players, then won’t the zombie AI potentially be faced with different situations on different machines:

e.g. on machine one it’s a direct line to the player; on machine two we need to go around an obstacle…

I think I can read that :slight_smile:

Nah, the thing you’re missing is that I am intending to do synchronised simulations. I must have forgotten that ever-so-important detail. I’ve never done it before and it seemed like it might be quite interesting to work with.

Kev

Okay, I’m a bit confused. I thought you said you were relaxing the restrictions on local movement to get responsiveness for the player…

If this is true lock-step then I agree it’s a non-issue.

I’m getting a little worried about the term lock-step, actually. I can’t find a good description anywhere through Google, so I’m just going to have to assume it’s what I’m doing :slight_smile:

  1. Client sends command to move Bob to position X
  2. Server receives the command and timestamps it to be scheduled some configurable time in the future
  3. Server forwards the command on to everyone connected
  4. Clients (including the original one) receive the command and add it to their scheduled list
  5. As a client’s game progresses (in a deterministic manner (1)) they process commands received from the server, which modifies the outcome of the game.
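
The client side of steps 4-5 above is essentially a schedule keyed by future frame. A minimal sketch (all names are illustrative; commands are plain strings here for simplicity):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Client-side command schedule: commands arrive stamped for a future
// frame (step 4) and are pulled out when the deterministic simulation
// reaches that frame (step 5).
public class CommandSchedule {
    private final SortedMap<Integer, List<String>> pending = new TreeMap<>();

    /** Step 4: a command arrives from the server, stamped for `frame`. */
    public void schedule(int frame, String command) {
        pending.computeIfAbsent(frame, f -> new ArrayList<>()).add(command);
    }

    /** Step 5: as the simulation reaches `frame`, consume its commands. */
    public List<String> commandsFor(int frame) {
        List<String> cmds = pending.remove(frame);
        return cmds != null ? cmds : Collections.emptyList();
    }
}
```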

For my personal game:

  • This means the player’s actions will be delayed slightly, but I’m turning it into a point-and-click dungeon crawl so it probably doesn’t matter too much (à la Diablo?)
  • If a client receives a command which is in their game’s past then, IMO, they’ve got ahead of time and are now invalid. Either resynch with the other clients or kill the client off.
  • If a client receives a command which is in their game’s future, but not very far, maybe we’ll slow the running of the game down a bit.

So the clients might be seeing different times in the simulation (by a bit) but they won’t actually see different simulations at all.

Is this lock step? :slight_smile:

Will this work? Do I need to send constant update messages to keep the games from running off into the future?

Kev

(1) In a deterministic manner - common seeded random number generator, strictfp, common rules governing everything.

Looks about right to me, although you’ll need to do frame-based rather than animation-based simulation (to get it deterministic), so instead of timestamps you’ll be counting frames.

In a ‘pure’ lockstep you basically stall until you’ve got the inputs from all players for each frame. Obviously that’s not practical over the internet, so you just add a buffer (say, 10 frames long). Raw controller data gets sent straight to other players and buffered. If your buffer is big enough, you’ve always got everyone’s data ready for each frame. If not, you stall until you get the next lot of input. Because you’re waiting for the other players’ input, you don’t have to explicitly slow down or speed up any of the simulations; it keeps itself in check.
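
The buffer described here is just a per-frame table of inputs, with the “stall” being the case where some player’s entry is missing. A sketch under those assumptions (the `InputBuffer` name and string inputs are made up):

```java
import java.util.HashMap;
import java.util.Map;

// Buffered lockstep: raw inputs for future frames are stored as they
// arrive; the simulation may advance a frame only when every player's
// input for that frame is present, otherwise it stalls.
public class InputBuffer {
    private final int playerCount;
    private final Map<Integer, Map<Integer, String>> frames = new HashMap<>();

    public InputBuffer(int playerCount) {
        this.playerCount = playerCount;
    }

    /** Store `player`'s input for `frame` (duplicates just overwrite). */
    public void receive(int frame, int player, String input) {
        frames.computeIfAbsent(frame, f -> new HashMap<>()).put(player, input);
    }

    /** True when inputs for `frame` have arrived from all players. */
    public boolean canAdvance(int frame) {
        Map<Integer, String> f = frames.get(frame);
        return f != null && f.size() == playerCount;
    }
}
```

Note how this matches the duplicate-data point below: a repeated packet just overwrites the same buffer slot, so past data is harmless.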

Data from the past shouldn’t happen - the only possibility would be that it’s a duplicate of data you’ve already got. If it’s data you haven’t got then you’ll be waiting for it (and it’ll be ‘present’ data). Data from the future goes in the appropriate place in the buffer for use in a few frames or so.

AoE’s method basically says instead of syncing per-frame you sync per ‘turn’ which is game-defined. They’re only buffering one turn ahead, because ideally a turn is long enough not to cause a stall. Game logic within the turns stays nice and deterministic. Ideally you adjust the turn length based on the connection quality so you’re never stalling but at the same time keeping the response nice and snappy.