Suggestion: pseudo-reliable (reliable-UDP) channel

chumDS · May 4, 2007, 5:12am

A lot of MMO games use UDP for their main protocol, as it’s faster and doesn’t have many of the hangups of TCP/IP.

The problem is: for “important info” (like chat), it’s unreliable.

Several people have implemented various “reliable UDP” protocols, basically confirm-or-resend type things that help make sure that an otherwise-unreliable packet arrives at the destination EVENTUALLY.

Rather than make every MMO-author using SGS re-create this, it’d be cool if the SGS API had it as an option. Basically: out of order, but otherwise reliable data transmission.
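A minimal sketch of the confirm-or-resend idea described above, in Java. It shows only the sender-side bookkeeping: each message gets a sequence number, stays pending until the peer confirms it, and is flagged for resend once a timeout passes. The class name `ResendTracker`, its methods, and the fixed timeout are all hypothetical; none of this is part of the SGS API, and a real implementation would also need the receiver side, duplicate detection, and a retry limit.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sender-side bookkeeping for "confirm or resend" over UDP.
// It does no network I/O; the caller sends and resends the datagrams itself.
class ResendTracker {
    private static final class Pending {
        final byte[] payload;
        long lastSentMillis;

        Pending(byte[] payload, long nowMillis) {
            this.payload = payload;
            this.lastSentMillis = nowMillis;
        }
    }

    private final Map<Integer, Pending> unacked = new HashMap<>();
    private final long resendAfterMillis;
    private int nextSeq = 0;

    ResendTracker(long resendAfterMillis) {
        this.resendAfterMillis = resendAfterMillis;
    }

    // Record a message as sent; returns the sequence number for its header.
    int send(byte[] payload, long nowMillis) {
        int seq = nextSeq++;
        unacked.put(seq, new Pending(payload, nowMillis));
        return seq;
    }

    // The peer confirmed this sequence number, so stop tracking it.
    void ack(int seq) {
        unacked.remove(seq);
    }

    // Sequence numbers whose confirmation is overdue. Each one returned is
    // treated as resent at nowMillis, so its timer starts over.
    List<Integer> dueForResend(long nowMillis) {
        List<Integer> due = new ArrayList<>();
        for (Map.Entry<Integer, Pending> e : unacked.entrySet()) {
            if (nowMillis - e.getValue().lastSentMillis >= resendAfterMillis) {
                e.getValue().lastSentMillis = nowMillis;
                due.add(e.getKey());
            }
        }
        return due;
    }

    // Payload to resend, or null if the message was already confirmed.
    byte[] payloadOf(int seq) {
        Pending p = unacked.get(seq);
        return p == null ? null : p.payload;
    }

    int pendingCount() {
        return unacked.size();
    }
}
```

Nothing here enforces ordering: messages are confirmed and resent independently, which is the "out of order, but otherwise reliable" behaviour being asked for.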

Since time & resources are tight, it’d “work” if there was at least an “SGS-approved” API, and it used some sort of open/shared-source implementation that could be updated “later.”

You know: for some value of “later”

Thanks!

Jeff · May 4, 2007, 4:13pm

NO. Many games use UDP because “common wisdom” is that it’s “faster.”

In fact that’s a gross over-statement and sometimes just plain wrong, depending on the circumstances. Beware common wisdom, because it usually isn’t.

What do you mean by reliable? Do you mean guaranteed delivery (within the limits of the computer and the net)? Ordered delivery? Or both?

The API provides the ability to request unordered/reliable, which I think is the most sensible way to understand what you are asking for. It is up to the protocol handler, however, to decide how to implement that request.

If you don’t like how the default one does it, you can write your own.

chumDS · May 5, 2007, 7:52pm

Uh… where are YOU doing your field tests?! My experience is that, indeed, UDP is faster. I can’t remember ever seeing a performance test where TCP was faster than UDP, and only under very constrained and idealized circumstances have I seen it be “close enough” to call “as fast.”

I don’t want to argue “foo is better than bar”-wars, but I’d love to see data that indicates otherwise.

Hmmm, maybe a better way to approach this would’ve been as a set of questions:

Under what circumstances does the SGS team recommend using “reliable”, “unreliable” or “unordered reliable” channels?

(Please don’t say “if you need reliable communications, unreliable communications or reliable-but-unordered communications”, or we’ll have to send you to work at MS-tech-support ;))

What I mean is: in the context of a multi-player game (specifically an MMO-type), what are their intended purposes?

Do y’all have any field-test performance data (i.e., “real 'net”, as opposed to LAN) on the various channel types? (…That you can share with us, I mean :))
I read “reliable channel” as “TCP” and “unreliable channel” as UDP. (a) Is that right? (b) What does “unordered reliable” represent? Is that what some people refer to as “a reliable protocol over UDP”?

Thanks!

dormando · May 5, 2007, 9:52pm

I don’t have links to any hard data offhand, but there’s plenty of anecdotal evidence from experience and what other people have done:

TCP handles backoff and clogs inherently. The idea behind the old “UDP packets for unimportant data” approach was that even if the client’s bandwidth could not accept the packets fast enough and would drop them, a few would still get through, though not in sequence. Then the client fills in the holes and displays the data. This ends up being a huge waste of bandwidth. You really just want to let TCP “figure out” how fast packets can be reliably sent to the client, figure out the proper window scaling, packet size, and all that. Then on the server side you can easily (now, anyway) figure out how well the client’s doing on sucking down your data.

As a higher-level example, you internally have “reliable” and “unreliable” messages. Reliable would be chat, unreliable would be a movement update in whatever form. If writing to the client socket with your message produces an EWOULDBLOCK (or similar), you may make a decision to drop the ‘unreliable’ message in favor of getting the ‘reliable’ messages to the client. At that point you can also make a decision to internally change the rate at which you send unreliable packets to the client. With slower client links this can have a significant impact on bandwidth savings, and you’re less likely to flood your client offline with bad UDP packets.
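The drop-on-would-block decision above could be sketched roughly like this in Java. The `Pipe` interface is a hypothetical stand-in for a non-blocking socket write that reports "would block" by returning false (with a non-blocking `java.nio` `SocketChannel`, the equivalent signal is `write` accepting zero bytes). `ClientSender` and its method names are likewise made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for a non-blocking socket write:
// returns false when the write would block (send buffer full).
interface Pipe {
    boolean tryWrite(String message);
}

class ClientSender {
    private final Pipe pipe;
    private final Deque<String> reliableBacklog = new ArrayDeque<>();
    private int droppedUnreliable = 0;

    ClientSender(Pipe pipe) {
        this.pipe = pipe;
    }

    // Reliable messages (e.g. chat) are queued and retried until written.
    void sendReliable(String message) {
        reliableBacklog.addLast(message);
        flush();
    }

    // Unreliable messages (e.g. movement updates) are dropped, never queued.
    void sendUnreliable(String message) {
        flush(); // the reliable backlog goes first, keeping the stream in order
        if (!reliableBacklog.isEmpty() || !pipe.tryWrite(message)) {
            droppedUnreliable++;
        }
    }

    // Write as much of the reliable backlog as the pipe accepts right now.
    void flush() {
        while (!reliableBacklog.isEmpty()
                && pipe.tryWrite(reliableBacklog.peekFirst())) {
            reliableBacklog.removeFirst();
        }
    }

    // A rising count is a hint to lower the unreliable send rate.
    int droppedUnreliable() {
        return droppedUnreliable;
    }

    int backlogSize() {
        return reliableBacklog.size();
    }
}
```

The dropped-message counter is one possible input for the rate adjustment mentioned above; how to act on it is left open here.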

TCP will be in sequence. Internally drop your unreliable messages that wouldn’t fit down the pipe and make sure what does go through is compacted and in order. Will make your life easier as a client author.
TCP window scaling! Again, I like to reiterate this. Once your connection’s been running for a handful of packets, further communication can get a lot smoother. With UDP it’ll be bad forever.
TCP is much easier to deal with for crappy connections and firewalls. Pick one port to one host. Not a huge range of UDP ports, and you don’t have to pray it’ll get through the bizarre NATs that random ISPs put all their customers behind.

An issue I see is folks going too far in either direction. Either they do TCP with no sense of congestion control (older Ragnarok Online clients; maybe all of them) or they spew UDP and need multiple class C’s to deal with the packet traffic (FFXI). I didn’t look into WoW too closely, but it looks like they do it correctly. One TCP connection, unreliable internal packets.

Hoyle · May 5, 2007, 11:03pm

What Dormando says is good to point out.

Basically TCP is well supported and very good at doing what it does. A lot of people see UDP as an easy way to decrease latency, and from what I’ve seen they often start adding “features” as they try to make it behave closer and closer to how TCP works. I’ve seen a few projects that tried to use “reliable” UDP and over time ended up switching back to TCP.

Debating the merits of TCP vs UDP, though, throws up a warning flag to me. It seems a case of premature optimization. From what I understand, the underlying network protocol can be swapped within Darkstar. So assuming we can swap this out at a future date, deciding now to switch to UDP from TCP isn’t really necessary. I think premature optimization is a really common problem among developers, and energy is much better spent on good design.

Wait till it works before you optimize.

Mr_Light · May 6, 2007, 1:06am

I believe what Jeff said was that the implementation is abstracted away. Program against reliable / ordered.

The plugged-in protocol handler handles the implementation; it’s decoupled. See it as List someList = new ArrayList(); simply always use ArrayList, and if you’re doing your quality assurance tests and they fail, and the profiler/benchmarking shows that it is non-performant and that LinkedList solves it, use that. I wouldn’t, … shouldn’t worry about that till far later. It’s decoupled, you can replace it later as necessary, and it doesn’t involve guessing.
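To spell out the List analogy, here is a small Java sketch. The `Inventory` class is made up for illustration; the point is only that it depends on the `List` interface, so the backing implementation can be swapped at the one place it is constructed.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical class that is written against the List interface only.
class Inventory {
    private final List<String> items;

    Inventory(List<String> backing) {
        this.items = backing;
    }

    void add(String item) {
        items.add(item);
    }

    String first() {
        return items.get(0);
    }

    int size() {
        return items.size();
    }
}

class InventoryDemo {
    public static void main(String[] args) {
        // Default choice.
        Inventory a = new Inventory(new ArrayList<>());
        // Swapped in later, only if profiling says it helps.
        Inventory b = new Inventory(new LinkedList<>());
        a.add("sword");
        b.add("sword");
        // Both behave the same from the caller's point of view.
        System.out.println(a.first().equals(b.first()));
    }
}
```

The same reasoning applies to channels: code that asks for a delivery guarantee does not need to change when the handler behind it does.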

Programmer’s paradise.

Jeff · May 6, 2007, 3:33am

TCP and UDP use the same underlying internet transport mechanism. Under good communication circumstances where you are not getting packet loss and with enough bandwidth to cover what you are transmitting plus your packet overhead, timing by definition MUST be identical. If you don’t believe me, go buy a textbook.

At TEN (Total Entertainment Network) we did solely TCP/IP because it’s actually lower bandwidth over PPP, which was our last critical mile for most of our customers. (Most customers were dial-up back then.)

We had the best QUAKE2 play on the internet and ran DukeNukem3D and Nascar Online Racing flawlessly for anyone with a decent connection.

Many people do not understand TCP and misuse it. In these “timing tests” you speak of, did you disable Nagle’s algorithm? The problem with a benchmark generally is that, unless you are an expert on the thing you think you’re measuring, it’s likely the results don’t mean what you think they do.
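For reference, in Java Nagle’s algorithm is turned off per socket with `Socket.setTcpNoDelay(true)`, which sets the TCP_NODELAY option. A minimal sketch follows; the helper class and method name are hypothetical, and the socket is left unconnected on purpose so the snippet needs no server.

```java
import java.net.Socket;
import java.net.SocketException;

class NoDelayExample {
    // Returns an unconnected socket with Nagle's algorithm disabled.
    static Socket newLowLatencySocket() throws SocketException {
        Socket socket = new Socket();
        // TCP_NODELAY: small writes are sent right away instead of being
        // held back to coalesce with later data.
        socket.setTcpNoDelay(true);
        return socket;
    }
}
```

A latency benchmark that leaves this option at its default may end up measuring the coalescing delay on small writes rather than TCP itself.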

UDP becomes worse than TCP as soon as you need TCP’s guarantees. You end up having to reinvent TCP, but in a way that the net does not understand and therefore can’t help you optimize.

Heh, well I wasn’t going to say that, but I was going to say something equivalent… always ask for the lowest level of guarantees that you need for your app. In general the lower the guarantees you ask for, the more room the underlying protocol has to optimize things.

Totally depends on how you design your MMO.

Guild Wars is 100% TCP/IP and runs fine because it was designed for that. (If you don’t believe me, fire it up and do a netstat.) Often, MMO designers go for a mix where they separate data that is critical or needs to be ordered from data that doesn’t and use separate communication paths for each. The gain for the added complexity is not clear, but it makes some developers more comfortable.

No, we really haven’t gotten to the point of stress testing the comm layer in a scientific way. Among other things, until we finish the production (multi-node) servers there are parts of the equation that should be included that we can’t measure.

This is up to the protocol handler, which ultimately means it’s up to you.

Our default protocol handler does “unordered/unreliable” over UDP and anything with guarantees over TCP. However, it would be incorrect for you to start building a client-level reliability scheme on top of our unordered/unreliable. Rather, if you want a different mapping, such as an actual UDP-based “unordered/reliable”, you should do it through a custom protocol handler.
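The default mapping described above can be written down as a few lines of Java. All names here (`Guarantee`, `Transport`, `DefaultMapping`) are hypothetical and are not the SGS API; the sketch only restates the rule that the weakest request goes over UDP and anything with a guarantee goes over TCP.

```java
// Hypothetical names; this is not the SGS API, only the mapping as described.
enum Guarantee { UNORDERED_UNRELIABLE, UNORDERED_RELIABLE, ORDERED_RELIABLE }

enum Transport { UDP, TCP }

class DefaultMapping {
    // Only a request with no guarantees at all is sent over UDP.
    static Transport transportFor(Guarantee requested) {
        return requested == Guarantee.UNORDERED_UNRELIABLE
                ? Transport.UDP
                : Transport.TCP;
    }
}
```

Under this reading, a custom protocol handler that wanted a UDP-based “unordered/reliable” would change only the middle case, and code that requests guarantees would stay the same.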

My pleasure.