UDP vs TCP/IP

I’m genuinely interested to learn some of these TCP optimisations if they are as great as you say, but what we tried didn’t work for us. Can you give us some technical explanations rather than ‘use google’?

Probably the most important is turning off Nagle’s algorithm, which is designed to gain bandwidth at the expense of latency

socket.setTcpNoDelay(true); // TCP_NODELAY

There are other things to work on, but the time difference you were seeing doesn’t make sense considering that you should get almost no packet loss since you’re on that switch. Of course your packets will still be a bit bigger with TCP (more header info), but with no packet loss you should get nearly the same latency from TCP and UDP. If you are not within a millisecond or two, something is very wrong.
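For the record, here’s a minimal runnable sketch of flipping that flag in Java. The loopback listener exists only so the connect succeeds; in a real game you’d set the flag on your actual client/server sockets:

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

public class NoDelayExample {
    // Returns the TCP_NODELAY flag after setting it on a loopback connection.
    static boolean connectWithNoDelay() throws IOException {
        // A loopback listener just so the connect succeeds in this sketch.
        try (ServerSocket server = new ServerSocket(0);
             Socket socket = new Socket("127.0.0.1", server.getLocalPort())) {
            // Disable Nagle's algorithm: small writes go out immediately
            // instead of waiting to be coalesced with later writes.
            socket.setTcpNoDelay(true);
            return socket.getTcpNoDelay();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("TCP_NODELAY enabled: " + connectWithNoDelay());
    }
}
```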

[quote]Probably the most important is turning off Nagle’s algorithm, which is designed to gain bandwidth at the expense of latency

socket.setTcpNoDelay(true); // TCP_NODELAY
[/quote]
yup, that’s what we thought too, but we still saw huge latency :frowning:

[quote]Of course your packets will still be a bit bigger with TCP (more header info)
[/quote]
no biggy, it’s not that much more info, and we have bandwidth to spare atm :slight_smile:

[quote]but with no packet loss you should get nearly the same latency from TCP and UDP. If you are not within a millisecond or two something is very wrong.
[/quote]
again, this is what we thought. We tried changing buffer sizes too; they all helped some, but nothing got us to the point of ‘hey, we’ve cracked it’. The send/receive buffer sizes and TcpNoDelay seem to be the only settings we have to play with under Java. We also tried a completely remote network to check if that helped, and it didn’t seem to either. Kevin went away and tried using NIO channels too; that again helped some, but not enough. So we looked at UDP, and it seemed to beat the latency. We drew some conclusions from this that may not be 100% accurate, but seem to explain our situation.

:-[

In the interest of thoroughness I was going to dig out my old test code and do some more playing, but I couldn’t find it, so I rewrote it, and damn if I didn’t get good ping times (average 2 millis, worst case about 4 millis, with TcpNoDelay set). I’m sure we tried that before, but ho hum. Our UDP code uses message objects, so Kevin knocked together a TCP version of the endpoints and we are still seeing sensible latency. I guess we did something wrong in our tests, but whatever it was, we both did it independently, and without the code I can’t check. So, a large slice of humble pie, and a bald spot from all the head scratching.
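For anyone who wants to reproduce this kind of measurement, here’s a rough loopback version of such a ping test. It assumes a made-up long-echo protocol (the class and method names are mine, not from the thread); on a real switch you’d run the two halves on separate machines:

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

public class TcpPingTest {
    // Echoes every long straight back until the client disconnects.
    static Thread startEchoServer(ServerSocket server) {
        Thread t = new Thread(() -> {
            try (Socket client = server.accept()) {
                client.setTcpNoDelay(true);
                DataInputStream in = new DataInputStream(client.getInputStream());
                DataOutputStream out = new DataOutputStream(client.getOutputStream());
                while (true) {
                    out.writeLong(in.readLong()); // echo each timestamp back
                    out.flush();
                }
            } catch (IOException ignored) {
                // EOF when the client closes its socket -- normal shutdown.
            }
        });
        t.start();
        return t;
    }

    // Returns the average round-trip time in milliseconds over `samples` pings.
    static double averagePingMillis(int samples) throws Exception {
        try (ServerSocket server = new ServerSocket(0)) {
            Thread echo = startEchoServer(server);
            long total = 0;
            try (Socket socket = new Socket("127.0.0.1", server.getLocalPort())) {
                socket.setTcpNoDelay(true); // without this, Nagle can add big delays
                DataOutputStream out = new DataOutputStream(socket.getOutputStream());
                DataInputStream in = new DataInputStream(socket.getInputStream());
                for (int i = 0; i < samples; i++) {
                    long start = System.nanoTime();
                    out.writeLong(start);
                    out.flush();
                    in.readLong(); // block until the echo comes back
                    total += System.nanoTime() - start;
                }
            }
            echo.join();
            return (total / (double) samples) / 1_000_000.0;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.printf("average RTT: %.3f ms%n", averagePingMillis(100));
    }
}
```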

Most confused now

Endolf

Of course the real problem with TCP comes in from packet loss (and the subsequent retransmits and sequencing), which is when you start having to get into really interesting experiments with what works best for your particular game. :slight_smile:

[quote]I myself like to use millions of flies argument - but I use it mostly to note that it is not important what common people do. But are you suggesting that Carmack/GarageGames/Unreal team etc are just misguided flies ?
[/quote]
There is no doubt that initially most of them had no real idea what they were doing. Quake 1 and 2 both had rubbish network code (and q2 even came late enough, IIRC, to benefit from seeing the PlanetQuake improvements - however, my knowledge of Q2 is very sparse; I didn’t play it much and haven’t studied it (OTOH I’ve spent thousands of hours on Q1 and 3)).

But that was THEN…see below for a note on Q3.

OK, but that’s NOTHING! Quake 3 can cope with 40%-50% packet loss, no probs (well enough, at least, to be able to still play the game effectively, modulo the difficulties/differences that may ALSO be caused by the high ping time that usually goes hand-in-hand with a high packet loss).

[/quote]
Indeed; it is a sufficiently difficult problem with Tron to merit some serious academic study - comparing players’ effectiveness against the same bots on different simulated artificially-delayed systems. E.g. run TRON on a 10 Mb LAN, and run ten sets of tests, from “no additional latency” up to “+100ms latency” or something. This could be compared to the results of similar tests that IIRC have already been performed for generic “twitch” games a la Quake/Doom.

Certainly, tron on TCP is pretty much pointless without the guarantee of LAN-only games. One dropped packet can screw the synchronization (between players’ differing views of the game-state) for the next 100-3000 milliseconds.

But c.f. my next post below…:slight_smile:

Quick side note for endolf - who by the sound of things was trying sensible things to solve a weird problem:

This may be a red herring, but in future also investigate your hardware. I’ve generally stuck to one or two top-tier branded NICs since first discovering, many years ago, how much seriously terrifyingly screwed hardware was around. Probably not worth naming them because I’ve not re-analysed who’s best recently, and now know the quirks so well I don’t need to change. However, I can assure you that even the most expensive NICs from the most famous manufacturers have some SERIOUS problems - e.g. I’ve seen 3COM 100Mbit cards that can “accidentally” get stuck at approx 1Mbit (yes, really!); 3COM also in particular had a problem for a long while of people selling things that looked like a 3COM, smelled like a 3COM (!), but weren’t really - they were carefully branded clones.

It’s crazy the first time it happens to you, but sometimes you find that a factor-of-a-hundred (or more) performance problem on your LAN can be attributed to one piece of dodgy hardware - from a reputable manufacturer.

As my initial inordinately long post has now kicked up quite a few interesting comments and ideas, I’m weighing back in with a generalised response :).

FYI, I come from a strong academic background (Cambridge University, where being/becoming familiar with all the prior art is heavily encouraged) and now work on the Grexengine (an MMOG technology - grexengine.com), although only tangentially on the networking side (I choose protocols, not implement them). But I’ve never been a pro network engineer, and most of my knowledge is a combination of talking to real net engineers, detailed academic study of the systems and protocols, and my own experiences over 9 years. So you can take what I say with a pinch of salt…

What I’d like people to take home as the core points:

  1. TCP vs UDP: It’s not a simple argument; be careful about making any decisions that will cost you significant effort to implement.

  2. UDP + (some parts of TCP): Is VERY difficult to get right; it’s one of the areas of programming that is still “hard”, as opposed to just being time-consuming to get right.

  3. Of the three options covered in those two points, each is suitable for a significant percentage of games; I say “suitable” not “perfect” - a perfect option is the one that is technologically best for the game, a suitable one is the one that is affordable, not too risky, and provides acceptably good performance.

  4. A depressingly large number of people who offer advice on these topics are naive or ignorant at best (i.e. they have gaps in their knowledge), or just plain wrong at worst. Be careful what advice you take! (and as endolf discovered, it’s often not as simple as just experimenting with the alternatives people suggested and benchmarking them - there are nasty non-obvious subtleties in accidentally mis-implementing many of the protocols).

To illustrate each of these a bit further (some I covered initially):

1:
Well, I guess the length of this discussion already makes that clear enough. Nod to all those who’ve contributed examples and counter-examples and shown how non-trivial a question this really is :). I’d quote more, but this board makes it difficult to quote lots of posts in one, sigh.

2:

Coilcore gave a good explanation of how hard it is to replicate even just some of the excellent work of Jacobson’s (sp?) - and many others. Not by any means impossible, and most of it is documented as research papers - but it wasn’t done by “ordinary” developers: much was done by hard core specialists.

3:

Most of coilcore’s expansions on what I said originally are extra details that I might myself have come up with - we concur strongly.

However, there seems to be one point coilcore doesn’t appreciate: there are games where neither UDP nor TCP - nor both together - “work”. Some absolutely require simultaneous low latency with guaranteed delivery - and e.g. even using the TCP as a control channel to police the UDP (which is a relatively easy “first draft” way of implementing this) is itself not “fast” enough (in the cases I’ve seen, it’s too high latency, because a dropped TCP packet can delay the realisation that a UDP packet got dropped, delaying the resend).

(but note: this is NOT the case for the majority of games - for most games, I agree with coilcore, and my initial arguments for not doing so come to the fore)

For some of those I’ve looked at, separating one stream into TCP-bits and UDP-bits (i.e. sending the data that needed reliability down TCP, the rest on UDP, etc) was highly undesirable - but in the overall scheme of things saved enough implementation time and hassle that it made sense financially. Others just HAD to have the best of both worlds. As someone else mentioned, TCP was designed as a generic excellent-average-case protocol; some games sadly pay the price for that (if they use TCP at all).

So, people shouldn’t dismiss “roll your own” (RYO) out of hand - but OTOH, there’s SO MANY people who assume they should RYO in the first place that perhaps I should say as you do, just to combat the huge weight of current opinion, and even the scales! :slight_smile:

As I said before, think several times (lots) before RYO’ing - but don’t completely dismiss it if it looks like it might be necessary. You can probably solve what you thought needed RYO by e.g. using a clever higher-level algo like in Quake3 - but maybe you can’t.

4:

Indeed, a very good point: it’s very important in the games industry to be wary of the fact that most of us are brilliant coders, but only highly experienced in our own specialist areas - and can be quite naive in others.

I’ve read Carmack’s long explanation of Q3’s networking technology, which Brian Hook requested off him and posted to the MUD-DEV mailing list. A search for “Hook, Carmack, Quake3, MUD-DEV” on google should find the post in the MUD-DEV archives. Nicely explained, and a reasonably good approach - although some people consider it a bit more ground-breaking than perhaps is fair.

In summary, by the time id got to Q3, they’d come up with a good algorithm for network-play. As you explained, possibly the biggest problem has not been the technology, but the higher-level algorithms/protocols/etc and the decisions on HOW to use the available tech; Q3’s networking is a good solution for a Q-like game.

[quote]I’ve read Carmack’s long explanation of Q3’s networking technology, which Brian Hook requested off him and posted to the MUD-DEV mailing list. A search for “Hook, Carmack, Quake3, MUD-DEV” on google should find the post in the MUD-DEV archives. Nicely explained, and a reasonably good approach - although some people consider it a bit more ground-breaking than perhaps is fair.
[/quote]
Looked it up and read it. I think the solution is a typical Carmack solution and reminds me of the early published Doom source code. This is very typical of game coding, in contrast to application coding.
The solution is absolutely focused on the current topic: no abstraction, no layering, as simple as possible. Maybe that is one of Carmack’s biggest talents. KISS - keep it simple, stupid.

When I start doing this kind of construction as a not-so-talented game coder, I think of optional clients that only need parts of the information, think of scalability, how to hide networking at all, setting up distributed databases of several kinds, care for minimal bandwidth usage, … I just cannot manage to focus on transmitting the gamestate of a specific FPS. Cannot even develop an imagination what a ‘gamestate’ could be…

Just totally different from a Carmack-approach. I’m afraid my system to just identify things is more complex than the whole Q3 network logic…

But in the end, Q3 networking is highly brute force, isn’t it?

One thing I don’t remember being mentioned in this thread is that there is a desired middle state between UDP and TCP that, as far as I know, doesn’t exist.

UDP is a bunch of atomic datagrams.
TCP is a reliable stream of bytes.

I think many developers want a reliable and atomic protocol but don’t need the stream concept. Something more than UDP but less than TCP. A protocol where packet #5 doesn’t prevent packet #6 from arriving, but #5 will be retransmitted asap.

Personally I feel that this desire stems from the developer’s desire for a quick/easy solution instead of doing the Right Thing™ and writing their game protocol such that current updates are not dependent on past state.

… but that’s just my unprofessional opinion.

I feel that reliable and stream are tightly connected terms.

For TCP, if #5 cannot be transmitted for a time, the stream goes to an error state.

But what happens if #5 fails to transmit over and over while #6, #7, #8 are transmitted w/o problem? #1000, #1001, … but #5 is still missing. Is this an error? Close the connection? Very difficult!

A replaceable transmission would be great, where in case of failure and retransmission the data can be replaced by more current data.
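That “replaceable” idea can be sketched with a per-key send slot: each key (say, one entity’s position) holds only the latest unacknowledged value, so a lost packet is superseded rather than retransmitted verbatim. All names here are hypothetical, and the actual datagram I/O and ack timing are left out:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of "replaceable" reliability: each key (e.g. one entity's position)
// has a slot holding only the LATEST unacknowledged value. If a send is
// lost, we retransmit whatever is newest for that key, never the stale data.
public class ReplaceableChannel {
    private final Map<String, String> unacked = new HashMap<>();

    // Queue (or overwrite) the outgoing value for a key.
    public void send(String key, String value) {
        unacked.put(key, value); // a newer value silently replaces an older one
    }

    // Called when the peer acknowledges a key; nothing left to retransmit.
    public void onAck(String key) {
        unacked.remove(key);
    }

    // What would go on the wire at the next retransmit opportunity.
    public Map<String, String> pendingRetransmits() {
        return new HashMap<>(unacked);
    }
}
```

So if the packet carrying an entity’s position is lost and the position has changed since, the retransmit carries the new position, and the stale one is never sent at all.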

[quote]One thing I don’t remember being mentioned in this thread is there is a desired middle state between UDP and TCP that as far as I know doesn’t exists.
[/quote]
Um, OK, you may need to read between the lines, but several people have been saying that. My first and last posts were fairly explicit in mentioning it. In the first one, I said:

" 4. The vast majority of games developers only have one problem with TCP (but often mistakenly believe they need more!). They need to remove the “in-order arrival”. "

…which is TCP with something taken out (that doesn’t exist in UDP), making it slightly more like UDP - i.e. a middle ground between the two.

From the textbooks:

TCP is…reliable, in-order, flow-control, congestion-control, connection-oriented, socket-addressed, automatic MTU/MSS discovery (i.e. dynamically selects the best size packets to avoid fragmentation)

UDP is…stateless (advantageous in some situations including e.g. can reduce memory usage), socket-addressed, smaller header size

IP is… (TCP and UDP typically both run on IP but don’t have to)…checksums, fragmentation, QOS (sometimes), TTL management.

Some consider that TCP offers too much for it to be ever worth not using it.

Note that TCP/IP is also:

  • poor at QOS
  • very poor at congestion control
  • very poor at encryption

Other protocols do these better, but some things (like e.g. congestion control) only work well when everyone (on the network) is using them. We could stop most DDOS attacks easily, if only the world were willing to dump IP and switch to ATM. Another example: IPv6 is good at encryption.

TCP is not perfect, and it doesn’t do the kitchen sink - but perhaps it does everything else :).

A very interesting idea. I’ve spoken to some within the games industry who are really pissed off with “the idiots who try to do everything with objects all the time” (they are themselves usually expert OO developers). We at least agree that too many educational establishments teach how to OO without teaching why, and there’s a huge number of people around who have the age-old problem “when all you have is a hammer, everything in the world starts looking like a nail”.

I can give an interesting example from another industry, same problem: someone I know was managing the development of radar systems. There was a team working on the software to draw the display. The old version was C, with little reusability - everything was so tightly integrated that it was hard/impossible to separate and encapsulate functionality. The new version that the new team proudly delivered had a class for each of:

  • the planes
  • the previous positions of each plane (so you could draw trails of arbitrary length)
  • the predicted positions of each plane
  • the graphic for the plane (methods to change colours, change size, rotate, etc)

Unfortunately, on the first beta test, they discovered that it took 30 seconds to draw EACH FRAME of the radar - i.e. a frame rate of 0.03 fps. This is because each real-world plane had approximately 30 objects associated with it, and there were a LOT of planes (and the redraw was heavily OO’d, with separate layers drawn separately, so that it took IIRC 5 or 6 passes to draw the whole frame).

Ultimately, you should never use a tool unless you know what it’s meant for; OO was invented to solve a small set of recurring problems - it’s great for solving them, but it has soooooo many disadvantages that it can be as lethal as it can be a lifesaver.

I’m not implying that you in particular shouldn’t use OO, but it’s a sword that cuts both ways. In particular (to come back on topic), when you’re writing network code, you CANNOT speed up the internet, and you probably cannot speed up the hardware (you only control the server hardware, not the client stuff). So, all that’s left for you to squeeze optimizations out of is the software. Performance is typically a heck of a lot more important than whether you can easily add new features later on (one of the primary advantages of OO). Reusability for network code often is no more complex than “is it a single API that I can remember how to use?”. It’s quite hard to write a network-API that isn’t reusable (you have to write pretty hard-core complex algorithms and protocols to make it that way).

Several other parts of game development are similar. I suggest using thingy’s equation (hopefully someone else can remember the name!) that’s the yardstick for VLSI (cpu) developers:

To compare two possible ways of doing a new function (or choosing one of two functions to implement in silicon when you only have room for one), you rank each using the formula:

score = %age speed improvement × %age of the time that the speed improvement can be used.
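To make that concrete (my numbers, not from the original argument - this is the usual Amdahl’s-law-style weighting of a gain by how often it applies):

```java
public class SpeedupScore {
    // Score = speed improvement (%) weighted by the fraction of time
    // the improvement actually applies.
    static double score(double percentImprovement, double fractionOfTimeUsed) {
        return percentImprovement * fractionOfTimeUsed;
    }

    public static void main(String[] args) {
        // Hypothetical choices: a 200% speedup usable 5% of the time vs.
        // a 20% speedup usable 90% of the time.
        System.out.println("rare big win:     " + score(200, 0.05)); // 10.0
        System.out.println("common small win: " + score(20, 0.90));  // 18.0
    }
}
```

The modest-but-common optimisation wins, which is exactly the point of ranking this way.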

So simple, yet for instance I wouldn’t bother with abstraction if I were writing a new 3D engine except at a very high level - e.g. I’d want to be able to swap in and out different renderers, and be able to reproduce “sketched quake” (where the renderer is replaced with one that draws everything like pencil sketches; it’s pretty cool) or “fisheye quake” where everything is rendered through a super-wide-angle (“fisheye”) lens.

Many (most?) low-level decisions in game development are 100% all or nothing: if I write a game whose engine uses BSP’s, there’s no point in being able to swap that out and use a non-BSP solution - the use of BSP’s contaminates so much other code in so many places, that you never need a fine grain of replaceability.

These examples are perhaps a little contrived; BSP is a particularly nasty (contaminating) technology that it would be nice to never use at all - but it is very effective and was miles ahead of anything else (for performance) invented for a long while.

While you’re right about OO being heavily misused, I don’t think that it’s anything inherent in OO designs. Take Java 1.1 vs. many of the later APIs for example. With the exception of the AWT, the 1.1 APIs generally consisted of 1 class per function with as little “tying” as possible. This open design allowed each class to perform independently and performance was handled internal to the class. Now take a Java 2 API such as JavaMail. JavaMail has an object to describe every little item in the system, from body text to email addresses to attachments.

In the case of JavaMail it ended up making sense due to the flexibility it gave. But what about Graphics2D? Did we really need objects to describe transforms instead of constants? Especially in performance critical code. The error that I think designers make is that they look for perfect OO instead of pragmatic OO. In theoretical “perfect” OO, every part of the system should be described by objects down to the slightest detail of using an object instead of a primitive. This works fine, in theory. In reality, you could continue churning out objects ad-infinitum because you’ll never be able to stop deciding what should be described by your system, and what should be taken as an axiom. Thus you end up with object after object being created just to run the simplest of programs.

At the same time, where to draw the line to get the best balance can be a difficult question to answer. Just be aware of what your code is doing and you should be able to deliver both maintainable OO code and high performance code in the same package.

[quote]While you’re right about OO being heavily misused, I don’t think that it’s anything inherent in OO designs.
[/quote]
Yes, there are situations where OO is abused without it being OO’s fault. There are also many many ways in which OO is just plain rubbish and ruinous for a project - you’re onto the right track by saying that “sometimes it can be taken too far”, but it goes much further than that.

OO was invented largely to provide: encapsulation, data-hiding, versioning-independence, explicit-interfaces, ADT’s, etc. Encapsulation is probably the most important of those, depending upon whom you talk to (“an object is defined as Data + the methods that act upon that data”).

E.g.: “Global” methods and data are fundamentally non-OO. There was a time when it was preached that “You should NEVER use global anything - there’s always a better way”. There were good reasons for this - e.g. some compiler-optimizations are impossible if the source contains any globals. But as PC’s get more powerful, compilers can take more time to compile - which enables new transforms which are more “brute force” than the original ones.

So nowadays at least one major reason not to use globals is evaporating; but I’ve worked on at least one game where every method had to have access to every set of data - the problem domain was really hard to solve if you used any encapsulation. Which is a bitch to solve if you’re working in a programming language which really only supports imperative code and OO - and doesn’t also do, e.g., aspects or similar out-of-the-box (IIRC, Aspect-oriented programming was invented largely to circumvent problems like this)

There are mathematical theories around, along the lines of “no programming paradigm will ever be the best paradigm in the majority of situations” - but no-one has yet proved them AFAIAA :). They are based on work to do with search-spaces, and the concept that the ordered set of searches of an algorithm is equivalent to a description of the algorithm. You can probably find more info anywhere that discusses the Church-Turing thesis (that all non-trivial programming languages are 100% equivalent).

(side note: until someone “disproves” C-T, everything can be written in the lambda calculus, which is a programming language with only two operations, and no other symbols - no numbers, no constants, no characters etc - but it can still represent any program you can think of).

…this also happens to underlie most compiler-development theory. But now I’m guilty of taking this thread completely OT…

Aha - so what is better now: TCP or UDP ?

;D ;D ;D ;D

Well, after re-reading the thread and having learnt my lesson, I’d have to recommend carrier pigeon as the most flexible transport. That, possibly in conjunction with the use of two coffee cups and a piece of string.

:smiley:

RFC 1149

i think ;D

I had a very interesting lecture on TCP and UDP today…
You can look at the OH and summary here: (very good documentation imo)

http://www.imit.kth.se/courses/2G1501/common/2g1316-17/lect/end1.html

Apologies for coming in late.

I assume someone has mentioned turning nagle off?

The problem usually with the “UDP is faster” contention is that it is a first-blush response which gets most of its “speed” from lack of reliability. However, some form of reliability almost always ends up being something the game programmer then has to layer on top of UDP.

At that point you have many of the same problems WITHOUT the saving grace of the infrastructure being pre-tuned for your protocol.

Similarly, UDP looks at first blush to be cheaper in bandwidth, BUT the only place where bandwidth is really an issue any more is the last mile of an analog connection. As that is invariably a PPP link, TCP is actually much LOWER bandwidth across the choke point, because PPP optimizes the headers (8 bytes for a TCP packet across a PPP link, 30 for a UDP packet, as I recall).

My suggestion for first blush is to turn nagle and keep-alive off and use TCP. (You may not even need to turn keep-alive off if you aren’t really filling the channel with data.)
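Those first-blush settings are a couple of lines in Java. A sketch (the loopback connection is only there so the snippet runs standalone; keep-alive is already off by default, so setting it is belt-and-braces):

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

public class TcpGameDefaults {
    // Apply the suggested first-blush settings to a connected TCP socket.
    static void applyGameDefaults(Socket socket) throws IOException {
        socket.setTcpNoDelay(true);  // Nagle off: small writes are not delayed
        socket.setKeepAlive(false);  // no keep-alive probes
    }

    // Connects over loopback (just so the sketch runs standalone),
    // applies the settings, and reports the resulting flags.
    static String appliedFlags() throws IOException {
        try (ServerSocket server = new ServerSocket(0);
             Socket socket = new Socket("127.0.0.1", server.getLocalPort())) {
            applyGameDefaults(socket);
            return "noDelay=" + socket.getTcpNoDelay()
                 + " keepAlive=" + socket.getKeepAlive();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(appliedFlags());
    }
}
```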

If/when you run into individual situations where occasional latency spikes are really hurting you, ask yourself if that part of the communication can be done out-of-band in an unreliable manner.

AND lastly, remember that if you are on an analog last mile, then your big spikes are going to be losses of communication of up to 6 seconds for modem retrains, and it doesn’t matter WHAT your protocol is then.