How do I diagram ping between servers on the Net?

blahblahblahh · February 25, 2004, 1:40pm

This is what I was talking about… you don’t appear to know the fundamental technology problems. Note that I say “appear” because at the moment I’m only guessing at what your knowledge and experience is of dist sys; I may have guessed wrongly.

A dist net is fundamentally incredibly unreliable.

To say that a dist net is “more reliable” suggests you have a high-level awareness of a single feature called failover. Failover is really unexciting and trivial, and can make a cluster appear to have the conjoined uptime of all its members. It doesn’t change the fact that reliability between members is poor, and it only works if you have lots of difficult and complex stuff going on in the background to enable it to work, or if you have a game where each server is a shard, and runs as a completely separate “game”; even then it’s often still difficult because real-time replication/distributed-mirroring is non-trivial (mainly because it’s difficult to make sure it goes fast enough).

For instance, the simple question of altering game-data. How do you ensure there are no conflicts when two servers want to write to the same data? If you make each server authoritative for its own data only, how do you make sure the other servers can read that data?

If you think your multicast system will solve this, you probably haven’t done the calculations for your bandwidth usage, latency inflation, and above all the per-server storage needs and unmarshalling processing needs (iterating over the locally stored game state implementing changes encoded in incoming packets).
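
A back-of-envelope version of that calculation might look like the sketch below. Every number in it (64 servers, 500 clients per server, 10 updates per second, 40 bytes per update) is a made-up assumption for illustration, not a measurement from any real game.

```python
def inbound_bytes_per_second(servers, clients_per_server,
                             updates_per_second, bytes_per_update):
    """Bytes/s one server must receive and unmarshal if every server
    multicasts the state of all of its clients to every other server."""
    remote_clients = (servers - 1) * clients_per_server
    return remote_clients * updates_per_second * bytes_per_update


# Assumed figures, purely illustrative.
load = inbound_bytes_per_second(servers=64, clients_per_server=500,
                                updates_per_second=10, bytes_per_update=40)
print(f"{load / 1e6:.1f} MB/s inbound per server")
```

With these assumed figures each server has to take in roughly 12.6 MB/s of remote state before it does any work for its own clients, and that figure grows with every server added.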

mthornton · February 25, 2004, 2:19pm

Perhaps make these synchronization problems a feature of the game — just like the space time distortions Enterprise is encountering every week. Hey maybe that is how they came up with those plots — someone dropped the original scripts on the floor and picked them up out of order.

blahblahblahh · February 25, 2004, 2:51pm

[quote]Perhaps make these synchronization problems a feature of the game
[/quote]
IMHO this is the best solution of all for such problems - cheapest, least risky, most effective. But you can’t do it effectively until you have a really good and comprehensive understanding of all the problems :(.

jc1 · February 25, 2004, 4:34pm

What if each server is responsible for its own game data? For example, if a client creates an account on Server 1, then the client is allowed to create a certain number of objects with his account on Server 1. Server 1, which is authoritatively responsible for the client’s objects, periodically sends multicasts of the client’s objects to every other server on the distributed network.

If Server 2’s client affects Server 1’s client’s objects, then Server 1 authenticates any actions to the objects. If Server 1’s client cancels an account, then Server 1 destroys the client’s objects. Synchronization should not be a problem. However, if latency causes two objects to overlap, then Mark Thornton’s solution is necessary.
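
As a minimal sketch of that ownership rule: the class name, the `state` dictionary, and the method names below are all hypothetical, and the sketch ignores the network entirely. It only shows that the owning server is the one place where an object can be changed or destroyed.

```python
class AuthoritativeServer:
    """Hypothetical server that owns the objects created by its own accounts."""

    def __init__(self, name):
        self.name = name
        self.objects = {}  # object_id -> {"owner": account, "state": dict}

    def create_object(self, account, object_id):
        self.objects[object_id] = {"owner": account, "state": {}}

    def handle_action(self, object_id, key, value):
        """Apply a change requested by any server; refuse objects we do not own."""
        obj = self.objects.get(object_id)
        if obj is None:
            return False  # not authoritative here, the request is rejected
        obj["state"][key] = value
        return True

    def cancel_account(self, account):
        """Destroy every object that belongs to a cancelled account."""
        self.objects = {oid: obj for oid, obj in self.objects.items()
                        if obj["owner"] != account}


server1 = AuthoritativeServer("Server 1")
server1.create_object("alice", "door-1")
# A client on another server asks Server 1 to change Alice's object.
print(server1.handle_action("door-1", "open", True))
```

The sketch does not say how a server finds out which remote objects an action touches in the first place.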

An inexpensive solution to latency does not exist. If it did, then “rack-mounted Linux-based IBM eServer xSeries systems hosted by IBM and running on internal fiber-optic networks” would not be Butterfly.net’s solution to latency. Hardware and bandwidth are expensive. I simply want to develop the world’s cheapest, but most scalable, MMOG.

mthornton · February 25, 2004, 5:07pm

Each server is responsible for a planet (town), and the population of each planet (# of clients) is restricted to the server’s capacity. Information flow between planets (towns) is limited by the speed of light (horse and cart). So latency need only be less than the model time it takes for information in the model universe to travel between places. So latency between servers ceases to be a problem. Doesn’t help with latency between a client and a server.
Thus if a user in the UK wants to play on Mars and the Mars server is in Taiwan, it won’t be much fun.
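
The condition can be written down as a one-line check. This is only a sketch; the function name, the units, and the example numbers are assumptions.

```python
def latency_is_hidden(network_latency_s, distance_units, signal_speed_units_per_s):
    """True if the real network delay fits inside the in-game time that
    information needs to travel between the two places."""
    model_delay_s = distance_units / signal_speed_units_per_s
    return network_latency_s <= model_delay_s


# Two towns 100 units apart, news travelling at 10 units/s: 10 s of model time.
print(latency_is_hidden(0.3, 100, 10))
# Two places 1 unit apart: only 0.1 s of model time, so 0.3 s of lag shows.
print(latency_is_hidden(0.3, 1, 10))
```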

jc1 · February 25, 2004, 5:57pm

Unfortunately, your solution means that each server can host only a portion of the world–which completely contradicts the notion of scalability via donors who donate servers to the distributed network. Yes, each server should host only a limited number of clients, but a truly scalable MMOG should allow as many people as possible to assemble anywhere on the map. Thus, all servers must host the same map of the whole world so that each server’s clients can be anywhere on the map.

The solution that I outlined in my previous post is incredibly crude, but I cannot think of a cheaper way to implement a seamless, yet scalable, world without a hideous amount of expensive hardware.

A MMOG should be a window into another reality, and to artificially limit the number of clients who can assemble at a single location on the map is totally unrealistic. Expensive hardware can solve this problem with brute force, but if a MMOG is too expensive for a university to host, then nobody will play it. If a university can host a MUD, then a MMOG that scales linearly with the university’s budget is the next logical step.

blahblahblahh · February 25, 2004, 5:59pm

How do you know they affect each other? This is usually a hard problem you won’t notice until a little further on in your planning. Here you’ve glossed over it.

Assuming you can fix that, you go on to mention your multicast again. How do you know you have the available bandwidth and latency to afford the amount of multicast traffic you’ll generate without it overloading (and breaking the rest of the system!) or bottlenecking?

I wish there were a single resource I could refer you to (e.g. a good book) that explained all this in detail so you could just read it up and end up with the wisdom of decades. This is theoretically possible - the knowledge exists and people have learnt how to describe it effectively - but I haven’t yet seen a book that covers the issues for games :(.

Butterfly does not provide a solution to latency. Bearing in mind my huge prejudice against Butterfly, my opinion is that they are using the wrong tool to solve the problems of MMOGs, and any games developer that is unfortunate enough to buy their solution will soon find this out. Grid computing is a solution that has desperately been seeking a problem to solve for many years (hence IBM’s backing of BF - they need more problems for all their existing research there to solve). IMHO it’s inappropriate for the majority of games, and BF has a business model that is good for them (it makes them much money and power) but which screws the games developers (wrong tool coupled with some nasty long-term side-effects).

But as MT points out, latency is a separate issue entirely in each of several different parts of the system. Which one were you referring to? Do you know all the different places, and understand the importance of each to the success of the system?

mthornton · February 25, 2004, 7:09pm

I wasn’t being entirely serious. However out in the real world most locations do have capacity limits — there are only so many students who can fit into a Volkswagen Beetle! The International Space Station is currently limited to two (or five during mission change over).

mthornton · February 25, 2004, 7:36pm

Suppose your virtual world tries to simulate a visit to Highbury Stadium (capacity ~39000) to watch a football match. Each server could manage perhaps a block of seats. There will be no problem managing the irritating lard mountain in front of you who keeps blocking your view, or the actions of others in the immediate vicinity. However every so often someone tries to start a Mexican wave or sing some song. Due to latency effects this will propagate rather erratically around the stadium. One block of seats may not even be singing the same song (unless the announcer is attempting to ‘conduct’ proceedings, but that doesn’t always work).
In a system such as this the information from other servers could be late, inaccurate or only a rough approximation of activity elsewhere, but it wouldn’t actually matter. You could subdivide the work down to the level of a server per seat block, but beyond this users may start to notice the seams.

jc1 · February 25, 2004, 8:18pm

I love this place. Any attempt to gloss over a mediocre explanation is glaringly obvious, and the experts will detect it instantly. The experts are, like, debuggers.

If a server periodically sends multicasts of the location/state of its clients to every other server, then the server also receives the location/state of every other server’s clients. A server ignores packets that have no effect on the server’s clients.

For example, assume that all of a server’s clients are located in a castle on the map. The server receives multicast packets of other servers’ clients. However, the server analyzes each packet’s header and determines that only a few packets describe events that occurred near the server’s clients. The server accepts those packets that describe events that occurred near the castle on the map, but rejects everything else. (The existence of the castle is irrelevant. I merely mention it as a point of reference.)
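
A rough sketch of that filter, assuming the header carries a 2D map position and that "near" means within some fixed radius of any local client. The `Header` type, the radius, and the coordinates are illustrative assumptions.

```python
import math
from dataclasses import dataclass


@dataclass
class Header:
    x: float  # where on the map the event happened (assumed 2D)
    y: float


def is_relevant(header, local_clients, radius):
    """Accept a packet only if its event lies within `radius` of a local client."""
    return any(math.hypot(header.x - cx, header.y - cy) <= radius
               for cx, cy in local_clients)


castle_clients = [(12.0, 11.0), (14.0, 9.0)]
print(is_relevant(Header(10.0, 10.0), castle_clients, radius=5.0))    # near the castle
print(is_relevant(Header(900.0, 40.0), castle_clients, radius=5.0))   # far away
```

Note that the server still has to receive every packet and read its header before it can reject it.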

I mention multicasting because the multicasting is supposed to reduce the amount of required bandwidth. If 64 servers exist on a network, then a server sends out a multicast packet to 63 other servers only once. Otherwise, a server sends out a unicast packet to 63 other servers, 63 times.
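
The arithmetic behind that claim, as a small sketch (the function name is hypothetical):

```python
def packets_sent_per_update(servers, multicast):
    """Packets one server transmits to tell every other server about one update."""
    if servers < 2:
        return 0
    return 1 if multicast else servers - 1


print(packets_sent_per_update(64, multicast=True))   # one multicast packet
print(packets_sent_per_update(64, multicast=False))  # one unicast packet per peer
```

This counts only what the sender transmits. It says nothing about how many packets each server has to receive and process.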

I cannot control bottlenecks on other people’s routers, so this is the best I can do. I have absolutely no idea on how to reduce the amount of required bandwidth further.

BTW, I can settle for a late, inaccurate, or rough approximation of activity from other servers. The objective is to cram as many people into a map as possible, as cheaply as possible, without zones or shards.

Jeff · February 25, 2004, 10:05pm

You do realize that if every server is broadcasting all its traffic to every other server, it’s the equivalent of every server getting blasted with all the traffic?

Whether or not this buys you anything really then comes down to whether the processing that creates/responds to that traffic is very heavy or not compared to the cost of handling the packets.

If it is, then such a scheme may help you increase the number of players you can handle. If handling the traffic is the bottleneck, though, then you will not be much better off than one server handling everything.

(In fact, since it takes more processing to do net packet handling than to communicate between processes in a common memory space, you may even be worse off.)

Also, have you considered what happens to your total simulation if some of the nodes fail?

Welcome to the complexities of multi-processing for massive scalability.

Jeff · February 25, 2004, 10:06pm

Btw…

Multicast reduces the bandwidth requirements on the wire, but the wire usually isn’t your bottleneck. It doesn’t reduce the load on your machines handling the traffic they receive; in fact it increases it over point-to-point communication.
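
To put rough numbers on the receiving side, here is a sketch. The 64 servers, 10 updates per second, and "only 5 peers actually need to reach me" are assumptions chosen for illustration.

```python
def inbound_packets_per_second(peers_sending_to_me, updates_per_second):
    """Packets per second one server has to receive and process."""
    return peers_sending_to_me * updates_per_second


servers = 64
rate = 10  # assumed updates per second per server

# Multicast to everyone: all 63 other servers reach this machine.
multicast_load = inbound_packets_per_second(servers - 1, rate)
# Point to point: assume only 5 neighbouring servers have anything relevant.
targeted_load = inbound_packets_per_second(5, rate)

print(multicast_load, targeted_load)
```

Under these assumptions the multicast receiver handles 630 packets per second against 50 for targeted point-to-point traffic, even though each sender puts fewer packets on the wire.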

blahblahblahh · February 25, 2004, 10:25pm

PS what net protocol are you intending to use? IP doesn’t support multicast until v6 (which is still being used by almost nobody outside of universities and research depts :().

jc1 · February 26, 2004, 4:41am

If some of the nodes fail, then the total simulation will continue. However, clients on the failed servers will appear to the distributed network’s servers’ clients to have quit and logged off the distributed network. An attack on a server will not affect a distributed network.
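
One common way to get that behaviour is a timeout on the last packet heard from each server. The sketch below assumes this approach; the function names, the 5-second timeout, and the data layout are all hypothetical.

```python
def live_servers(last_seen, now, timeout):
    """Servers whose most recent packet arrived within `timeout` seconds."""
    return {name for name, seen in last_seen.items() if now - seen <= timeout}


def visible_clients(clients_by_server, live):
    """Clients of servers that have gone silent are treated as logged off."""
    return sorted(client
                  for server, clients in clients_by_server.items()
                  if server in live
                  for client in clients)


last_seen = {"server1": 100.0, "server2": 91.0}
clients = {"server1": ["alice"], "server2": ["bob", "carol"]}

alive = live_servers(last_seen, now=101.0, timeout=5.0)
print(visible_clients(clients, alive))
```

A timeout cannot tell a crashed server from a slow or partitioned one, so the remaining servers may disagree for a while about who is still online.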

I originally intended to implement the Java Reliable Multicast Service. However, I now suspect that only two solutions to a seamless, scalable world exist: more hardware and bandwidth, or fewer servers and clients.

I am out of ideas. I thought that the principles of a distributed IRC network could apply to a MMOG, but I was wrong. Apparently, clever coding is not a substitute for more hardware and bandwidth.

What if I develop a distributed IRC network and, for the client, a 3D graphical front-end? If a front-end allows a client to move around, gesture, and walk from room to room, is the system robust enough to support a MMOG? After all, Efnet has over 100 servers and more than 20,000 users!

mthornton · February 26, 2004, 5:46am

[quote]PS what net protocol are you intending to use? IP doesn’t support multicast until v6 (which is still being used by almost nobody outside of universities and research depts :().
[/quote]
Multicast seems to work with class D (IP4) addresses. Getting a publicly usable address and getting ISPs to propagate the broadcasts may be another matter.
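
For reference, class D is the IPv4 range 224.0.0.0/4. A quick way to check whether an address falls inside it, using Python's standard `ipaddress` module (the helper name `is_class_d` is made up for this example):

```python
import ipaddress


def is_class_d(address):
    """True if `address` is an IPv4 multicast (class D, 224.0.0.0/4) address."""
    ip = ipaddress.ip_address(address)
    return ip.version == 4 and ip.is_multicast


print(is_class_d("239.1.2.3"))    # inside 224.0.0.0/4
print(is_class_d("192.168.0.1"))  # ordinary unicast address
```

This only checks the address range. Whether routers between two hosts actually forward the traffic is a separate question.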

blahblahblahh · February 26, 2004, 8:47am

Tis what I meant … in practice, IPv4 has no multicast. In practice, IPv6 should have it (although I’ve not checked the current status of in-the-wild IPv6 nets for quite some time, so YMMV)

blahblahblahh · February 26, 2004, 8:53am

That’s a silly conclusion (and incorrect). All we’ve said is that naive algorithms are not going to solve the problem, and that you have to put in a great deal more work and “clever coding”. Sorry, that’s just the way it is. Realistically, if you can come along with a trivial architecture and solve fundamentally hard problems that for 30+ years no-one else has solved without months and months of design work then maybe you’re a genius :).

Did you read the article I quoted? Every prospective customer that comes to us we ask to see a Game-Concept before we talk to them - it’s rarely possible to give meaningful answers to any specific questions unless there is a game design already on the table.

Your question above is a classic example. If your game design is “IRC with avatars, no more than 5 avatars to a room, 10 second delay to move from room to room” then yes of course the answer is yes.

If your game design is for a standard MMORPG, then of course the answer is “snowball’s chance in hell”.

cknoll · February 26, 2004, 1:58pm

I wrote an application that sent information using multicast, and this was in 1994, so IPv4 indeed does have multicast even in practice. I think that if you have control over the servers and the network that joins them (and most of the big hitters in the MMO arena do have this) multicast is definitely an option, considering you will be able to configure the routers that interconnect the servers to forward multicast packets. In my experience with multicast, it is EXTREMELY scalable in the sense that if you need to notify 100 servers of 1k of state change, you send 1k of data, and not 100k (1k per server). And it remains 1k even as you move up to 1000 servers.

It has always been my interest to implement a ‘distributed simulator’ that scaled well to multiple clients, but I’ll keep these ideas to myself because I don’t think I could handle blahbhalbhalbhaa’s cynicism on it. heh.

Jeff:

I don’t agree with this assessment, but I didn’t work at TEN.

-Chris

blahblahblahh · February 26, 2004, 2:59pm

FYI I’m not cynical here, it’s just that I see quite a lot of people (especially in the mainstream games industry) embarking on MMOG projects without the vaguest idea of how incredibly huge a topic distributed-systems dev is. Many people assume that dist-sys server programming is not much different from network programming, which is like assuming that doing a modern OpenGL 3D engine with all the eye-candy is not much different from rendering 2D in standard VGA (which you know every graphics card handles equally well, for a start!).

It depends very much on application (which seems to be something I’m repeating myself on :)).

In any complex distributed system, you have about 5 (can be many more) major simultaneous bottlenecks and every time you improve one by a factor of e.g. 2, your performance improves by a factor of 1.1 because you’ve just become limited by a different one of the five. You can play whack-a-mole like this for a loooooong time without showing much improvement.
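
A toy model of that effect, treating the system as a pipeline whose throughput is set by its slowest stage. The five stage names and their capacities are invented for the example.

```python
def throughput(capacities):
    """A pipeline can go no faster than its slowest stage."""
    return min(capacities.values())


def limiting_stage(capacities):
    """Name of the stage that currently caps throughput."""
    return min(capacities, key=capacities.get)


# Assumed capacities in requests per second for five stages.
stages = {"bandwidth": 100, "packet_cpu": 110, "game_logic": 120,
          "database": 130, "locking": 140}

before = throughput(stages)
stages["bandwidth"] *= 2  # e.g. switching to multicast doubles bandwidth headroom
after = throughput(stages)

print(before, after, limiting_stage(stages))
```

Doubling the bandwidth stage moves throughput from 100 to 110, a factor of 1.1, because `packet_cpu` immediately becomes the new limit.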

So, bw could be your current bottleneck, and then you multicast and it disappears, but there’s no improvement because it was masking not just one but four other major ones. One part of the problem is that people often forget how much the system is holistic, i.e. normally you’d think that handling packets was a tiny amount of CPU power, but in a dist system it could be enough to push you over the edge into a chain reaction where throughput is slightly less than incoming requests - after 1 second, everything’s a little slow, after 10 it’s getting very slow, after 1 minute the server is fatally overloaded.
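
The chain reaction can be sketched with a simple backlog model. The arrival and service rates below are assumptions, and the model ignores retries and timeouts, which usually make things worse.

```python
def backlog_after(seconds, arrivals_per_s, served_per_s):
    """Requests still waiting after `seconds` when arrivals outpace service."""
    return max(0, (arrivals_per_s - served_per_s) * seconds)


def queueing_delay(backlog, served_per_s):
    """Seconds a newly arrived request waits behind the existing backlog."""
    return backlog / served_per_s


# Assumed: 1000 requests/s arrive, the server manages 950/s.
for t in (1, 10, 60):
    waiting = backlog_after(t, arrivals_per_s=1000, served_per_s=950)
    print(t, waiting, round(queueing_delay(waiting, 950), 2))
```

A 5% shortfall leaves 50 requests queued after one second, 500 after ten, and 3000 after a minute, so the delay seen by each new request keeps growing until something gives.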

jc1 · March 1, 2004, 11:48am

NAME OF GAME
CubeWorld is the name of our MMOG.

DESIGN HISTORY
CubeWorld is the successor to strikeNET, our attempt of yesteryear to design a MMOG. strikeNET underwent a plethora of revisions, from a MMOFPS in the vein of PlanetSide, to a MMORPG that combined elements of Grand Theft Auto: Vice City, Splinter Cell: Pandora Tomorrow, and Hitman: Contracts into a simulation of the world of espionage.

strikeNET was an attempt to appeal to the mainstream, but we realized that no game could appeal to everyone. All games restrict freedom through goals and boundaries. strikeNET was to be a MMOG to end all MMOGs. strikeNET promised a world in which users could do anything, without bounds.

Although we wanted that world to be one of espionage in the current Cold War, we knew that strikeNET lacked the ambition to deliver a world with no rules. We wanted a sandbox in which thousands could collaborate to create their own worlds in a single, massive universe. CubeWorld was a result of our odium for conventional MMOGs.

GAME OVERVIEW
A multi-user domain, or MUD, is a server that hosts a web of rooms in which clients interact with clients, items, and rooms. Clients can create items, rooms, and links to rooms. Interaction between rooms is impossible because a MUD is zoned, not seamless. Furthermore, items in a MUD serve no function.

CubeWorld is a sandbox in which thousands can collaborate to create their own worlds in a single, massive universe. CubeWorld allows you to build whatever, whenever, wherever. You can script anything, from a battlefield in which soldiers fight for power, for glory, forever, to a city in which anyone can be an assassin.

FEATURE SET
CubeWorld stresses simplicity, to the point that CubeWorld resembles Quake III: Arena. In Quake III: Arena, a master server maintains a list of servers and ping. Each client queries the master server to join a server. Each server maintains its clients, items, and buildings. If a server disappears, then its clients, items, and buildings disappear.

In CubeWorld, a master server maintains a list of servers and ping. A client can query the master server for a list of servers and ping. However, she maintains an account on her favorite server only. Her account maintains her avatar, items, and buildings, which persist after she logs off.
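
A minimal in-memory sketch of such a master server list. The class, its method names, and the example addresses are hypothetical, and a real master server would also need a network interface and a way to expire servers that stop reporting.

```python
class MasterServer:
    """Hypothetical registry of game servers and their last reported ping."""

    def __init__(self):
        self.pings = {}  # server address -> ping in milliseconds

    def report(self, address, ping_ms):
        """A game server (or a prober) records the latest ping for `address`."""
        self.pings[address] = ping_ms

    def remove(self, address):
        """Forget a server that has left the network."""
        self.pings.pop(address, None)

    def server_list(self):
        """What a client gets back: (address, ping) pairs, lowest ping first."""
        return sorted(self.pings.items(), key=lambda item: item[1])


master = MasterServer()
master.report("server-a.example:27960", 120)
master.report("server-b.example:27960", 45)
print(master.server_list())
```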

CubeWorld comprises a distributed network. Donors can donate servers to the distributed network. Each server adds avatars, items, and buildings to the universe that exists on the distributed network. If a server disappears, then its avatars, items, and buildings disappear from the universe that exists on the distributed network.

Each server sends multicast packets of avatars, items, and buildings to every other server. Each packet includes a header that declares where in the universe the packet occurred. If a server analyzes a packet header and discovers that the packet does not affect its avatars, items, or buildings, then the server ignores the rest of the packet.
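
One way such a header could be laid out, as a sketch using Python's `struct` module. The layout (two signed 32-bit grid coordinates plus a 16-bit payload length, in network byte order) is an assumption made for this example, not part of the design above.

```python
import struct

# Assumed header: grid x, grid y (signed 32-bit), payload length (unsigned 16-bit).
HEADER = struct.Struct("!iiH")


def encode(x, y, payload):
    """Prefix `payload` (bytes) with a header saying where the event happened."""
    return HEADER.pack(x, y, len(payload)) + payload


def peek_location(packet):
    """Read only the header; the payload is not touched or unmarshalled."""
    x, y, _length = HEADER.unpack_from(packet)
    return x, y


packet = encode(3, -7, b"avatar moved")
print(HEADER.size, peek_location(packet))
```

With a fixed-size header a server can decide whether to drop a packet after reading 10 bytes, although it has still paid the cost of receiving the whole thing.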

In CubeWorld, your avatars, items, and buildings are easy to edit. If a model is animated, then it consists of geometric primitives around the bones of a skeletal animation. If a model is unanimated, then it consists of geometric primitives. Each primitive is a solid color instead of a texture.
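
A possible data layout for such a model, as a sketch only. The field names, the shape vocabulary, and the convention of `-1` for "no bone" are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Primitive:
    shape: str       # e.g. "box", "sphere", "cylinder" (assumed vocabulary)
    size: tuple      # (x, y, z) extents
    color: tuple     # solid (r, g, b) color, no texture
    bone: int = -1   # index of the bone it hangs on, -1 if unanimated


@dataclass
class Model:
    bones: list = field(default_factory=list)        # parent index per bone
    primitives: list = field(default_factory=list)

    def is_animated(self):
        """A model with a skeleton is animated; one without is static."""
        return bool(self.bones)


crate = Model(primitives=[Primitive("box", (1, 1, 1), (139, 90, 43))])
avatar = Model(bones=[-1, 0],
               primitives=[Primitive("box", (1, 2, 1), (200, 0, 0), bone=0),
                           Primitive("sphere", (1, 1, 1), (255, 220, 180), bone=1)])
print(crate.is_animated(), avatar.is_animated())
```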

CubeWorld is friendly to modems. A user can download a client in no time, because the distributed network streams all models and sounds to the client. A model in the form of geometric primitives around the bones of a skeletal animation requires little bandwidth. A sound in the form of Ogg Vorbis also requires little bandwidth.

THE GAME WORLD
In CubeWorld, the universe is a massive, seamless grid. Clients can cordon off plots of land on the grid to create buildings with geometric primitives. A group of clients can share access to a plot of land, which allows collaboration in real-time. Clients can also use a scripting language, similar to UnrealScript, to bring their plots of land to life.